  1. Nov 2025
    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      Comments on revisions:

      This revision addressed all my previous comments.

      Reviewer #3 (Public Review):

      Comments on revisions:

      The authors addressed my comments and it is ready for publication.

      We are grateful for the reviewers’ effort and are encouraged by their generally positive assessment of our manuscript.

      Reviewer #1 (Recommendations For The Authors):

      This revision addressed all my previous comments. The only new issue concerns the authors’ response to the following comment of reviewer 3:

      (2) Authors note “monovalent positive salt ions such as Na+ can be attracted, somewhat counterintuitively, into biomolecular condensates scaffolded by positively-charged polyelectrolytic IDRs in the presence of divalent counterions”. This may be due to the fact that the divalent negative counterions present in the dense phase (as seen in the ternary phase diagrams) also recruit a small amount of Na+.

      Author reply: The reviewer’s comment is valid, as a physical explanation for this prediction is called for. Accordingly, the following sentence is added to p. 10, lines 27-29: ...

      Here are my comments on this issue. Most IDPs with a net positive charge still have negatively charged residues, which in theory can bind cations. In fact, Caprin1 has 3 negatively charged residues (same as A1-LCD). All-atom simulations of MacAinsh et al (ref 72) have shown that these negatively charged residues bind Na+; I assume this effect can be captured by the coarse-grained models in the present study. Moreover, all-atom simulations showed that Na+ has a strong tendency to be coordinated by backbone carbonyls, which of course are present on all residues. Suggestions:

      (a) The authors may want to analyze the binding partners of Na+. Are they predominantly the 3 negatively charged residues, or divalent counterions, or both?

      (b) The authors may want to discuss the potential underestimation of Na+ inside Caprin1 condensates due to the lack of explicit backbone carbonyls that can coordinate Na+ in their models. A similar problem applies to backbone amides that can coordinate anions, but to a lesser extent (see Fig. 3A of ref 72).

      The reviewer’s comments are well taken. Regarding the statement in the revised manuscript “This phenomenon arises because the positively charged monovalent salt ions are attracted to the negatively charged divalent counterions in the protein-condensed phase.”, it should first be noted that the statement was inferred from the model observation that Na+ is depleted in condensed Caprin1 (Fig. 2a) when the counterion is monovalent (an observation that was stated almost immediately preceding the quoted statement). To make this logical connection clearer as well as to address the reviewer’s point about the presence of negatively charged residues in Caprin1, we have modified this statement in the Version of Record (VOR) as follows:

      “This phenomenon most likely arises from the attraction of the positively charged monovalent salt ions to the negatively charged divalent counterions in the protein-condensed phase because although the three negatively charged D residues in Caprin1 can attract Na+, it is notable that Na+ is depleted in condensed Caprin1 when the counterion is monovalent (Fig. 2a).”

      The reviewer’s suggestion (a) of collecting statistics of Na+ interactions in the Caprin1 condensate is valuable; however, since such an analysis is beyond the scope of the present work, it should be attempted in future studies. Thus far, our coarse-grained molecular dynamics simulations have considered only monovalent Cl− counterions. We do not have simulation data for divalent counterions.

      Following the reviewer’s suggestion (b), we have now added the following sentence in Discussion under the subheading “Effects of salt on biomolecular LLPS”:

      “In this regard, it should be noted that positively and negatively charged salt ions can also coordinate with backbone carbonyls and amides, respectively, in addition to coordinating with charged amino acid sidechains (MacAinsh et al., eLife 2024). The impact of such effects, which are not considered in the present coarse-grained models, should be ascertained by further investigations using atomic simulations (MacAinsh et al., eLife 2024; Rauscher & Pomès, eLife 2017; Zheng et al., J Phys Chem B 2020).”

      Here we have added a reference to Rauscher & Pomès, eLife 2017 to more accurately reflect progress made in atomic simulations of biomolecular condensates.

      More generally, regarding the reviewer’s comments on the merits of coarse-grained versus atomic approaches, we re-emphasize, as stated in our paper, that these approaches are complementary. Atomic approaches undoubtedly afford structurally and energetically high-resolution information. However, as it stands, simulations of the assembly-disassembly process of biomolecular condensates are nonideal because of difficulties in achieving equilibration even for a small model system with < 10 protein chains (MacAinsh et al., eLife 2024), although well-equilibrated simulations are possible for a reasonably-sized system with ∼ 30 chains when the main focus is on the condensed phase (Rauscher & Pomès, eLife 2017). In this context, coarse-grained models are valuable for assessing the energetic role of salt ions in the thermodynamic stability of biomolecular condensates of physically reasonable sizes under equilibrium conditions.

      In addition to the above minor additions, we have also added citations in the VOR to two highly relevant recent papers: Posey et al., J Am Chem Soc 2024 for salt-dependent biomolecular condensation (mentioned in Discussion under subheadings “Tielines in protein-salt phase diagrams” and “Counterion valency” together with added references to Hribar et al., J Am Chem Soc 2002 and Nostro & Ninham, Chem Rev 2012 for the Hofmeister phenomena discussed by Posey et al.) and Zhu et al., J Mol Cell Biol 2024 for ATP-modulated reentrant behavior (mentioned in Introduction). We have also added back a reference to our previous work Lin et al., J Mol Liq 2017 to provide more background information for our formulation.

      Reviewer #2 (Recommendations For The Authors):

      The authors have done a great job addressing previous comments.

      We thank this reviewer for his/her effort and are encouraged by the positive assessment of our revised manuscript.

      ---

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors used multiple approaches to study salt effects in liquid-liquid phase separation (LLPS). Results on both wild-type Caprin1 and mutants and on different types of salts contribute to a comprehensive understanding.

      Strengths:

      The main strength of this work is the thoroughness of investigation. This aspect is highlighted by the multiple approaches used in the study, and reinforced by the multiple protein variants and different salts studied.

      We are encouraged by this positive overall assessment.

      Weaknesses: (1) The multiple computational approaches are a strength, but they’re cruder than explicit-solvent all-atom molecular dynamics (MD) simulations and may miss subtle effects of salts. In particular, all-atom MD simulations demonstrate that high salt strengthens pi-types of interactions (ref. 42 and MacAinsh et al, https://www.biorxiv.org/content/10.1101/2024.05.26.596000v3).

      The relative strengths and limitations of coarse-grained vs all-atom simulation are now more prominently discussed beginning at the bottom of p. 5 through the first 8 lines of p. 6 of the revised manuscript (page numbers throughout this letter refer to those in the submitted pdf file of the revised manuscript), with MacAinsh et al. included in this added discussion (cited as ref. 72 in the revised manuscript). The fact that coarse-grained simulation may not provide insights into more subtle structural and energetic effects afforded by all-atom simulations with regard to π-related interaction is now further emphasized on p. 11 (lines 23–30), with reference to MacAinsh et al. as well as original ref. 42 (Krainer et al., now ref. 50 in the revised manuscript).

      (2) The paper can be improved by distilling the various results into a simple set of conclusions. For example, based on salt effects revealed by all-atom MD simulations, MacAinsh et al. presented a sequence-based predictor for classes of salt dependence. Wild-type Caprin1 fits right into the “high net charge” class, with a high net charge and a high aromatic content, showing no LLPS at 0 NaCl and an increasing tendency of LLPS with increasing NaCl. In contrast, pY-Caprin1 belongs to the “screening” class, with a high level of charged residues and showing a decreasing tendency of LLPS.

      This is a helpful suggestion. We have now added a subsection with heading “Overview of key observations from complementary approaches” at the beginning of the “Results” section on p. 6 (lines 18–37) and the first line of p. 7. In the same vein, a few concise sentences to summarize our key results are added to the first paragraph of “Discussion” (p. 18, lines 23–26). In particular, the relationship of Caprin1 and pY-Caprin1 with the recent classification by MacAinsh et al. (ref. 72) in terms of “high net charge” and “screening” classes is now also stated, as suggested by this reviewer, on p. 18 under “Discussion” (lines 26–30).

      (3) Mechanistic interpretations can be further simplified or clarified. (i) Reentrant salt effects (e.g., Fig. 4a) are reported but no simple explanation seems to have been provided. Fig. 4a,b look very similar to what has been reported as strong-attraction promotor and weak-attraction suppressor, respectively (ref. 50; see also PMC5928213 Fig. 2d,b). According to the latter two studies, the “reentrant” behavior of a strong-attraction promotor, Cl- in the present case, is due to Cl−-mediated attraction at low to medium [NaCl] and repulsion between Cl- ions at high salt. Do the authors agree with this explanation? If not, could they provide another simple physical explanation? (ii) The authors attributed the promotional effect of Cl- to counterion-bridged interchain contacts, based on a single instance. There is another simple explanation, i.e., neutralization of the net charge on Caprin1. The authors should analyze their simulation results to distinguish net charge neutralization and interchain bridging; see MacAinsh et al.

      The relationship of Cl− in bridging and neutralizing configurations, respectively, with the classification of “strong-attraction promoter” and “weak-attraction suppressor” by Zhou and coworkers is now stated on p. 13 (lines 29–31), with reference to original ref. 50 by Ghosh, Mazarakos & Zhou (now ref. 59 in the revised manuscript) as well as the earlier patchy particle model study PMC5928213 by Nguemaha & Zhou, now cited as ref. 58 in the revised manuscript. After receiving this referee report, we have conducted an extensive survey of our coarse-grained MD data to provide a quantitative description of the prevalence of counterion (Cl−) bridging interactions linking positively charged arginines (Arg+s) on different Caprin1 chains in the condensed phase (using the [Na+] = 0 case as an example). The newly compiled data is reported under a new subsection heading “Explicit-ion MD offers insights into counterion-mediated interchain bridging interactions among condensed Caprin1 molecules” on p. 12 (last five lines)–p. 14 (first 10 lines) [∼1.5 additional pages] as well as a new Fig. 6 to depict the statistics of various Arg+–Cl−–Arg+ configurations, with the conclusion that a vast majority (at least 87%) of Cl− counterions in the Caprin1-condensed phase engage in favorable condensation-driving interchain bridging interactions.
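The kind of bridging census described above can be illustrated with a minimal sketch. This is not the authors' analysis code; the coordinates, chain labels, and contact cutoff below are hypothetical stand-ins for coarse-grained MD snapshot data. A Cl− bead is counted as interchain-bridging when it lies within the cutoff of Arg+ beads belonging to at least two distinct chains.

```python
import numpy as np

# Illustrative sketch only: classify each Cl- ion as "bridging" if it is
# within a contact cutoff of Arg+ beads on at least two different chains.
CUTOFF = 0.65  # nm; assumed Cl-/Arg+ contact distance, not from the paper

def classify_bridging(cl_xyz, arg_xyz, arg_chain_ids, cutoff=CUTOFF):
    """Return a boolean array: True where a Cl- contacts Arg+ on >= 2 chains."""
    bridging = np.zeros(len(cl_xyz), dtype=bool)
    for i, ion in enumerate(cl_xyz):
        d = np.linalg.norm(arg_xyz - ion, axis=1)      # distances to all Arg+
        chains_in_contact = set(arg_chain_ids[d < cutoff])
        bridging[i] = len(chains_in_contact) >= 2      # interchain bridge
    return bridging

# Toy snapshot: two chains with one Arg+ bead each, plus two Cl- ions.
arg_xyz = np.array([[0.0, 0.0, 0.0],    # Arg+ on chain 0
                    [1.0, 0.0, 0.0]])   # Arg+ on chain 1
arg_chain_ids = np.array([0, 1])
cl_xyz = np.array([[0.5, 0.0, 0.0],     # midway: contacts both chains
                   [3.0, 0.0, 0.0]])    # far away: contacts neither

flags = classify_bridging(cl_xyz, arg_xyz, arg_chain_ids)
print(flags)  # -> [ True False]
```

In a real trajectory the per-frame fraction of bridging ions would be averaged over snapshots; periodic-boundary minimum-image distances would also be needed, which are omitted here for brevity.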

      (4) The authors presented ATP-Mg both as a single ion and as two separate ions; there is no explanation of which of the two versions reflects reality. When presenting ATP-Mg as a single ion, it’s as though it forms a salt with Na+. I assume NaCl, ATP, and MgCl2 were used in the experiment. Why is Cl- not considered? Related to this point, it looks like ATP is just another salt ion studied and much of the Results section is on NaCl, so the emphasis on ATP (“Diverse Roles of ATP”) in the title is somewhat misleading.

      We model ATP and ATP-Mg both as single-bead ions (in rG-RPA) and also as structurally more realistic short multiple-bead polymers (in field-theoretic simulation, FTS). We have now added discussions to clarify our modeling rationale in using and comparing different models for ATP and ATP-Mg, as follows:

      p. 8 (lines 19–36):

      “The complementary nature of our multiple methodologies allows us to focus sharply on the electrostatic aspects of the hydrolysis-independent role of ATP in biomolecular condensation by comparing ATP’s effects with those of simple salt. Here, Caprin1 and pY-Caprin1 are modeled minimally as heteropolymers of charged and neutral beads in rG-RPA and FTS. ATP and ATP-Mg are modeled as simple salts (single-bead ions) in rG-RPA whereas they are modeled with more structural complexity as short charged polymers (multiple-bead chains) in FTS, though the latter models are still highly coarse-grained. Despite this modeling difference, rG-RPA and FTS both rationalize experimentally observed ATP- and NaCl-modulated reentrant LLPS of Caprin1 and a lack of a similar reentrance for pY-Caprin1 as well as a prominent colocalization of ATP with the Caprin1 condensate. Consistently, the same contrasting trends in the effect of NaCl on Caprin1 and pY-Caprin1 are also seen in our coarse-grained MD simulations, though polymer field theories tend to overestimate LLPS propensity [99]. The robustness of the theoretical trends across different modeling platforms underscores electrostatics as a significant component in the diverse roles of ATP in the context of its well-documented ability to modulate biomolecular LLPS via hydrophobic and π-related effects [63, 65, 67].”

      Here, the last sentence quoted above addresses this reviewer’s question about our intended meaning in referring to “diverse roles of ATP” in the title of our paper. To make this point even clearer, we have also added the following sentence to the Abstract (p. 2, lines 12–13):

      “... The electrostatic nature of these features complements ATP’s involvement in π-related interactions and as an amphiphilic hydrotrope, ...”

      Moreover, to enhance readability, we have now added pointers in the rG-RPA part of our paper to anticipate the structurally more complex ATP and ATP-Mg models to be introduced subsequently in the FTS part, as follows:

      p. 9 (lines 13–15):

      “As mentioned above, in the present rG-RPA formulation, (ATP-Mg)<sup>2−</sup> and ATP<sup>4−</sup> are modeled minimally as single-bead ions. They are represented by charged polymer models with more structural complexity in the FTS models below.”

      p. 11 (lines 8–11):

      “These observations from analytical theory will be corroborated by FTS below with the introduction of structurally more realistic models of (ATP-Mg)<sup>2−</sup> and ATP<sup>4−</sup>, together with the possibility of simultaneous inclusion of Na<sup>+</sup>, Cl<sup>−</sup>, and Mg<sup>2+</sup> in the FTS models of Caprin1/pY-Caprin1 LLPS systems.”

      Reviewer #2 (Public Review):

      Summary:

      In this paper, Lin and colleagues aim to understand the role of different salts on the phase behavior of a model protein of significant biological interest, Caprin1, and its phosphorylated variant, pY-Caprin1. To achieve this, the authors employed a variety of methods to complement experimental studies and obtain a molecular-level understanding of ion partitioning inside biomolecular condensates. A simple theory based on rG-RPA is shown to capture the different salt dependencies of Caprin1 and pY-Caprin1 phase separation, demonstrating excellent agreement with experimental results. The application of this theory to multivalent ions reveals many interesting features with the help of multicomponent phase diagrams. Additionally, the use of CG model-based MD simulations and FTS provides further clarity on how counterions can stabilize condensed phases.

      Strengths:

      The greatest strength of this study lies in the integration of various methods to obtain complementary information on thermodynamic phase diagrams and the molecular details of the phase separation process. The authors have also extended their previously proposed theoretical approaches, which should be of significant interest to other researchers. Some of the findings reported in this paper, such as bridging interactions, are likely to inspire new studies using higher-resolution atomistic MD simulations.

      Weaknesses:

      The paper does not have any major issues.

      We are very encouraged by this reviewer’s positive assessment of our work.

      Reviewer #3 (Public Review):

      Authors first use rG-RPA to reproduce two observed trends. Caprin1 does not phase separate at very low salt but then undergoes LLPS with added salt, while further addition of salt reduces its propensity to LLPS. On the other hand, pY-Caprin1 exhibits a monotonic trend where the propensity to phase separate decreases with the addition of salt. This distinction is captured by a two-component model and also when salt ions are explicitly modeled as a separate species with a ternary phase diagram. The predicted ternary diagrams (when co- and counterions are explicitly accounted for) also predict the tendency of ions to co-condense or exclude proteins in the dense phase. Predicted trends are generally in line with the measurement for Cparin1 [sic]. Next, the authors seek to explain the observed difference in phase separation when Arginines are replaced by Lysines creating different variants. In the current rG-RPA type models both Arginine (R) and Lysine (K) are treated equally since non-electrostatic effects are only modeled in a mean-field manner that can be fitted but not predicted. For this reason, coarse-grained MD simulation is suitable. Moreover, MD simulation affords structural features of the condensates. They used a force field that is capable of discriminating R and K. The MD-predicted degrees of LLPS of these variants again are consistent with the measurement. One additional insight emerges from MD simulations that a negative ion can form a bridge between two positively charged residues on the chain. These insights are not possible to derive from rG-RPA. Both rG-RPA and MD simulation become cumbersome when considering multiple types of ions such as Na, Cl, [ATP] and [ATP-Mg] all present at the same time. FTS is well suited to handle this complexity. FTS also provides insights into the co-localization of ions and proteins that is consistent with NMR. By using different combinations of ions they confirm the robustness of the prediction that Caprin1 shows salt-dependent reentrant behavior, adding further support that the differential behavior of Caprin1 and pY-Caprin1 is likely to be mediated by charge-charge interactions.

      We are encouraged by this reviewer’s positive assessment of our manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Analysis:

      Analyze the simulation results to distinguish net charge neutralization and interchain bridging; see MacAinsh et al.

      Please see response above to points (3) and (4) under “Weaknesses” in this reviewer’s public review. We have now added a 1.5-page subsection starting from the bottom of p. 12 to the top of p. 14 to discuss a new extensive analysis of Arg<sup>+</sup>–Cl<sup>−</sup>–Arg<sup>+</sup> configurations to identify bridging interactions, with key results reported in a new Fig. 6 (p. 42). Recent results from MacAinsh, Dey & Zhou (cited now as ref. 72) are included in the added discussion. Relevant advances made in MacAinsh et al., including clarification and classification of salt-mediated interactions in the phase separation of A1-LCD are now mentioned multiple times in the revised manuscript (p. 5, lines 19–20; p. 6, lines 2–5; p. 11, line 30; p. 14, line 10; p. 18, lines 28–29; and p. 20, line 4).

      Writing and presentation

      (1) Cite subtle effects that may be missed by the coarser approaches in this study

      Please see response above to point (1) under “Weaknesses” in this reviewer’s public review.

      (2) Try to distill the findings into a simple set of conclusions

      Please see response above to point (2) under “Weaknesses” in this reviewer’s public review.

      (3) Clarify and simplify physical interpretations

      Please see response above to point (2) under “Weaknesses” in this reviewer’s public review.

      (4) Explain the treatment of ATP-Mg as either a single ion or two separate ions; reconsider modifying the reference to ATP in the title

      Please see response above to point (4) under “Weaknesses” in this reviewer’s public review.

      (5) Minor points:

      p. 4, citation of ref 56: this work shows ATP is a driver of LLPS, not merely a regulator (promotor or suppressor)

      This citation to original ref. 56 (now ref. 63) on p. 4 is now corrected (bottom line of p. 4).

      p. 7 and throughout: “using bulk [Caprin1]” – I assume this is the initial overall Caprin1 concentration. It would avoid confusion to state such concentrations as “initial” or “initial overall”

      We have now added “initial overall concentration” in parentheses on p. 8 (line 4) to clarify the meaning of “bulk concentration”.

      p. 7 and throughout: both mM (also uM) and mg/ml have been used as units of protein concentration and that can cause confusion. Indeed, the authors seem to have confused themselves on p. 9, where 400 (750) mM is probably 400 (750) mg/ml. The same with the use of mM and M for salt concentrations (400 mM Mg2+ but 0.1 and 1.0 M Na+)

      Concentrations are now given in both molarity and mass density in Fig. 1 (p. 37), Fig. 2 (p. 38), Fig. 4 (p. 40), and Fig. 7 (p. 43), as noted in the text on p. 8 (lines 4–5). Inconsistencies and errors in quoting concentrations are now corrected (p. 10, line 18, and p. 11, line 2).

      p. 7, “LCST-like”: isn’t this more like a case of a closed coexistence curve that contains both UCST and LCST?

      The discussion on p. 8 around this observation from Fig. 1d is now expanded, including alluding to the theoretical possibility of a closed co-existence curve mentioned by this reviewer, as follows:

      “Interestingly, the decrease in some of the condensed-phase [pY-Caprin1]s with decreasing T (orange and green symbols for ≲ 20◦C in Fig. 1d trending toward slightly lower [pY-Caprin1]) may suggest a hydrophobicity-driven lower critical solution temperature (LCST)-like reduction of LLPS propensity as temperature approaches ∼ 0◦C, as in cold denaturation of globular proteins [7,23], though the hypothetical LCST is below 0◦C and therefore not experimentally accessible. If that is the case, the LLPS region would resemble those with both a UCST and an LCST [4]. As far as simple modeling is concerned, such a feature may be captured by a FH model wherein interchain contacts are favored by entropy at intermediate to low temperatures and by enthalpy at high temperatures, thus entailing a heat capacity contribution in χ(T) [7,109,110] beyond the temperature-independent ϵ<sub>h</sub> and ϵ<sub>s</sub> used in Fig. 1c,d and Fig. 2. Alternatively, a reduction in overall condensed-phase concentration can also be caused by formation of heterogeneous locally organized structures with large voids at low temperatures even when interchain interactions are purely enthalpic (Fig. 4 of ref. [111]).”

      p. 8 “Caprin1 can undergo LLPS without the monovalent salt (Na+) ions (LLPS regions extend to [Na+] = 0 in Fig. 2e,f”: I don’t quite understand what’s going on here. Is the effect caused by a small amount of counterion (ATP-Mg) that’s calculated according to eq 1 (with z_s set to 0)?

      The discussion of this result in Fig. 2e,f is now clarified as follows (p. 10, lines 8–14 in the revised manuscript):

      “The corresponding rG-RPA results (Fig. 2e–h) indicate that, in the presence of divalent counterions (needed for overall electric neutrality of the Caprin1 solution), Caprin1 can undergo LLPS without the monovalent salt (Na+) ions (LLPS regions extend to [Na+] = 0 in Fig. 2e,f; i.e., ρ<sub>s</sub> = 0, ρ<sub>c</sub> > 0 in Eq. (1)), because the configurational entropic cost of concentrating counterions in the Caprin1 condensed phase is lesser for divalent (z<sub>c</sub> = 2) than for monovalent (z<sub>c</sub> = 1) counterions as only half of the former are needed for approximate electric neutrality in the condensed phase.”
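The neutrality argument in the quoted passage can be made concrete with a back-of-the-envelope counting estimate (an illustrative sketch, not taken from the manuscript; the symbols Q, ρ_con, and ρ_dil are our own placeholders for the condensate's total protein charge and the counterion densities in the two phases):

```latex
% Rough ideal-solution sketch (illustrative only): a condensed droplet
% containing total protein charge +Q must recruit
%   n_c = Q / z_c
% counterions of valency z_c for approximate neutrality. The ideal
% translational entropy cost of transferring them from the dilute phase
% (density rho_dil) into the droplet (density rho_con) is then
\begin{align}
  \Delta S \;\approx\; - n_c\, k_B \ln\!\left(\frac{\rho_{\mathrm{con}}}{\rho_{\mathrm{dil}}}\right)
           \;=\; - \frac{Q}{z_c}\, k_B \ln\!\left(\frac{\rho_{\mathrm{con}}}{\rho_{\mathrm{dil}}}\right),
\end{align}
% so doubling the valency (z_c: 1 -> 2) halves the number of ions that
% must be localized and hence roughly halves the entropic penalty,
% consistent with the statement that only half as many divalent
% counterions are needed for neutrality.
```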

      p. 9 “Despite the tendency for polymer field theories to overestimate LLPS propensity and condensed-phase concentrations”: these limitations should be mentioned earlier, along with the very high concentrations (e.g., 1200 mg/ml) in Fig. 2

      This sentence (now on p. 11, lines 11–18) is now modified to clarify the intended meaning as suggested by this reviewer:

      “Despite the tendency for polymer field theories to overestimate LLPS propensity and condensed-phase concentrations quantitatively because they do not account for ion condensation [99]—which can be severe for small ions with more than ±1 charge valencies, as in the case of condensed [Caprin1] ≳ 120 mM in Fig. 2i–l—our present rG-RPA-predicted semi-quantitative trends are consistent with experiments indicating ...”

      In addition, this limitation of polymer field theories is also mentioned earlier in the text on p. 6, lines 30–31.

      Reviewer #2 (Recommendations For The Authors):

      (1) The current version of the paper goes through many different methodologies, but how these methods complement or overlap in terms of their applicability to the problem at hand may not be so clear. This can be especially difficult for readers not well-versed in these methods. I suggest the authors summarize this somewhere in the paper.

      As mentioned above in response to Reviewer #1, we have now added a subsection with heading “Overview of key observations from complementary approaches” at the beginning of the “Results” section on p. 6 (lines 18–37) and the first line of p. 7 to make our paper more accessible to readers who might not be well-versed in the various theoretical and computational techniques. A few sentences to summarize our key results are added as well to the first paragraph of “Discussion” (p. 18, lines 23–26).

      (2) It wasn’t clear if the authors obtained LCST-type behavior in Figure 1d or if another phenomenon is responsible for the non-monotonic change in dense phase concentrations. At the very least, the authors should comment on the possibility of observing LCST behavior using the rG-RPA model and if modifications are needed to make the theory more appropriate for capturing LCST.

      As mentioned above in response to Reviewer #1, the discussion regarding possible LCST-type behavior in Fig. 1d is now expanded to include two possible physical origins: (i) hydrophobicity-like temperature-dependent effective interactions, and (ii) formation of heterogeneous, more open structures in the condensed phase at low temperatures. Three additional references [109, 110, 111] (from the Dill, Chan, and Panagiotopoulos groups respectively) are now included to support the expanded discussion. Again, the modified discussion is as follows:

      “Interestingly, the decrease in some of the condensed-phase [pY-Caprin1]s with decreasing T (orange and green symbols for ≲ 20◦C in Fig. 1d trending toward slightly lower [pY-Caprin1]) may suggest a hydrophobicity-driven lower critical solution temperature (LCST)-like reduction of LLPS propensity as temperature approaches ∼ 0◦C, as in cold denaturation of globular proteins [7,23], though the hypothetical LCST is below 0◦C and therefore not experimentally accessible. If that is the case, the LLPS region would resemble those with both a UCST and an LCST [4]. As far as simple modeling is concerned, such a feature may be captured by a FH model wherein interchain contacts are favored by entropy at intermediate to low temperatures and by enthalpy at high temperatures, thus entailing a heat capacity contribution in χ(T) [7,109,110] beyond the temperature-independent ϵ<sub>h</sub> and ϵ<sub>s</sub> used in Fig. 1c,d and Fig. 2. Alternatively, a reduction in overall condensed-phase concentration can also be caused by formation of heterogeneous locally organized structures with large voids at low temperatures even when interchain interactions are purely enthalpic (Fig. 4 of ref. [111]).”

      (3) In Figures 4c and 4d, ionic density profiles could be shown as a separate zoomed-in version to make it easier to see the results.

      This is an excellent suggestion. Two such panels are now added to Fig. 4 (p. 40) as parts (g) and (h).

      Reviewer #3 (Recommendations For The Authors):

      I would suggest authors make some minor edits as noted here.

      (1) Please note down the chi values that were used when fitting experimental phase diagrams with rG-RPA theory in Figure 2a,b. At present there aren’t too many such values available in the literature and reporting these would help to get an estimate of effective chi values when electrostatics is appropriately modeled using rG-RPA.

      The χ(T) values and their enthalpic and entropic components ϵh and ϵs used to fit the experimental data in Fig. 1c,d are now stated in the caption for Fig. 1 (p. 37). The same fitted χ(T) values are used in Fig. 2 (p. 38), as now stated in the revised caption for Fig. 2. Please note that for clarity we have now changed the notation from ∆h and ∆s in our originally submitted manuscript to ϵh and ϵs in the revised text (p. 7, last line) as well as in the revised figure captions to conform to the notation in our previous works [18, 71].

      (2) Authors note “monovalent positive salt ions such as Na+ can be attracted, somewhat counterintuitively, into biomolecular condensates scaffolded by positively-charged polyelectrolytic IDRs in the presence of divalent counterions”. This may be due to the fact that the divalent negative counterions present in the dense phase (as seen in the ternary phase diagrams) also recruit a small amount of Na+.

      The reviewer’s comment is valid, as a physical explanation for this prediction is called for. Accordingly, the following sentence is added to p. 10, lines 27–29:

      “This phenomenon arises because the positively charged monovalent salt ions are attracted to the negatively charged divalent counterions in the protein-condensed phase.”

      (3) In the discussion where authors contrast the LLPS propensity of Caprin1 against FUS, TDP43, Brd4, etc., they correctly note the majority of these other proteins have low net charge and possibly higher non-electrostatic interaction that can promote LLPS at room temperature even in the absence of salt. It is also worth noting whether some of these proteins were forced to undergo LLPS with crowding, which is sometimes typical. A quick literature search will make this clear.

      A careful reading of the work in question (Krainer et al., ref. 50) does not suggest that crowders were used to promote LLPS for the proteins the authors studied. Nonetheless, the reviewer’s point regarding the potential importance of crowder effects is well taken. Accordingly, crowder effects are now mentioned briefly in the Introduction (p. 4, line 13), with three additional references on the impact of crowding on LLPS added [30–32] (from the Spruijt, Mukherjee, and Rakshit groups respectively). In this connection, to provide a broader historical context to the introductory discussion of electrostatics effects in biomolecular processes in general, two additional influential reviews (from the Honig and Zhou groups respectively) are now cited as well [15, 16].

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

The authors used structural and biophysical methods to provide insight into Parkin regulation. The breadth of data supporting their findings was impressive and generally well-orchestrated. Still, their results build on recent structural studies, and the stated impact rests on these prior works.

      Strengths:

      (1) After reading through the paper, the major findings are:

      - RING2 and pUbl compete for binding to RING0.

      - Parkin can dimerize.

      - ACT plays an important role in enzyme kinetics.

      (2) The use of molecular scissors in their construct represents a creative approach to examining inter-domain interactions.

      (3) From my assessment, the experiments are well-conceived and executed.

      We thank the reviewer for their positive remark and extremely helpful suggestions.

      Weaknesses:

      The manuscript, as written, is NOT for a general audience. Admittedly, I am not an expert on Parkin structure and function, but I had to do a lot of homework to try to understand the underlying rationale and impact. This reflects, I think, that the work generally represents an incremental advance on recent structural findings.

      To this point, it is hard to understand the impact of this work without more information highlighting the novelty. There are several structures of Parkin in various auto-inhibited states, and it was hard to delineate how this is different.

For the sake of the general audience, we have included all the details of Parkin structures and conformations seen (Extended Fig. 1). The structures in the present study validate the biophysical/biochemical experiments, highlighting key findings. For example, we solved the phospho-Parkin (complex with pUb) structure after treatment with 3C protease (Fig. 2C), which washes off the pUbl-linker, as shown in Fig. 2B. The structure of the pUbl-linker-depleted phospho-Parkin-pUb complex showed that RING2 returned to the closed state (Fig. 2C), confirming the SEC assay in Fig. 2B. Similarly, the structure of the pUbl-linker-depleted phospho-Parkin R163D/K211N-pUb complex (Fig. 3C) validates the SEC data showing that displacement of the pUbl-linker is independent of pUbl interaction with the basic patch on RING0 (Fig. 3B). In addition, the latter structure also revealed a new donor ubiquitin-binding pocket in the linker (connecting REP and RING2) region of Parkin (Fig. 9). Similarly, the trans-complex structure of phospho-Parkin (Fig. 4D) validates the biophysical data (Fig. 4A-C, Fig. 5A-D) showing a trans-complex between phospho-Parkin and native Parkin. The latter also confirmed that the trans-complex is mediated by interactions between pUbl and the basic patch on RING0 (Fig. 4D). Furthermore, we noticed that the ACT region was disordered in the trans-complex of phospho-Parkin (1-140 + 141-382 + pUb) (Fig. 8A), which had ACT from the trans molecule, indicating that ACT might be present in the cis molecule. The latter was validated by the structure of the trans-complex of phospho-Parkin with cis ACT (1-76 + 77-382 + pUb) (Fig. 8C), showing an ordered ACT region. The structural finding was further validated by biochemical assays (Fig. 8D-F, Extended Data Fig. 9C-E).

      The structure of TEV-treated R0RBR (TEV) (Extended Data Fig. 4C) was done to ensure that the inclusion of TEV and treatment with TEV protease did not perturb Parkin folding, an important control for our biophysical experiments.

      As noted, I appreciated the use of protease sites in the fusion protein construct. It is unclear how the loop region might affect the protein structure and function. The authors worked to demonstrate that this did not introduce artifacts, but the biological context is missing.

We thank the reviewer for appreciating the use of protease sites in the fusion protein construct. Protease sites were used to overcome the competing mode of binding that makes interactions very transient and beyond the detection limit of methods such as ITC or SEC. While these interactions are quite transient in nature, they could still be useful for the activation of various Parkin isoforms that lack either the Ubl domain or the RING2 domain (Extended Data Fig. 6, Fig. 10). In addition, our Parkin localization assays suggest an important role for these interactions in the recruitment of Parkin molecules to damaged mitochondria (Fig. 6).

      While it is likely that the binding is competitive between the Ubl and RING2 domains, the data is not quantitative. Is it known whether the folding of the distinct domains is independent? Or are there interactions that alter folding? It seems plausible that conformational rearrangements may invoke an orientation of domains that would be incompatible. The biological context for the importance of this interaction was not clear to me.

This is a great point. In the revised manuscript, we have included quantitative data for phospho-Parkin and untethered ∆Ubl-Parkin (TEV) (Fig. 5B), showing interactions similar to those between phospho-Parkin K211N and untethered ∆Ubl-Parkin (TEV) (Fig. 4B). Folding of the Ubl domain, or of various combinations of RING domains lacking the Ubl, appears normal. Folding of the RING2 domain on its own also appears fine. However, human Parkin lacking the RING2 domain seems to have some folding issues, mainly due to exposure of the hydrophobic pocket on RING0, as also suggested by previous efforts (Gladkova et al., ref. 24; Sauve et al., ref. 29). The latter could be overcome by co-expression of the RING2-lacking Parkin construct with PINK1 (Sauve et al., ref. 29), as phospho-Ubl binds the same hydrophobic pocket on RING0 where RING2 binds. A drastic reduction in the melting temperature of phospho-Parkin (Gladkova et al., ref. 24), very likely due to exposure of the hydrophobic surface between RING0 and RING2, correlates with the folding issues of RING0-exposed human Parkin constructs.

In a biological context, competition between the phospho-Ubl and RING2 domains could block non-specific interaction of phosphorylated ubiquitin-like proteins (phospho-Ub or phospho-NEDD8) with RING0 (Lenka et al., ref. 33) during Parkin activation.

      (5) What is the rationale for mutating Lys211 to Asn? Were other mutations tried? Glu? Ala? Just missing the rationale. I think this may have been identified previously in the field, but not clear what this mutation represents biologically.

      Lys211Asn is a Parkinson’s disease mutation; therefore, we decided to use the same mutation for biophysical studies.  

      I was confused about how the phospho-proteins were generated. After looking through the methods, there appear to be phosphorylation experiments, but it is unclear what the efficiency was for each protein (i.e. what % gets modified). In the text, the authors refer to phospho-Parkin (T270R, C431A), but not clear how these mutations might influence this process. I gather that these are catalytically inactive, but it is unclear to me how this is catalyzing the ubiquitination in the assay.

This is an excellent question. Because differences in phosphorylation status would affect the analysis, we confirmed complete phosphorylation using Phos-Tag SDS-PAGE, as shown below.

      Author response image 1.

Our biophysical experiments in Fig. 5C show that trans-complex formation is mediated by interactions between the basic patch (comprising K161, R163, K211) on RING0 and the phospho-Ubl domain in trans. These interactions result in the displacement of RING2 (Fig. 5C). Parkin activation is mediated by displacement of RING2 and exposure of the catalytic C431 on RING2. While phospho-Parkin T270R/C431A is catalytically dead, its phospho-Ubl domain would bind to the basic patch on RING0 of WT-Parkin, resulting in activation of WT-Parkin, as shown in Fig. 5E. A schematic figure is shown below to explain the same.

      Author response image 2.

      (7) The authors note that "ACT can be complemented in trans; however, it is more efficient in cis", but it is unclear whether both would be important or if the favored interaction is dominant in a biological context.

First, this is an excellent question about the biological context of ACT that needs further exploration. Because of the flexible nature of ACT, it can be complemented both in cis and in trans; we can only speculate that cis interactions between ACT and RING0 are more relevant in a biological context because, during protein synthesis and folding, ACT would be translated before RING2 and would thus occupy the small hydrophobic patch on RING0 in cis. Unpublished data show that Biogen compounds can replace the ACT region to activate Parkin (https://doi.org/10.21203/rs.3.rs-4119143/v1). The latter finding further suggests flexibility in this region.

      (8) The authors repeatedly note that this study could aid in the development of small-molecule regulators against Parkin to treat PD, but this is a long way off. And it is not clear from their manuscript how this would be achieved. As stated, this is conjecture.

      As suggested by this reviewer, we have removed this point in the revised manuscript.

      Reviewer #2 (Public Review):

This manuscript uses biochemistry and X-ray crystallography to further probe the molecular mechanism of Parkin regulation and activation. Using a construct that incorporates cleavage sites between different Parkin domains to increase the local concentration of specific domains (i.e., molecular scissors), the authors suggest that competitive binding between the p-Ubl and RING2 domains for the RING0 domain regulates Parkin activity. Further, they demonstrate that this competition can occur in trans, with the p-Ubl domain of one Parkin molecule binding the RING0 domain of a second monomer, thus activating the catalytic RING1 domain. In addition, they suggest that the ACT domain can similarly bind and activate Parkin in trans, albeit at a lower efficiency than that observed for p-Ubl. The authors also suggest from crystal structure analysis and some biochemical experiments that the linker region between RING2 and repressor elements interacts with the donor ubiquitin to enhance Parkin activity.

Ultimately, this manuscript challenges previous work suggesting that the p-Ubl domain does not bind to the Parkin core in the mechanism of Parkin activation. The 'molecular scissors' strategy is an interesting way to probe this type of competitive binding. However, there are issues with the experimental approach that detract from the overall quality and potential impact of the work.

      We thank the reviewer for their positive remark and constructive suggestions.

      The competitive binding between p-Ubl and RING2 domains for the Parkin core could have been better defined using biophysical and biochemical approaches that explicitly define the relative affinities that dictate these interactions. A better understanding of these affinities could provide more insight into the relative bindings of these domains, especially as it relates to the in trans interactions.

This is an excellent point regarding the relative affinities of pUbl and RING2 for the Parkin core (lacking Ubl and RING2). While we could purify p-Ubl, we failed to purify human Parkin lacking RING2 and phospho-Ubl. These folding issues were likely due to the exposure of a highly hydrophobic surface on RING0 (as shown below) in the absence of pUbl and RING2 in the R0RB construct. Also, RING2 with an exposed hydrophobic surface would be prone to folding issues, making it unsuitable for affinity measurements. A drastic reduction in the melting temperature of phospho-Parkin (Gladkova et al., ref. 24) also highlights the importance of the hydrophobic surface between RING0 and RING2 for Parkin folding/stability. A separate study would be required to test these Parkin constructs from different species and ensure proper folding before using them for affinity measurements.

      Author response image 3.

I also have concerns about the results of using molecular scissors to 'increase local concentrations' and allow for binding to be observed. These experiments are done primarily using proteolytic cleavage of different domains followed by size exclusion chromatography. ITC experiments suggest that the binding constants for these interactions are in the µM range, although these experiments are problematic as the authors indicate in the text that protein precipitation was observed during these experiments. This type of binding could easily be measured in other assays. My issue relates to the ability of a protein complex (comprising the core and cleaved domains) with a Kd of 1 µM to be maintained in an SEC experiment. The off-rates for these complexes must be exceedingly slow, which doesn't really correspond to the low µM binding constants discussed in the text. How do the authors explain this? What is driving the Koff to levels sufficiently slow to prevent dissociation by SEC? Considering that the authors are challenging previous work describing the lack of binding between the p-Ubl domain and the core, these issues should be better resolved in this current manuscript. Further, it's important to have a more detailed understanding of relative affinities when considering the functional implications of this competition in the context of full-length Parkin. Similar comments could be made about the ACT experiments described in the text.

This is a great point. In the revised manuscript, we repeated the ITC measurements in a different buffer system, which yielded clean ITC data. We have also performed ITC measurements using native phospho-Parkin. Phospho-Parkin and untethered ∆Ubl-Parkin (TEV) (Fig. 5B) show affinities similar to those between phospho-Parkin K211N and untethered ∆Ubl-Parkin (TEV) (Fig. 4B). However, Kd values were consistently in the range of 1.0 ± 0.4 µM, which could not address the reviewer's point regarding a slow off-rate. The crystal structure of the trans-complex of phospho-Parkin shows several hydrophobic and ionic interactions between p-Ubl and the Parkin core, suggesting a strong interaction and thus justifying the co-elution on SEC. Additionally, ITC measurements between E2-Ub and P-Parkin-pUb show similar affinity (Kd = 0.9 ± 0.2 µM), and yet they co-elute on SEC (Kumar et al., 2015, EMBO J.).

Ultimately, this work does suggest additional insights into the mechanism of Parkin activation that could contribute to the field. There is a lot of information included in this manuscript, giving it breadth, albeit at the cost of depth for the study of specific interactions. Further, I felt that the authors oversold some of their data in the text, and I'd recommend being a bit more careful when claiming an experiment 'confirms' a specific model. In many cases, there are other models that could explain similar results. For example, in Figure 1C, the authors state that their crystal structure 'confirms' that "RING2 is transiently displaced from the RING0 domain and returns to its original position after washing off the p-Ubl linker". However, it isn't clear to me that RING2 ever dissociated when prepared this way. While there are issues with the work that I feel should be further addressed with additional experiments, there are interesting mechanistic details suggested by this work that could improve our understanding of Parkin activation. However, the full impact of this work won't be fully appreciated until there is a more thorough understanding of the regulation and competitive binding between p-Ubl and RING2 to R0RB both in cis and in trans.

We thank the reviewer for their positive comment. In the revised manuscript, we have included the reviewer's suggestion. The conformational changes in phospho-Parkin were established from the SEC assay (Fig. 2A and Fig. 2B), which shows displacement or association of phospho-Ubl or RING2 after treatment of phospho-Parkin with 3C or TEV, respectively. For crystallization, we first phosphorylated Parkin, in which RING2 is displaced by phospho-Ubl (as shown by SEC), and then treated it with 3C protease, which washes off the pUbl. The Parkin core, separated from phospho-Ubl on SEC, was used for crystallization and structure determination in Fig. 2C, where RING2 has returned to the RING0 pocket, confirming the SEC data (Fig. 2B).

      Reviewer #3 (Public Review):

      Summary:

In their manuscript "Additional feedforward mechanism of Parkin activation via binding of phospho-UBL and RING0 in trans", Lenka et al. present data that could suggest an "in trans" model of Parkin ubiquitination activity. Parkin is an intensely studied E3 ligase implicated in mitophagy, whereby missense mutations to the PARK2 gene are known to cause autosomal recessive juvenile parkinsonism. From a mechanistic point of view, Parkin is extremely complex. Its activity is tightly controlled by several modes of auto-inhibition that must be released by cues of mitochondrial damage. While the general overview of Parkin activation has been mapped out in recent years, several details have remained murky. In particular, whether Parkin dimerizes as part of its feed-forward signaling mechanism, and whether said dimerization can facilitate ligase activation, has remained unclear. Here, Lenka et al. use various truncation mutants of Parkin in an attempt to understand the likelihood of dimerization (in support of an "in trans" model for catalysis).

      Strengths:

      The results are bolstered by several distinct approaches including analytical SEC with cleavable Parkin constructs, ITC interaction studies, ubiquitination assays, protein crystallography, and cellular localization studies.

      We thank the reviewer for their positive remark.

      Weaknesses:

      As presented, however, the storyline is very confusing to follow and several lines of experimentation felt like distractions from the primary message. Furthermore, many experiments could only indirectly support the author's conclusions, and therefore the final picture of what new features can be firmly added to the model of Parkin activation and function is unclear.

      We thank the reviewer for their constructive criticism, which has helped us to improve the quality of this manuscript.

      Major concerns:

      (1) This manuscript solves numerous crystal structures of various Parkin components to help support their idea of in trans transfer. The way these structures are presented more resemble models and it is unclear from the figures that these are new complexes solved in this work, and what new insights can be gleaned from them.

The structures in the present study validate the biophysical/biochemical experiments, highlighting key findings. For example, we solved the phospho-Parkin (complex with pUb) structure after treatment with 3C protease (Fig. 2C), which washes off the pUbl-linker, as shown in Fig. 2B. The structure of the pUbl-linker-depleted phospho-Parkin-pUb complex showed that RING2 returned to the closed state (Fig. 2C), confirming the SEC assay in Fig. 2B. Similarly, the structure of the pUbl-linker-depleted phospho-Parkin R163D/K211N-pUb complex (Fig. 3C) validates the SEC data showing that displacement of the pUbl-linker is independent of pUbl interaction with the basic patch on RING0 (Fig. 3B). In addition, the latter structure also revealed a new donor ubiquitin-binding pocket in the linker (connecting REP and RING2) region of Parkin (Fig. 9). Similarly, the trans-complex structure of phospho-Parkin (Fig. 4D) validates the biophysical data (Fig. 4A-C, Fig. 5A-D) showing a trans-complex between phospho-Parkin and native Parkin. The latter also confirmed that the trans-complex is mediated by interactions between pUbl and the basic patch on RING0 (Fig. 4D). Furthermore, we noticed that the ACT region was disordered in the trans-complex of phospho-Parkin (1-140 + 141-382 + pUb) (Fig. 8A), which had ACT from the trans molecule, indicating that ACT might be present in the cis molecule. The latter was validated by the structure of the trans-complex of phospho-Parkin with cis ACT (1-76 + 77-382 + pUb) (Fig. 8C), showing an ordered ACT region. The structural finding was further validated by biochemical assays (Fig. 8D-F, Extended Data Fig. 9C-E).

      The structure of TEV-treated R0RBR (TEV) (Extended Data Fig. 4C) was done to ensure that the inclusion of TEV and treatment with TEV protease did not perturb Parkin folding, an important control for our biophysical experiments.

      (2) There are no experiments that definitively show the in trans activation of Parkin. The binding experiments and size exclusion chromatography are a good start, but the way these experiments are performed, they'd be better suited as support for a stronger experiment showing Parkin dimerization. In addition, the rationale for an in trans activation model is not convincingly explained until the concept of Parkin isoforms is introduced in the Discussion. The authors should consider expanding this concept into other parts of the manuscript.

We thank the reviewer for appreciating the Parkin dimerization. Our biophysical data in Fig. 5C show that Parkin dimerization is mediated by interactions between phospho-Ubl and RING0 in trans, leading to the displacement of RING2. However, the Parkin K211N (RING0) mutation perturbs the interaction with phospho-Parkin, leading to loss of Parkin dimerization and loss of RING2 displacement (Fig. 5C). The interaction between pUbl and the K211 pocket on RING0 leads to the displacement of RING2, resulting in Parkin activation, as the catalytic residue C431 on RING2 is exposed for catalysis. The biophysical experiment is further confirmed by a biochemical experiment in which the addition of catalytically inactive phospho-Parkin T270R/C431A activates autoinhibited WT-Parkin in trans through the mechanism discussed (a schematic representation is also shown in Author response image 2).

We thank this reviewer for the suggestion regarding Parkin isoforms. In the revised manuscript, we have included Parkin isoforms in the Results section as well.

      (2a) For the in trans activation experiment using wt Parkin and pParkin (T270R/C431A) (Figure 3D), there needs to be a large excess of pParkin to stimulate the catalytic activity of wt Parkin. This experiment has low cellular relevance as these point mutations are unlikely to occur together to create this nonfunctional pParkin protein. In the case of pParkin activating wt Parkin (regardless of artificial point mutations inserted to study specifically the in trans activation), if there needs to be much more pParkin around to fully activate wt Parkin, isn't it just more likely that the pParkin would activate in cis?

To test phospho-Parkin as an activator of Parkin in trans, we used a catalytically inactive version of phospho-Parkin to avoid background activity of p-Parkin. While it is true that a large excess of pParkin (T270R/C431A) is required to activate WT-Parkin in the in vitro set-up, this is not very surprising, as in WT-Parkin the unphosphorylated Ubl domain would block the E2-binding site on RING1. Also, due to interactions between pParkin (T270R/C431A) molecules, the net concentration of pParkin (T270R/C431A) available as an activator would be much lower. However, the Ubl blocking the E2-binding site on RING1 would not be an issue between phospho-Parkin molecules or between Parkin isoforms (lacking the Ubl domain or RING2).

      (2ai) Another underlying issue with this experiment is that the authors do not consider the possibility that the increased activity observed is a result of increased "substrate" for auto-ubiquitination, as opposed to any role in catalytic activation. Have the authors considered looking at Miro as a substrate in order to control for this?

This is quite an interesting point. However, this would only be possible if Parkin were ubiquitinated in trans, as auto-ubiquitination is possible with active Parkin but not with catalytically dead (phospho-Parkin T270R/C431A) or autoinhibited (WT-Parkin) protein. Also, in the previous version of the manuscript, where we used only phospho-Ubl as an activator of Parkin in trans, we tested both Miro1 ubiquitination and auto-ubiquitination, and the results were the same (Author response image 4).

      Author response image 4.

      (2b) The authors mention a "higher net concentration" of the "fused domains" with RING0, and use this to justify artificially cleaving the Ubl or RING2 domains from the Parkin core. This fact should be moot. In cells, it is expected there will only be a 1:1 ratio of the Parkin core with the Ubl or RING2 domains. To date, there is no evidence suggesting multiple pUbls or multiple RING2s can bind the RING0 binding site. In fact, the authors here even show that either the RING2 or pUbl needs to be displaced to permit the binding of the other domain. That being said, there would be no "higher net concentration" because there would always be the same molar equivalents of Ubl, RING2, and the Parkin core.

We apologize for the confusion. "Higher net concentration" refers to the fused domains relative to the domain provided in trans. Due to the competing nature of the interactions between pUbl/RING2 and RING0, the interactions are too transient and beyond the detection limit of biophysical techniques. While the domains are fused in a single polypeptide (for example, RING0-RING2), their effective concentrations are much higher than those of domains provided in trans (for example, pUbl); thus, biophysical methods fail to detect the trans interaction. Treatment with protease removes this advantage of the fused domain, so trans interactions can then be measured using biophysical techniques. However, the conformational changes and the nature of these interactions are very transient, as the data also suggest. Therefore, Parkin molecules will not remain stably associated; rather, Parkin will transiently interact with and activate other Parkin molecules in trans.

      (2c) A larger issue remaining in terms of Parkin activation is the lack of clarity surrounding the role of the linker (77-140); particularly whether its primary role is to tether the Ubl to the cis Parkin molecule versus a role in permitting distal interactions to a trans molecule. The way the authors have conducted the experiments presented in Figure 2 limits the possible interactions that the activated pUbl could have by (a) ablating the binding site in the cis molecule with the K211N mutation; (b) further blocking the binding site in the cis molecule by keeping the RING2 domain intact. These restrictions to the cis parkin molecule effectively force the pUbl to bind in trans. A competition experiment to demonstrate the likelihood of cis or trans activation in direct comparison with each other would provide stronger evidence for trans activation.

This is an excellent point. In the revised manuscript, we have performed experiments using native phospho-Parkin (revised Figure 5), and the results are consistent with those in Figure 2 (revised Figure 4), where we used the K211N mutation.

      (3) A major limitation of this study is that the authors interpret structural flexibility from experiments that do not report directly on flexibility. The analytical SEC experiments report on binding affinity and more specifically off-rates. By removing the interdomain linkages, the accompanying on-rate would be drastically impacted, and thus the observations are disconnected from a native scenario. Likewise, observations from protein crystallography can be consistent with flexibility, but certainly should not be directly interpreted in this manner. Rigorous determination of linker and/or domain flexibility would require alternative methods that measure this directly.

We agree with the reviewer that these methods do not directly capture structural flexibility and that rigorous determination of linker flexibility would require alternative methods that measure it directly. However, due to the complex nature of the interactions and technical limitations, breaking the interdomain linkages was the best available way to capture interactions in trans. Interestingly, all previous studies that report cis interactions between pUbl and RING0 used a similar approach (Gladkova et al., ref. 24; Sauve et al., ref. 29).

      (4) The analysis of the ACT element comes across as incomplete. The authors make a point of a competing interaction with Lys48 of the Ubl domain, but the significance of this is unclear. It is possible that this observation could be an overinterpretation of the crystal structures. Additionally, the rationale for why the ACT element should or shouldn't contribute to in trans activation of different Parkin constructs is not clear. Lastly, the conclusion that this work explains the evolutionary nature of this element in chordates is highly overstated.

      We agree with the reviewer that the significance of Lys48 is unclear. We have presented this just as one of the observations from the crystal structure. As the reviewer suggested, we have removed the sentence about the evolutionary nature of this element from the revised manuscript.

      (5) The analysis of the REP linker element also seems incomplete. The authors identify contacts to a neighboring pUb molecule in their crystal structure, but the connection between this interface (which could be a crystallization artifact) and their biochemical activity data is not straightforward. The analysis of flexibility within this region using crystallographic and AlphaFold modeling observations is very indirect. The authors also draw parallels with linker regions in other RBR ligases that are involved in recognizing the E2-loaded Ub. Firstly, it is not clear from the text or figures whether the "conserved" hydrophobic within the linker region is involved in these alternative Ub interfaces. And secondly, the authors appear to jump to the conclusion that the Parkin linker region also binds an E2-loaded Ub, even though their original observation from the crystal structure seems inconsistent with this. The entire analysis feels very preliminary and also comes across as tangential to the primary storyline of in trans Parkin activation.

We agree with the reviewer that the crystal structure data and biochemical data are not directly linked. In the revised manuscript, we have also highlighted the conserved hydrophobic residue in the linker region at the ubiquitin interface (Fig. 9C and Extended Data Fig. 11A), which was missed in the original manuscript. We note that a very similar analysis and supporting experiments identified donor ubiquitin-binding sites on the IBR and the helix connecting RING1-IBR (Kumar et al., Nature Str. and Mol. Biol., 2017), which several other groups later confirmed. In that study, the Ubl domain of the symmetry-mate Parkin molecule was identified as a mimic of "donor ubiquitin" on the IBR and the helix connecting RING1-IBR.

In the present study, a neighboring pUb molecule in the crystal structure is identified as a donor ubiquitin mimic (Fig. 9C) by supporting biophysical/biochemical experiments. First, we show that the I411A mutation in the REP linker of Parkin perturbs the Parkin interaction with E2~Ub (donor) (Fig. 9F). Another supporting experiment was performed using a Ubiquitin-VS probe assay, which is independent of E2. Assays using Ubiquitin-VS show that the I411A mutation in the REP-RING2 linker perturbs Parkin charging with Ubiquitin-VS (Extended Data Fig. 11B). Furthermore, the biophysical data showing loss of the Parkin interaction with donor ubiquitin are supported by ubiquitination assays: mutations in the REP-RING2 linker perturb Parkin activity (Fig. 9E), confirming the biophysical data. This is further confirmed by mutations (L71A or L73A) on ubiquitin (Extended Data Fig. 11C), which result in loss of Parkin activity. Together, these experiments establish the role of the REP-RING2 linker in the interaction with donor ubiquitin, consistent with other RBRs (Extended Data Fig. 11A).

      While we agree with the reviewer that this appears tangential to the primary storyline of in trans Parkin activation, we decided to include these data because they could be of interest to the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) For clarity, a schematic of the domain architecture of Parkin would be helpful at the outset in the main figures. This will help with the introduction to better understand the protein organization. This is lost in the Extended Figure in my opinion.

      We thank the reviewer for suggesting this, which we have included in Figure 1 of the revised manuscript.

      (2) Related to the competition between the Ubl and RING2 domains, can competition be shown through another method? SPR, ITC, etc? ITC was used in other experiments, but only in the context of mutations (Lys211Asn)? Can this be done with WT sequence?

      This is an excellent suggestion. In the revised Figure 5, we have included an ITC experiment using WT Parkin, and the results are consistent with what we observed using Lys211Asn Parkin.

      (3) The authors also note that "the AlphaFold model shows a helical structure in the linker region of Parkin (Extended Data Figure 10C), further confirming the flexible nature of this region"... but the secondary structure would not be inherently flexible. This is confusing.

      The flexibility refers to the different conformations this linker region adopts in the open and closed states of Parkin, rather than to flexibility within the secondary structure itself. In the revised manuscript, we have explained this point more clearly.

      (4) The manuscript needs extensive revision to improve its readability. Minor grammatical mistakes were prevalent throughout.

      We thank the reviewer for pointing this out, and we have corrected these mistakes in the revised manuscript.

      (5) The confocal images are nice, but inset panels may help highlight the regions of interest (ROIs).

      This is corrected in the revised manuscript.

      (6) Trans is misspelled ("tans") towards the end of the second paragraph on page 16.

      This is corrected in the revised manuscript.

      (7) The schematics are helpful, but some of the lettering in Figure 2 is very small.

      This is corrected in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) A significant portion of the results section refers to the supplement, making the overall readability very difficult.

      We acknowledge this issue; much of the relevant data could not be accommodated in the main figures and thus ended up in the supplement. In the revised manuscript, we have moved some of the supplementary figures into the main figures.

      (2) Interpretation of the experiments utilizing many different Parkin constructs and cleavage scenarios (particularly the SEC and crystallography experiments) is extremely difficult. The work would benefit from a layout of the Parkin model system, highlighting cleavage sites, key domain terminology, and mutations used in the study, presented together and early on in the manuscript. Using this to identify a simpler system of referencing Parkin constructs would also be a large improvement.

      This is a great suggestion. We have incorporated these points in the revised manuscript, which has improved its readability.

      (3) Lines 81-83; the authors say they "demonstrate the conformational changes in Parkin during the activation process", but fail to show any actual conformational changes. Further, much of what is demonstrated in this work (in terms of crystal structures) corroborates existing literature. The authors should use caution not to overstate their original conclusions in light of the large body of work in this area.

      We thank the reviewer for pointing this out. We have corrected the above statement in the revised manuscript to indicate that we meant it in the context of conformational changes in trans.

      (4) Line 446 and 434; there is a discrepancy about which amino acid is present at residue 409. Is this a K408 typo? The authors also present mutational work on K416, but this residue is not shown in the structure panel.

      We thank the reviewer for pointing this out. In the revised manuscript, we have corrected these typos.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer 1 (Public Review):

      I want to reiterate my comment from the first round of reviews: that I am insufficiently familiar with the intricacies of Maxwell’s equations to assess the validity of the assumptions and the equations being used by WETCOW. The work ideally needs assessing by someone more versed in that area, especially given the potential impact of this method if valid.

      We appreciate the reviewer’s candor. Unfortunately, familiarity with Maxwell’s equations is an essential prerequisite for assessing the veracity of our approach and our claims.

      Effort has been made in these revisions to improve explanations of the proposed approach (a lot of new text has been added) and to add new simulations. However, the authors have still not compared their method on real data with existing standard approaches for reconstructing data from sensor to physical space. Refusing to do so because existing approaches are deemed inappropriate (i.e. they “are solving a different problem”) is illogical.

      Without understanding the importance of our model of brain wave activity (cited in the paper), derived from Maxwell’s equations in inhomogeneous and anisotropic brain tissue, it is not possible to critically evaluate the fundamental difference between our method and the standard so-called “source localization” methods with which the Reviewer feels it is important to compare our results. Our method is not “source localization,” a class of techniques based on an inappropriate model of static brain activity (static dipoles sprinkled sparsely in user-defined areas of interest). Just because a method is “standard” does not make it correct. Rather, we are reconstructing the whole-brain, time-dependent electric field potential based upon a model of brain wave activity derived from first principles. It is comparing two methods that are solving different problems that is, by definition, illogical.

      Similarly, refusing to compare their method with existing standard approaches for spatio-temporally describing brain activity, just because existing approaches are deemed inappropriate, is illogical.

      Contrary to the Reviewer’s assertion, we do compare our results with three existing methods for describing spatiotemporal variations of brain activity.

      First, Figures 1, 2, and 6 compare the spatiotemporal variations in brain activity between our method and fMRI, the recognized standard for spatiotemporal localization of brain activity. The statistical comparison in Fig 3 is a quantitative demonstration of the similarity of the activation patterns. It is important to note that these data are simultaneous EEG/fMRI in order to eliminate a variety of potential confounds related to differences in experimental conditions.

      Second, Fig 4 (A-D) compares our method with the most reasonable “standard” spatiotemporal localization method for EEG: mapping of fields in the outer cortical regions of the brain detected at the surface electrodes to the surface of the skull. The consistency of both the location and sign of the activity changes detected by both methods in a “standard” attention paradigm is clearly evident. Further confirmation is provided by comparison of our results with simultaneous EEG/fMRI spatial reconstructions (E-F) where the consistency of our reconstructions between subjects is shown in Fig 5.

      Third, measurements from intra-cranial electrodes, the most direct method for validation, are compared with spatiotemporal estimates derived from surface electrodes and shown to be highly correlated.

      For example, the authors say that “it’s not even clear what one would compare [between the new method and standard approaches]”. How about:

      (1) Qualitatively: compare EEG activation maps. I.e. compare what you would report to a researcher about the brain activity found in a standard experimental task dataset (e.g. their gambling task). People simply want to be able to judge, at least qualitatively on the same data, what the most equivalent output would be from the two approaches. Note, both approaches do not need to be done at the same spatial resolution if there are constraints on this for the comparison to be useful.

      (2) Quantitatively: compare the correlation scores between EEG activation maps and fMRI activation maps

      These comparisons were performed and are already in the paper.

      (1) Fig 4 compares the results with a standard attention paradigm (data and interpretation from Co-author Dr Martinez, who is an expert in both EEG and attention). Additionally, Fig 12 shows detected regions of increased activity in a well-known brain circuit from an experimental task (’reward’) with data provided by Co-author Dr Krigolson, an expert in reward circuitry.

      (2) Correlation scores between EEG and fMRI are shown in Fig 3.

      (3) Very high correlation between the directly measured field from intra-cranial electrodes in an epilepsy patient and those estimated from only the surface electrodes is shown in Fig 9.

      There are an awful lot of typos in the new text in the paper. I would expect a paper to have been proof read before submitting.

      We have cleaned up the typos.

      The abstract claims that there is a “direct comparison with standard state-of-the-art EEG analysis in a well-established attention paradigm”, but no actual comparison appears to have been completed in the paper.

      On the contrary, as mentioned above, Fig 4 compares the results of our method with the state-of-the-art surface spatial mapping analysis, with state-of-the-art time-frequency analysis, and with state-of-the-art fMRI analysis.

      Reviewer 2 (Public Review):

      This is a major rewrite of the paper. The authors have improved the discourse vastly.

      There is now a lot of didactics included but they are not always relevant to the paper.

      The technique described in the paper does in fact leverage several novel methods we have developed over the years for analyzing multimodal space-time imaging data. Each of these techniques has been described in detail in separate publications cited in the current paper. However, the Reviewers stated that the methods were non-standard and unfamiliar to them. In lieu of the Reviewers reading the original publications, we added a significant amount of text that was indeed intended to be didactic. However, we can assure the Reviewer that nothing presented is irrelevant to the paper. We certainly had no desire to make the paper any longer than it needed to be.

      The section on Maxwell’s equation does a disservice to the literature in prior work in bioelectromagnetism and does not even address the issues raised in classic text books by Plonsey et al. There is no logical “backwardness” in the literature. They are based on the relative values of constants in biological tissues.

      This criticism highlights the crux of our paper. Contrary to the assertion that we have ignored the work of Plonsey, we have referenced it in the new text detailing how we constructed Maxwell’s equations appropriate for brain tissue, based on the model suggested by Plonsey, which allows the temporal variations of the magnetic field, but not the time-dependent electric fields, to be ignored.

      However, the assumption, ubiquitous in the vast prior literature on bioelectricity in the brain, that the electric field dynamics can be “based on the relative values of constants in biological tissues,” as the Reviewer correctly summarizes, is precisely the problem. Using average tissue properties does not capture the tissue anisotropy necessary to derive correct expressions for the electric fields. As our prior publications have demonstrated in detail, taking the inhomogeneity and anisotropy of brain tissue into account in the solution of Maxwell’s equations is necessary for properly characterizing brain electric fields, and serves as the foundation of our brain wave theory. This led to the discovery of a new class of brain waves (weakly evanescent transverse cortical waves, WETCOW).

      It is this brain wave model that is used to estimate the dynamic electric field potential from the measurements made by the EEG electrode array. The standard model that ignores these tissue details leads to the ubiquitous “quasi-static approximation” and to the conclusion that the EEG signal cannot be spatially reconstructed. It is this critical gap in the existing literature that is the central new idea of the paper.

      There are reinventions of many standard ideas in terms of physics discourses, like Bayesian theory or PCA etc.

      The discussion of Bayesian theory and PCA is in response to the Reviewers’ complaint that they were unfamiliar with our entropy field decomposition (EFD) method and their request that we compare it with other “standard” methods. Again, we have published extensively on this method (as referenced in the manuscript) and therefore felt that extensive elaboration was unnecessary. Having been asked to provide such elaboration and then being pilloried for it therefore feels somewhat inappropriate in our view. This is particularly disappointing because the Reviewer claims we are presenting “standard” ideas when in fact EFD is a new general framework we developed to overcome the deficiencies of standard statistical and probabilistic data analysis methods, which are insufficient for characterizing the non-linear, non-periodic, interacting fields that are the rule, rather than the exception, in complex dynamical systems such as brain electric fields (or weather, or oceans, or ....).

      The EFD is indeed a Bayesian framework, as Bayes’ theorem is the fundamental starting point of probability theory, but it is developed in a unique and more general fashion than previous data analysis methods. (Again, this is detailed in several references in the paper’s bibliography; the Reviewers requested that an explanation be included in the present paper, however, so we did so.) First, Bayes’ theorem is expressed in terms of a field theory that allows an arbitrary number of field orders and coupling terms. This generality comes with a penalty: it is unclear how to assess the significance of the essentially infinite number of terms. The second feature is a method by which the significant terms are determined automatically from the data itself, via our theory of entropy spectrum pathways (ESP), also detailed in a cited publication, which produces ranked spatiotemporal modes from the data. Rather than being “reinventions of many standard ideas,” these are novel theoretical and computational methods that are central to the EEG reconstruction method presented in the paper.

      I think that the paper remains quite opaque and many of the original criticisms remain, especially as they relate to multimodal datasets. The overall algorithm still remains poorly described.

      It is not clear how to assess the criticism that the algorithm is poorly described when, at the same time, the detail provided is dismissed as “standard.” Certainly the central wave equations that are estimated from the data are precisely described, so it is not clear exactly what the Reviewer is referring to.

      The comparisons to benchmark remain unaddressed and the authors state that they couldn’t get Loreta to work and so aborted that. The figures are largely unaltered, although they have added a few more, and do not clearly depict the ideas. Again, no benchmark comparisons are provided to evaluate the results and the performance in comparison to other benchmarks.

      As we have tried to emphasize in the paper and in the Response to Reviewers, the standard so-called “source localization” methods are NOT a benchmark, as they solve an inappropriate model of brain activity. Once again, static dipole “sources” arbitrarily sprinkled over pre-defined regions of interest bear little resemblance to observed brain waves, or to the dynamic electric field wave equations produced by our brain wave theory, which is derived from a proper solution of Maxwell’s equations in the anisotropic, inhomogeneous, and morphologically complex brain.

      The comparison with LORETA was not abandoned because we could not get it to work, but because we could not get it to run under conditions even remotely similar to the whole-brain activity described by our theory or, more importantly, by any rational theory of dynamic brain activity that might reproduce the exceedingly complex electric field activity observed in numerous neuroscience experiments.

      We take issue with the rather dismissive mention of “a few more” figures that “do not clearly depict the idea” when in fact the figures that have been added have demonstrated additional quantitative validation of the method.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      The paper proposes a new source reconstruction method for electroencephalography (EEG) data and claims that it can provide far superior spatial resolution than existing approaches and also superior spatial resolution to fMRI. This primarily stems from abandoning the established quasi-static approximation to Maxwell’s equations.

      The proposed method brings together some very interesting ideas, and the potential impact is high. However, the work does not provide the evaluations expected when validating a new source reconstruction approach. I cannot judge the success or impact of the approach based on the current set of results. This is very important to rectify, especially given that the work is challenging some long-standing and fundamental assumptions made in the field.

      We appreciate the Reviewer’s efforts in reviewing this paper and have included a significant amount of new text to address their concerns.

      I also find that the clarity of the description of the methods, and how they link to what is shown in the main results hard to follow.

      We have added significantly more detail on the methods, including more accessible explanations of the technical details, and schematic diagrams to visualize the key processing components.

      I am insufficiently familiar with the intricacies of Maxwell’s equations to assess the validity of the assumptions and the equations being used by WETCOW. The work therefore needs assessing by someone more versed in that area. That said, how do we know that the new terms in Maxwell’s equations, i.e. the time-dependent terms that are normally missing from established quasi-static-based approaches, are large enough to need to be considered? Where is the evidence for this?

      The fact that the time-dependent terms are large enough to be considered is essentially the entire focus of the original papers [7,8]. Time-dependent terms in Maxwell’s equations are generally not important for brain electrodynamics at physiological frequencies in homogeneous tissues, but this is not true for regions with strong inhomogeneity and anisotropy.

      I have not come across EFD, and I am not sure many in the EEG field will have. To require the reader to appreciate the contributions of WETCOW only through the lens of the unfamiliar (and far from trivial) approach of EFD is frustrating. In particular, what impact do the assumptions of WETCOW make compared to the assumptions of EFD on the overall performance of SPECTRE?

      We have added an entire new section in the Appendix that provides a very basic introduction to EFD and relates it to more commonly known methods, such as Fourier and Independent Components Analyses.

      The paper needs to provide results showing the improvements obtained when WETCOW or EFD are combined with more established and familiar approaches. For example, EFD can be replaced by a first-order vector autoregressive (VAR) model, i.e. y<sub>t</sub> = Ay<sub>t−1</sub> + e<sub>t</sub> (where y<sub>t</sub> is [num<sub>gridpoints</sub> ∗ 1] and A is [num<sub>gridpoints</sub> ∗ num<sub>gridpoints</sub>] of autoregressive parameters).
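      For readers less familiar with the notation, the first-order VAR model quoted above can be simulated in a few lines. This is a toy illustration with made-up coefficients; it is not part of SPECTRE, EFD, or any method in the paper:

```python
import random

def var1_step(A, y_prev, noise_sd=0.1):
    """One step of a first-order VAR model: y_t = A y_{t-1} + e_t.

    A is a num_gridpoints x num_gridpoints list of rows of autoregressive
    parameters; y_prev is the previous state vector. All values here are
    illustrative only.
    """
    n = len(y_prev)
    return [sum(A[i][j] * y_prev[j] for j in range(n))
            + random.gauss(0.0, noise_sd) for i in range(n)]

# Toy three-"gridpoint" system with a stable coefficient matrix
A = [[0.5, 0.1, 0.0],
     [0.0, 0.4, 0.2],
     [0.1, 0.0, 0.3]]
y = [1.0, 0.0, -1.0]
for _ in range(50):
    y = var1_step(A, y)
```

      Fitting A to measured time series by least squares is what such a "standard" analysis would amount to; the authors' reply below argues that this linear, Gaussian structure is inadequate for the interacting wave fields at issue.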

      The development of EFD, which is independent of WETCOW, stemmed from the necessity of developing a general method for the probabilistic analysis of finitely sampled non-linear interacting fields, which are ubiquitous in measurements of physical systems, of which functional neuroimaging data (fMRI, EEG) are excellent examples. Standard methods (such as VAR) are inadequate in such cases, as discussed in great detail in our EFD publications (e.g., [12,37]). The new appendix on EFD reviews these arguments. It does not make sense to compare EFD with methods which are inappropriate for the data.

      The authors’ decision not to include any comparisons with established source reconstruction approaches does not make sense to me. They attempt to justify this by saying that the spatial resolution of LORETA would need to be very low compared to the resolution being used in SPECTRE, to avoid compute problems. But how does this stop them from using a spatial resolution typically used by the field that has no compute problems, and comparing with that? This would be very informative. There are also more computationally efficient methods than LORETA that are very popular, such as beamforming or minimum norm.

      The primary reason for not comparing with “source reconstruction” (SR) methods is that we are not doing source reconstruction. Our view of brain activity is that it involves continuous, dynamical, non-linear interacting fields throughout the entire brain. Formulating EEG analysis in terms of reconstructing sources is, in our view, like asking “what are the point sources of a sea of ocean waves?” It is simply not an appropriate physical model. A pre-chosen, limited distribution of static dipoles is such a poor model of brain activity that it is not even clear what one would compare, because in our view, as manifest in our computational implementation, one needs a very high density of computational locations throughout the entire brain, including white matter, and the reconstructed modes are waves whose extent can span the entire brain. Our comments about the low resolution of computational methods for SR techniques really express the more overarching concern that these methods are not capable of, or even designed for, detecting the time-dependent fields of non-linear interacting waves that exist everywhere throughout the brain. Moreover, SR methods always give some answer, but in our view the initial conditions upon which they are based (pre-selected regions of activity with a pre-selected number of “sources”) form a highly influential but artificial set of strong computational constraints that will almost always yield an answer consistent with (i.e., biased toward) the expectations of the person formulating the problem, and are therefore potentially misleading.

      In short, something like the following methods needs to be compared:

      (1) Full SPECTRE (EFD plus WETCOW)

      (2) WETCOW + VAR or standard (“simple regression”) techniques

      (3) Beamformer/min norm plus EFD

      (4) Beamformer/min norm plus VAR or standard (“simple regression”) techniques

      The reason that no one has previously been able to solve the EEG inverse problem is the ubiquitous use of methods that are too “simple,” i.e., that are poor physical models of brain activity. We have spent a decade carefully elucidating this statement in numerous highly technical publications. It therefore serves no purpose to return to these “simple” methods for comparison. We do agree, however, that a clearer overview of the advantages of our methods is warranted and have added significant additional text in this revision toward that purpose.

      This would also allow for more illuminating and quantitative comparisons of the real data. For example, a metric of similarity between EEG maps and fMRI can be computed to compare the performance of these methods. At the moment, the fMRI-EEG analysis amounts to just showing fairly similar maps.

      We disagree with this assessment. The correlation coefficient between the spatially localized activation maps is a conservative sufficient statistic for the measure of statistically significant similarity. These numbers are reported in the caption to Figure 5 and have now also been moved to, and highlighted in, the main text.
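      As a concrete illustration of the statistic in question, the Pearson correlation between two activation maps on a common grid can be computed as follows (a generic sketch with hypothetical toy values, not the paper's actual data or code):

```python
import math

def pearson_r(map_a, map_b):
    """Pearson correlation between two flattened activation maps.

    map_a and map_b are equal-length sequences of activation values
    (e.g., EEG and fMRI maps resampled onto the same grid).
    """
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy "maps" on a five-point grid
eeg_map = [0.1, 0.9, 0.4, 0.7, 0.2]
fmri_map = [0.2, 1.0, 0.5, 0.6, 0.1]
r = pearson_r(eeg_map, fmri_map)
```

      A correlation computed this way can then be assessed for significance with a standard t- or permutation test, which is the sense in which it quantifies similarity between maps.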

      There are no results provided on simulated data. Simulations are needed to provide quantitative comparisons of the different methods, to show face validity, and to demonstrate unequivocally the new information that SPECTRE can ’potentially’ provide on real data compared to established methods. The paper ideally needs at least 3 types of simulations, where one thing is changed at a time, e.g.:

      (1) Data simulated using WETCOW plus EFD assumptions

      (2) Data simulated using WETCOW plus e.g. VAR assumptions

      (3) Data simulated using standard lead fields (based on the quasi-static Maxwell solutions) plus e.g. VAR assumptions

      These should be assessed with the multiple methods specified earlier. Crucially the assessment should be quantitative showing the ability to recover the ground truth over multiple realisations of realistic noise. This type of assessment of a new source reconstruction method is the expected standard

      We have now provided results on simulated data, along with a discussion of what constitutes a meaningful simulation comparison. In short, our original paper on the WETCOW theory included a significant number of simulations of predicted results at several spatial and temporal scales. The simulation most relevant to the SPECTRE imaging results is the cortical wave loop predicted by WETCOW theory and demonstrated via numerical simulation in a realistic brain model derived from high-resolution anatomical (HRA) MRI data. The most relevant data with which to compare these simulations are SPECTRE reconstructions from the data that provide the closest approximation to a “gold standard”: reconstructions from intra-cranial EEG (iEEG). We have now included results (new Fig 8) demonstrating the ability of SPECTRE to reconstruct, in iEEG data acquired from an epilepsy patient, dynamically evolving cortical wave loops that match the loop predicted theoretically by WETCOW and demonstrated in realistic numerical simulations.

      The suggested comparison with simple regression techniques serves no purpose, as stated above, since that class of analysis techniques was not designed for the non-linear, non-Gaussian, coupled interacting fields predicted by the WETCOW model. This statement is explicated in great detail in our publications on the EFD approach and in the new appendix material provided in this revision. The suggested simulation of the dipole (i.e., quasi-static) model of brain activity also serves no purpose, as our WETCOW papers have demonstrated in great detail that it is not a reasonable model of dynamic brain activity.

      Reviewer 2 (Public Review):

      Strengths:

      If true and convincing, the proposed theoretical framework and reconstruction algorithm can revolutionize the use of EEG source reconstructions.

      Weaknesses:

      There is very little actual information in the paper about either the forward model or the novel method of reconstruction. Only citations to prior work by the authors are cited with absolutely no benchmark comparisons, making the manuscript difficult to read and interpret in isolation from their prior body of work.

      We have now added a significant amount of material detailing the forward model, our solution to the inverse problem, and the method of reconstruction, in order to remedy this deficit in the previous version of the paper.

      Recommendations for the authors:

      Reviewer 1 (Recommendations):

      It is not at all clear from the main text (section 3.1) and the caption, what is being shown in the activity patterns in Figures 1 and 2. What frequency bands and time points etc? How are the values shown in the figures calculated from the equations in the methods?

      We have added detailed information on the frequency bands reconstructed and the activity pattern generation and meaning. Additional information on the simultaneous EEG/fMRI acquisition details has been added to the Appendix.

      How have the activity maps been thresholded? Where are the color bars in Figures 1 and 2?

      We have now included that information in new versions of the figures. In addition, the quantitative comparison between fMRI and EEG is now presented in a new Figure 2 (now Figure 3).

      P30 “This term is ignored in the current paper”. Why is this term ignored, but other (time-dependent) terms are not?

      These terms are ignored because they are higher-order terms that complicate the processing (and interpretation) but do not substantially change the main results. A note to this effect has been added to the text.

      The concepts and equations in the EFD section are not very accessible (e.g. to someone unfamiliar with IFT).

      We have added a lengthy general and more accessible description of the EFD method in the Appendix.

      Variables in equation 1, and the following equation, are not always defined in a clear, accessible manner. What is ?

      We have added additional information on how Eqn 1 (now Eqn 3) is derived, and the variables therein.

      In the EFD section, what do you mean conceptually by α, i.e. “the coupled parameters α”?

      This sentence has been eliminated, as it was superfluous and confusing.

      How are the EFD and WETCOW sections linked mathematically? What is ψ (in eqn 2) linked to in the WETCOW section (presumably ϕ<sub>ω</sub>?) ?

      We have added more introductory detail at the beginning of the Results to describe the WETCOW theory and how this is related to the inverse problem for EEG.

      What is the difference between data d and signal s in section 6.1.3? How are they related?

      We have added a much more detailed Appendix A where this (and other) details are provided.

      What assumptions have been made to get the form for the information Hamiltonian in eqn3?

      Eq 3 (now Eqn A.5) is actually very general. The approximations come in when constructing the interaction Hamiltonian H<sub>i</sub>.

      P33 “using coupling between different spatio-temporal points that is available from the data itself” I do not understand what is meant by this.

      This was a poorly worded sentence; the section has now been replaced by Appendix A, which contains the statement that prior information “is contained within the data itself.” This refers to the fact that the prior information consists of correlations in the data, rather than measurements independent of the original data. This point is emphasized because in many Bayesian applications, prior information consists of knowledge of quantities acquired independently of the data at hand (e.g., mean values from previous experiments).

      Reviewer 2 (Recommendations):

      Abstract

      The first part presents validation from simultaneous EEG/fMRI data, iEEG data, and comparisons with standard EEG analyses of an attention paradigm. Exactly what constitutes adequate validation or what metrics were used to assess performance is surprisingly absent.

      Subsequently, the manuscript examines a large cohort of subjects performing a gambling task and engaging in reward circuits. The claim is that this method offers an alternative to fMRI.

      Introduction

      Provocative statements require strong backing and evidence. In the first paragraph, the “quasi-static” assumption which is dominant in the field of EEG and MEG imaging is questioned with some classic citations that support this assumption. Instead of delving into why exactly the assumption cannot be relaxed, the authors claim that because the assumption was proved with average tissue properties rather than exact, it is wrong. This does not make sense. Citations to the WETCOW papers are insufficient to question the quasi-static assumption.

      The introduction purports to validate a novel theory and inverse modeling method but poorly outlines the exact foundations of both the theory (WETCOW) and the inverse modeling (SPECTRE) work.

      We have added a new introductory subsection (“A physical theory of brain waves”) to the Results section that provides a brief overview of the foundations of the WETCOW theory and an explicit description of why the quasi-static approximation can be abandoned. We have expanded the subsequent subsection (“Solution to the inverse EEG problem”) to more clearly detail the inverse modeling (SPECTRE) method.

      Section 3.2 Validation with fMRI

      Figure 1 supposedly is a validation of this promising novel theoretical approach that defies the existing body of literature in this field. Shockingly, a single subject’s data are shown in a qualitative manner with absolutely no quantitative comparison anywhere to be found in the manuscript. While there are similarities, there are also differences in reconstructions. What to make out of these discrepancies? Are there distortions that may occur with SPECTRE reconstructions? What are its tradeoffs? How does it deal with noise in the data?

      It is certainly not the case that there are no quantitative comparisons. Correlation coefficients, which are the sufficient statistics for comparison of activation regions, are given in Figure 5 for very specific activation regions. Figure 9 (now Figure 11) shows a t-statistic demonstrating the very high significance of the comparison between multiple subjects. And we have now added a new Figure 7 demonstrating the strongly correlated estimates for full vs surface intra-cranial EEG reconstructions. To make this more clear, we have added a new section “Statistical Significance of the Results”.

      We note that a discussion of the discrepancies between fMRI and EEG was already presented in the Supplementary Material. Therein we discuss the main point that fMRI and EEG are measuring different physical quantities and so should not be expected to be identical. We also highlight the fact that fMRI is prone to significant geometrical distortions from magnetic field inhomogeneities, and to physiological noise. To provide more visibility for this important issue, we have moved this text into the Discussion section.

      We do note that geometric distortions in fMRI data due to suboptimal acquisitions and corrections are all too common. This, coupled with the paucity of open source simultaneous fMRI-EEG data, made it difficult to find good data for comparison. The data on which we performed the quantitative statistical comparison between fMRI and EEG (Fig 5) was collected by co-author Dr Martinez, and was of the highest quality and therefore sufficient for comparison. The data used in Fig 1 and 2 was a well publicized open source dataset but had significant fMRI distortions that made quantitative comparison (i.e., correlation coefficients between subregions in the Harvard-Oxford atlas) suboptimal. Nevertheless, we wanted to demonstrate the method on more than one dataset, and feel that visual similarity is a reasonable measure for this data.

      Figure 2 Are the sample slices being shown? How to address discrepancies? How to assume that these are validations when there are such a level of discrepancies?

      It’s not clear what “sample slices” means. The issue of discrepancies is addressed in the response to the previous query.

      Figure 3 Similar arguments can be made for Figure 3. Here too, a comparison with source localization benchmarks is warranted because many papers have examined similar attention data.

      Regarding the fMRI/EEG comparison, these data are compared quantitatively in the text and in Figure 5.

      Regarding the suggestion to perform standard ’source localization’ analysis, see responses to Reviewer 1.

      Figure 4 While there is consistency across 5 subjects, there are also subtle and not-so-subtle differences. What to make out of them?

      Discrepancies in activation patterns between individuals are a complex neuroscience question that we feel is well beyond the scope of this paper.

      Figures 5 & 6 Figure 5 is also a qualitative figure from two subjects with no appropriate quantification of results across subjects. The same is true for Figure 6.

      On the contrary, Figure 5 contains a quantitative comparison, which is now also described in the text. A quantitative comparison for the epilepsy data in Fig 6 (and C.4-C.6) is now shown in Fig 7.

      Given the absence of appropriate “validation” of the proposed model and method, it is unclear how much one can trust results in Section 4.

      We believe that the quantitative comparisons extant in the original text (and apparently missed by the Reviewer) along with the additional quantitative comparisons are sufficient to merit trust in Section 4.

      What are the thresholds used in maps for Figure 7? Was correction for multiple comparisons performed? The final arguments at the end of section 4 do not make sense. Is the claim that all results of reconstructions from SPECTRE shown here are significant with no reason for multiple comparison corrections to control for false positives? Why so?

      We agree that the last line in Section 4 is misleading and have removed it.

      Discussion is woefully inadequate in addition to the inconclusive findings presented here.

      We have added a significant amount of text to the Discussion to address the points brought up by the Reviewer. And, contrary to the comments of this Reviewer, we believe the statistically significant results presented are not “inconclusive”.

      Supplementary Materials

      This reviewer had an incredibly difficult time understanding the inverse model solution. Even though this has been described in a prior publication by the authors, it is important and imperative that all details be provided here to make the current manuscript complete. The notation itself is so nonstandard. What is Σ<sup>ij</sup>, δ<sup>ij</sup>? Where is the reference for equation (1)? What about the equation for <sup>ˆ</sup>(R)? There are very few details provided on the exact implementation details for the Fourier-space pseudo-spectral approach. What are the dimensions of the problem involved? How were different tissue compartments etc. handled? Equation 1 holds for the entire volume but the measurements are only made on the surface. How was this handled? What is the WETCOW brain wave model? I don’t see any entropy term defined anywhere - where is it?

      We have added more detail on the theoretical and numerical aspects of the inverse problem in two new subsections “Theory” and “Numerical Implementation” in the new section “Solution to the inverse EEG problem”.

      So, how can one understand even at a high conceptual level what is being done with SPECTRE?

      We have added a new subsection “Summary of SPECTRE” that provides a high conceptual level overview of the SPECTRE method outlined in the preceding sections.

      In order to understand what was being presented here, it required the reader to go on a tour of the many publications by the authors where the difficulty in understanding what they actually did in terms of inverse modeling remains highly obscure and presents a huge problem for replicability or reproducibility of the current work.

      We have now included more basic material from our previous papers, and simplified the presentation to be more accessible. In particular, we have now moved the key aspects of the theoretic and numerical methods, in a more readable form, from the Supplementary Material to the main text, and added a new Appendix that provides a more intuitive and accessible overview of our estimation procedures.

      How were conductivity values for different tissue types assigned? Is there an assumption that the conductivity tensor is the same as the diffusion tensor? What does it mean that “in the present study only HRA data were used in the estimation procedure?” Does that mean that diffusion MRI data was not used? What is SYMREG? If this refers to the MRM paper from the authors in 2018, that paper does not include EEG data at all. So, things are unclear here.

      The conductivity tensor is not exactly the same as the diffusion tensor in brain tissues, but they are closely related. While both tensors describe transport properties in brain tissue, they represent different physical processes. The conductivity tensor is often assumed to share the same eigenvectors as the diffusion tensor. There is a strong linear relationship between the conductivity and diffusion tensor eigenvalues, as supported by theoretical models and experimental measurements. For the current study we only used the anatomical data for estimation and assignment of different tissue types, and no diffusion MRI data was used. To register between different modalities, including MNI, HRA, functional MRI, etc., and to transform the tissue assignment into an appropriate space, we used the SYMREG registration method. A comment to this effect has been added to the text.
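      The eigenvector-sharing, linear-eigenvalue relationship described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the scaling constants `k` and `sigma0` are illustrative placeholders, not values from the study.

```python
import numpy as np

def conductivity_from_diffusion(D, k=1.0, sigma0=0.0):
    # Eigendecompose the symmetric diffusion tensor: D = V diag(d) V^T
    d, V = np.linalg.eigh(D)
    # Linear eigenvalue mapping sigma_i = k * d_i + sigma0
    # (k, sigma0 are illustrative placeholders, not fitted values)
    sigma = k * d + sigma0
    # Rebuild a conductivity tensor sharing D's eigenvectors
    return V @ np.diag(sigma) @ V.T

# Example: an anisotropic diffusion tensor (arbitrary units)
D = np.array([[1.7, 0.2, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.1, 0.3]])
S = conductivity_from_diffusion(D, k=0.8, sigma0=0.05)
# Because the two tensors share eigenvectors, they commute: S D = D S
assert np.allclose(S @ D, D @ S)
```

      Sharing eigenvectors means the principal directions of current flow coincide with the principal diffusion directions, which is the usual justification for building anisotropic conductivity maps from diffusion MRI.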

      How can reconstructed volumetric time-series of potential be thought of as the EM equivalent of an fMRI dataset? This sentence doesn’t make sense.

      This sentence indeed did not make sense and has been removed.

      Typical Bayesian inference does not include entropy terms, and entropy estimation doesn’t always lend to computing full posterior distributions. What is an “entropy spectrum pathway”? What is µ∗? Why can’t things be made clear to the reader, instead of incredible jargon used here? How does section 6.1.2 relate back to the previous section?

      It is correct that Bayesian inference typically does not include entropy terms. We believe that their introduction via the theory of entropy spectrum pathways (ESP) is a significant advance in Bayesian estimation, as it provides highly relevant prior information from within the data itself (and therefore always available in spatiotemporal data) that facilitates a practical methodology for the analysis of complex non-linear dynamical systems, as contained in the entropy field decomposition (EFD).

      Section 6.1.3 has now been replaced by a new Appendix A that discusses ESP in a much more intuitive and conceptual manner.

      Section 6.1.3 describes entropy field decomposition in very general terms. What is “non-period”? This section is incomprehensible. Without reference to exactly where in the process this procedure is deployed it is extremely difficult to follow. There seems to be an abuse of notation of using ϕ for eigenvectors in equation (5) and potentials earlier. How do equations 9-11 relate back to the original problem being solved in section 6.1.1? What are multiple modalities being described here that require JESTER?

      Section 6.1.3 has now been replaced by a new Appendix A that covers this material in a much more intuitive and conceptual manner.

      Section 6.3 discusses source localization methods. While most forward lead-field models assume quasistatic approximations to Maxwell’s equations, these are perfectly valid for the frequency content of brain activity being measured with EEG or MEG. Even with quasi-static lead fields, the solutions can have frequency dependence due to the data having frequency dependence. Solutions do not have to be insensitive to detailed spatially variable electrical properties of the tissues. For instance, if a FEM model was used to compute the forward model, this model will indeed be sensitive to the spatially variable and anisotropic electrical properties. This issue is not even acknowledged.

      The frequency dependence of the tissue properties is not the issue. Our theoretical work demonstrates that taking into account the anisotropy and inhomogeneity of the tissue is necessary in order to derive the existence of the weakly evanescent transverse cortical waves (WETCOW) that SPECTRE is detecting. We have added more details about the WETCOW model in the new Section “A physical theory of brain waves” to emphasize this point.

      Arguments to disambiguate deep vs shallow sources can be achieved with some but not all source localization algorithms and do not require a non-quasi-static formulation. LORETA is not even the main standard algorithm for comparison. It is disappointing that there are no comparisons to source localization and that this is dismissed away due to some coding issues.

      Again, we are not doing ’source localization’. The concept of localized dipole sources is anathema to our brain wave model, and so in our view comparing SPECTRE to such methods only propagates the misleading idea that they are doing the same thing. So they are definitely not dismissed due to coding issues. However, because of repeated requests to compare SPECTRE with such methods, we attempted to run a standard source localization method with parameters that would at least provide the closest approximation to what we were doing. This attempt highlighted a serious computational issue in source localization methods that is a direct consequence of the fact that they are not attempting to do what SPECTRE is doing - describing a time-varying wave field, in the technical definition of a ’field’ as an object that has a value at every point in space-time.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Bennion and colleagues present a careful examination of how an earlier set of memories can either interfere with or facilitate memories formed later. This impressive work is a companion piece to an earlier paper by Antony and colleagues (2022) in which a similar experimental design was used to examine how a later set of memories can either interfere with or facilitate memories formed earlier. This study makes contact with an experimental literature spanning 100 years, which is concerned with the nature of forgetting, and the ways in which memories for particular experiences can interact with other memories. These ideas are fundamental to modern theories of human memory, for example, paired-associate studies like this one are central to the theoretical idea that interference between memories is a much bigger contributor to forgetting than any sort of passive decay. 

      Strengths: 

      At the heart of the current investigation is a proposal made by Osgood in the 1940s regarding how paired associates are learned and remembered. In these experiments, one learns a pair of items, A-B (cue-target), and then later learns another pair that is related in some way, either A'-B (changing the cue, delta-cue), or A-B' (changing the target, delta-target), or A'-B' (changing both, delta-both), where the prime indicates that item has been modified, and may be semantically related to the original item. The authors refer to the critical to-be-remembered pairs as base pairs. Osgood proposed that when the changed item is very different from the original item there will be interference, and when the changed item is similar to the original item there will be facilitation. Osgood proposed a graphical depiction of his theory in which performance was summarized as a surface, with one axis indicating changes to the cue item of a pair and the other indicating changes to the target item, and the surface itself necessary to visualize the consequences of changing both. 

      In the decades since Osgood's proposal, there have been many studies examining slivers of the proposal, e.g., just changing targets in one experiment, just changing cues in another experiment. Because any pair of experiments uses different methods, this has made it difficult to draw clear conclusions about the effects of particular manipulations. 

      The current paper is a potential landmark, in that the authors manipulate multiple fundamental experimental characteristics using the same general experimental design. Importantly, they manipulate the semantic relatedness of the changed item to the original item, the delay between the study experience and the test, and which aspect of the pair is changed. Furthermore, they include both a positive control condition (where the exact same pair is studied twice), and a negative control condition (where a pair is only studied once, in the same phase as the critical base pairs). This allows them to determine when the prior learning exhibits an interfering effect relative to the negative control condition and also allows them to determine how close any facilitative effects come to matching the positive control. 

      The results are interpreted in terms of a set of existing theories, most prominently the memory-for-change framework, which proposes a mechanism (recursive reminding) potentially responsible for the facilitative effects examined here. One of the central results is the finding that a stronger semantic relationship between a base pair and an earlier pair has a facilitative effect on both the rate of learning of the base pair and the durability of the memory for the base pair. This is consistent with the memory-for-change framework, which proposes that this semantic relationship prompts retrieval of the earlier pair, and the two pairs are integrated into a common memory structure that contains information about which pair was studied in which phase of the experiment. When semantic relatedness is lower, they more often show interference effects, with the idea being that competition between the stored memories makes it more difficult to remember the base pair. 

      This work represents a major methodological and empirical advance for our understanding of paired-associates learning, and it sets a laudably high bar for future work seeking to extend this knowledge further. By manipulating so many factors within one set of experiments, it fills a gap in the prior literature regarding the cognitive validity of an 80-year-old proposal by Osgood. The reader can see where the observed results match Osgood's theory and where they are inconclusive. This gives us insight, for example, into the necessity of including a long delay in one's experiment, to observe potential facilitative effects. This point is theoretically interesting, but it is also a boon for future methodological development, in that it establishes the experimental conditions necessary for examining one or another of these facilitation or interference effects more closely. 

      We thank the reviewer for their thorough and positive comments -- thank you so much!

      Weaknesses: 

      One minor weakness of the work is that the overarching theoretical framing does not necessarily specify the expected result for each and every one of the many effects examined. For example, with a narrower set of semantic associations being considered (all of which are relatively high associations) and a long delay, varying the semantic relatedness of the target item did not reliably affect the memorability of that pair. However, the same analysis showed a significant effect when the wider set of semantic associations was used. The positive result is consistent with the memory-for-change framework, but the null result isn't clearly informative to the theory. I call this a minor weakness because I think the value of this work will grow with time, as memory researchers and theorists use it as a benchmark for new theory development. For example, the data from these experiments will undoubtedly be used to develop and constrain a new generation of computational models of paired-associates learning. 

      We thank the reviewer for this constructive critique. We agree that the experiments with a narrower set of semantic associations are less informative; in fact, we thought about removing these experiments from the current study, but given that we found results in the ΔBoth condition in Antony et al. (2022) using these stimuli that we did NOT find in the wider set, we thought it was worth including for a thorough comparison. We hope that the analyses combining the two experiment sets (Fig 6-Supp 1) are informative for contextualizing the results in the ‘narrower’ experiments and, as the reviewer notes, for informing future researchers.

      Reviewer #2 (Public Review): 

      Summary: 

      The study focuses on how relatedness with existing memories affects the formation and retention of new memories. Of core interest were the conditions that determine when prior memories facilitate new learning or interfere with it. Across a set of experiments that varied the degree of relatedness across memories as well as retention interval, the study compellingly shows that relatedness typically leads to proactive facilitation of new learning, with interference only observed under specific conditions and immediate test and being thus an exception rather than a rule. 

      Strengths: 

      The study uses a well-established word-pair learning paradigm to study interference and facilitation of overlapping memories. However it goes more in-depth than a typical interference study in the systematic variation of several factors: (1) which elements of an association are overlapping and which are altered (change target, change cue, change both, change neither); (2) how much the changed element differs from the original (word relatedness, with two ranges of relatedness considered); (3) retention period (immediate test, 2-day delay). Furthermore, each experiment has a large N sample size, so both significant effects as well as null effects are robust and informative. 

      The results show the benefits of relatedness, but also replicate interference effects in the "change target" condition when the new target is not related to the old target and when the test is immediate. This provides a reconciliation of some existing seemingly contradictory results on the effect of overlap on memory. Here, the whole range of conditions is mapped to convincingly show how the direction of the effect can flip across the surface of relatedness values. 

      Additional strength comes from supporting analyses, such as analyses of learning data, demonstrating that relatedness leads to both better final memory and also faster initial learning. 

      More broadly, the study informs our understanding of memory integration, demonstrating how the interdependence of memory for related information increases with relatedness. Together with a prior study or retroactive interference and facilitation, the results provide new insights into the role of reminding in memory formation. 

      In summary, this is a highly rigorous body of work that sets a great model for future studies and improves our understanding of memory organization. 

      We thank their reviewer for their thorough summary and very supportive words!

      Weaknesses: 

      The evidence for the proactive facilitation driven by relatedness is very convincing. However, in the finer scale results, the continuous relationship between the degree of relatedness and the degree of proactive facilitation/interference is less clear. This could be improved with some additional analyses and/or context and discussion. In the narrower range, the measure used was AS, with values ranging from 0.03-0.98, where even 0.03 still denotes clearly related words (pious - holy). Within this range from "related" to "related a lot", no relationship to the degree of facilitation was found. The wider range results are reported using a different scale, GloVe, with values from -0.14 to 0.95, where the lower end includes unrelated words (sap - laugh). It is possible that any results of facilitation/interference observed in the wider range may be better understood as a somewhat binary effect of relatedness (yes or no) rather than the degree of relatedness, given the results from the narrower condition. These two options could be more explicitly discussed. The report would benefit from providing clearer information about these measures and their range and how they relate to each other (e.g., not a linear transformation). It would be also helpful to know how the values reported on the AS scale would end up if expressed in the GloVe scale (and potentially vice-versa) and how that affects the results. Currently, it is difficult to assess whether the relationship between relatedness and memory is qualitative or quantitative. This is less of a problem with interdependence analyses where the results converge across a narrow and wider range. 

      We thank the reviewer for this point. While other analyses do show differences across the range of AS values we used, we agree in the case of the memorability analysis in the narrower stimulus set, 48-hr experiment (or combining across the narrower and wider stimulus sets), there could be a stronger influence of binary (yes/no) relatedness. We have now made this point explicitly (p. 26):

      “Altogether, these results show that PI can still occur with low relatedness, like in other studies finding PI in ΔTarget (A-B, A-D) paradigms (for a review, see Anderson & Neely, 1996), but PF occurs with higher relatedness. In fact, the absence of low relatedness pairs in the narrower stimulus set likely led to the strong overall PF in this condition across all pairs (positive y-intercept in the upper right of Fig 3A). In this particular instance, there may have been a stronger influence of a binary factor (whether they are related or not), though this remains speculative and is not the case for other analyses in our paper.”

      Additionally, we have emphasized that the two relatedness metrics are not linear transforms of each other. Finally, as in our response to both your and reviewer #3’s comments below, we now graph relatedness values under a common GloVe metric in Fig 1-Supp 1C (p. 9):

      “Please note that GloVe is an entirely different relatedness metric and is not a linear transformation of AS (see Fig 1-Supp 1C for how the two stimulus sets compare using the common GloVe metric).”
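      For context, a GloVe-based relatedness score is simply the cosine similarity between two word embeddings, whereas AS comes from free-association norms; this difference in provenance is why one scale is not a linear transform of the other. A minimal sketch (the vectors below are toy stand-ins, not real 300-dimensional GloVe embeddings):

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors, in [-1, 1]
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 4-d vectors standing in for real GloVe embeddings
sap = [0.2, -0.5, 0.1, 0.9]
laugh = [-0.3, 0.8, 0.4, -0.1]
sim = cosine_similarity(sap, laugh)
assert -1.0 <= sim <= 1.0
```

      With real embeddings, unrelated word pairs can score near (or below) zero, which is how the wider stimulus set reaches values like -0.14 while the AS-based narrower set is bounded at “related”.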

      A smaller weakness is generalizability beyond the word set used here. Using a carefully crafted stimulus set and repeating the same word pairings across participants and conditions was important for memorability calculations and some of the other analyses. However, highlighting the inherently noisy item-by-item results, especially in the Osgood-style surface figures, makes it challenging to imagine how the results would generalize to new stimuli, even within the same relatedness ranges as the current stimulus sets. 

      We thank the reviewer for this critique. We have added this caveat in the limitations to suggest that future studies should replicate these general findings with different stimulus sets (p. 28):

      “Finally, future studies could ensure these effects are not limited to these stimuli and generalize to other word stimuli in addition to testing other domains (Baek & Papaj, 2024; Holding, 1976).”

      Reviewer #3 (Public Review): 

      Summary: 

      Bennion et al. investigate how semantic relatedness proactively benefits the learning of new word pairs. The authors draw predictions from Osgood (1949), which posits that the degree of proactive interference (PI) and proactive facilitation (PF) of previously learned items on to-be-learned items depends on the semantic relationships between the old and new information. In the current study, participants learn a set of word pairs ("supplemental pairs"), followed by a second set of pairs ("base pairs"), in which the cue, target, or both words are changed, or the pair is identical. Pairs were drawn from either a narrower or wider stimulus set and were tested after either a 5-minute or 48-hour delay. The results show that semantic relatedness overwhelmingly produces PF and greater memory interdependence between base and supplemental pairs, except in the case of unrelated pairs in a wider stimulus set after a short delay, which produced PI. In their final analyses, the authors compare their current results to previous work from their group studying the analogous retroactive effects of semantic relatedness on memory. These comparisons show generally similar, if slightly weaker, patterns of results. The authors interpret their results in the framework of recursive reminders (Hintzman, 2011), which posits that the semantic relationships between new and old word pairs promote reminders of the old information during the learning of the new to-be-learned information. These reminders help to integrate the old and new information and result in additional retrieval practice opportunities that in turn improve later recall. 

      Strengths: 

      Overall, I thought that the analyses were thorough and well-thought-out and the results were incredibly well-situated in the literature. In particular, I found that the large sample size, inclusion of a wide range of semantic relatedness across the two stimulus sets, variable delays, and the ability to directly compare the current results to their prior results on the retroactive effects of semantic relatedness were particular strengths of the authors' approach and make this an impressive contribution to the existing literature. I thought that their interpretations and conclusions were mostly reasonable and included appropriate caveats (where applicable). 

      We thank the reviewer for this kind, effective summary and highlight of the paper’s strengths!

      Weaknesses: 

      Although I found that the paper was very strong overall, I have three main questions and concerns about the analyses. 

      My first concern lies in the use of the narrow versus wider stimulus sets. I understand why the initial narrow stimulus set was defined using associative similarity (especially in the context of their previous paper on the retroactive effects of semantic similarity), and I also understand their rationale for including an additional wider stimulus set. What I am less clear on, however, is the theoretical justification for separating the datasets. The authors include a section combining them and show in a control analysis that there were no directional effects in the narrow stimulus set. The authors seem to imply in the Discussion that they believe there are global effects of the lower average relatedness on differing patterns of PI vs PF across stimulus sets (lines 549-553), but I wonder if an alternative explanation for some of their conflicting results could be that PI only occurs with pairs of low semantic relatedness between the supplemental and base pair and that because the narrower stimulus set does not include the truly semantically unrelated pairs, there was no evidence of PI. 

      We agree with the reviewer’s interpretation here, and we have now directly stated this in the discussion section (p. 26):

      “Altogether, these results show that PI can still occur with low relatedness, like in other studies finding PI in ΔTarget (A-B, A-D) paradigms (for a review see, Anderson & Neely, 1996), but PF occurs with higher relatedness. In fact, the absence of low relatedness pairs in the narrower stimulus set likely led to the strong overall PF in this condition across all pairs (positive y-intercept in the upper right of Fig 3A).”

      As for the remainder of this concern, please see our response to your elaboration on the critique below.

      My next concern comes from the additive change in both measures (change in Cue + change in Target). This measure is simply a measure of overall change, in which a pair where the cue changes a great deal but the target doesn't change is treated equivalently to a pair where the target changes a lot, but the cue does not change at all, which in turn are treated equivalently to a pair where the cue and target both change moderate amounts. Given that the authors speculate that there are different processes occurring with the changes in cue and target and the lack of relationship between cue+target relatedness and memorability, it might be important to tease apart the relative impact of the changes to the different aspects of the pair. 

We thank the reviewer for this great point. First, we should clarify that we only added cue and target similarity values in the ΔBoth condition, which means that all instances of equivalence relate to non-zero values for both cue and target similarity. However, it is certainly possible that cue and target similarity separately influence memorability or interdependence. We have now run this analysis separately for cue and target similarity (but within the ΔBoth condition). For memorability, neither cue nor target similarity independently predicted memorability within the ΔBoth condition in any of the four main experiments (all p > 0.23). Conversely, there were some relationships with interdependence. In the narrower stimulus set, 48-hr delay experiment, both cue and target similarity significantly or marginally predicted base-secondary pair interdependence (Cue: r = 0.30, p = 0.04; Target: r = 0.29, p = 0.054). Notably, both survived partial correlation analyses partialing out the other factor (Cue: r = 0.33, p = 0.03; Target: r = 0.32, p = 0.04). In the wider stimulus set, 48-hr delay experiment, only target similarity predicted interdependence (Cue: r = 0.09, p = 0.55; Target: r = 0.34, p = 0.02), and target similarity also predicted interdependence after partialing out cue similarity (r = 0.34, p = 0.02). Similarly, in the narrower stimulus set, 5-min delay experiment, only target similarity predicted interdependence (Cue: r = 0.01, p = 0.93; Target: r = 0.41, p = 0.005), and target similarity also predicted interdependence after partialing out cue similarity (r = 0.42, p = 0.005). Neither predicted interdependence in the wider stimulus set, 5-min delay experiment (Cue: r = -0.14, p = 0.36; Target: r = 0.09, p = 0.54). We have opted to leave this out of the paper for now, but we could include it if the reviewer believes it is worthwhile.
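As a minimal illustration of the partial correlation computation used here — residualizing both variables on the third and correlating the residuals — the sketch below uses made-up values, not our experimental data:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after regressing z out of both
    (residualize each on z, then correlate the residuals)."""
    def residualize(a, b):
        slope, intercept = np.polyfit(b, a, 1)  # simple linear regression of a on b
        return a - (slope * b + intercept)
    x, y, z = map(np.asarray, (x, y, z))
    return np.corrcoef(residualize(x, z), residualize(y, z))[0, 1]

# Made-up example: x and y each depend on z plus perfectly opposed residuals,
# so the partial correlation is -1 even though the raw correlation is not.
z = np.array([1.0, 2.0, 3.0, 4.0])
x = 2 * z + np.array([1.0, -1.0, -1.0, 1.0])
y = -z + np.array([-1.0, 1.0, 1.0, -1.0])
r_partial = partial_corr(x, y, z)
r_raw = np.corrcoef(x, y)[0, 1]
```

This is only a schematic of the technique; the reported values were computed on the actual similarity and interdependence measures.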

      Note that we address the multiple regression point raised by the reviewer in the critique below.

      Finally, it is unclear to me whether there was any online spell-checking that occurred during the free recall in the learning phase. If there wasn't, I could imagine a case where words might have accidentally received additional retrieval opportunities during learning - take for example, a case where a participant misspelled "razor" as "razer." In this example, they likely still successfully learned the word pair but if there was no spell-checking that occurred during the learning phase, this would not be considered correct, and the participant would have had an additional learning opportunity for that pair. 

      We did not use online spell checking. We agree that misspellings would be considered successful instances of learning (meaning that for those words, they would essentially have successful retrieval more than once). However, we do not have a reason to think that this would meaningfully differ across conditions, so the main learning results would still hold. We have included this in the Methods (p. 29-30):

      “We did not use spell checking during learning, meaning that in some cases pairs could have been essentially retrieved more than once. However, we do not believe this would differ across conditions to affect learning results.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      In terms of the framing of the paper, I think the paper would benefit from a clearer explication of the different theories at play in the introductory section. There are a few theories being examined. Memory-for-change is described in most detail in the discussion, it would help to describe it more deliberately in the intro. The authors refer to a PI account, and this is contrasted with the memory-for-change account, but it seems to me that these theories are not mutually exclusive. In the discussion, several theories are mentioned in passing without being named, e.g., I believe the authors are referring to the fan effect when they mention the difference between delta-cue and delta-target conditions. Perhaps this could be addressed with a more detailed account of the theory underlying Osgood's predictions, which I believe arise from an associative account of paired-associates memory. Osgood's work took place when there was a big debate between unlearning and interference. The current work isn't designed to speak directly to that old debate. But it may be possible to develop the theory a bit more in the intro, which would go a long way towards scaffolding the many results for the reader, by giving them a better sense up front of the theoretical implications. 

      We thank the reviewer for this comment and the nudge to clarify these points. First, we have now made the memory-for-change and remindings accounts more explicit in the introduction, as well as the fact that we are combining the two in forming predictions for the current study (p. 3):

      “Conversely, in favor of the PF account, we consider two main, related theories. The first is the importance of “remindings” in memory, which involve reinstating representations from an earlier study phase during later learning (Hintzman, 2011). This idea centers study-phase retrieval, which involves being able to mentally recall prior information and is usually applied to exact repetitions of the same material (Benjamin & Tullis, 2010; Hintzman et al., 1975; Siegel & Kahana, 2014; Thios & D’Agostino, 1976; Zou et al., 2023). However, remindings can occur upon the presentation of related (but not identical) material and can result in better memory for both prior and new information when memory for the linked events becomes more interdependent (Hintzman, 2011; Hintzman et al., 1975; McKinley et al., 2019; McKinley & Benjamin, 2020; Schlichting & Preston, 2017; Tullis et al., 2014; Wahlheim & Zacks, 2019). The second is the memory-for-change framework, which builds upon these ideas and argues that humans often retrieve prior experiences during new learning, either spontaneously by noticing changes from what was learned previously or by instruction (Jacoby et al., 2015; Jacoby & Wahlheim, 2013). The key advance of this framework is that recollecting changes is necessary for PF, whereas PI occurs without recollection. This framework has been applied to paradigms including stimulus changes, including common paired associate paradigms (e.g., A-B, A-D) that we cover extensively later. Because humans may be more likely to notice and recall prior information when it is more related to new information, these two accounts would predict that semantic relatedness instead promotes successful remindings, which would create PF and interdependence among the traces.”

      Second, as the reviewer suggests, we were referring to the fan effect in the discussion, and we have now made that more explicit (p. 26):

      “We believe these effects arise from the competing processes of impairments between competing responses at retrieval that have not been integrated versus retrieval benefits when that integration has occurred (which occurs especially often with high target relatedness). These types of competing processes appear operative in various associative learning paradigms such as retrieval-induced forgetting (Anderson & McCulloch, 1999; Carroll et al., 2007), and the fan effect (Moeser, 1979; Reder & Anderson, 1980).”

      Finally, our reading of Osgood’s proposal is as an attempt to summarize the qualitative effects of the scattered literature (as of 1949) and did not discuss many theories. For this reason, we generally focus on the directional predictions relating to Osgood’s surface, but we couch it in theories proposed since then.

      It strikes me that the advantage seen for items in the retroactive study compared to the proactive study is consistent with classic findings examining spontaneous recovery. These classic studies found that first-learned materials tended to recover to a level above second-learned materials as time passed. This could be consistent with the memory-for-change proposal presented in the text. The memory-for-change proposal provides a potential cognitive mechanism for the effect, here I'm just suggesting a connection that could be made with the spontaneous recovery literature. 

      We thank the reviewer for this suggestion. Indeed, we agree there is a meaningful point of connection here. We have added the following to the Discussion (p. 27):

      “Additionally, these effects partially resemble those on spontaneous recovery, whereby original associations tend to face interference after new, conflicting learning, but slowly recover over time (either absolutely or relative to the new learning) and often eventually eclipse memory for the new information (Barnes & Underwood, 1959; Postman et al., 1969; Wheeler, 1995). In both cases, original associations appear more robust to change over time, though it is unclear whether these similar outcomes stem from similar mechanisms.”

      Minor recommendations 

      Line 89: relative existing -> relative to existing. 

      Line 132: "line from an unrelated and identical target" -> from an unrelated to identical target (take a look, just needs rephrasing). 

      Line 340: (e.g. peace-shaverazor) I wasn't clear whether this was a typographical error, or whether the intent was to typographically indicate a unified representation. <br /> Line 383: effects on relatedness -> effects of relatedness. 

We thank the reviewer for catching these errors. We have fixed them, and for the third comment, we have clarified that we indeed meant to indicate a unified representation (p. 12):

      “[e.g., peace-shaverazor (written jointly to emphasize the unification)]”

      Page 24: Figure 8. I think the statistical tests in this figure are just being done between the pairs of the same color? Like in the top left panel, delta-cue pro and delta-target retro are adjacent and look equivalent, but there is no n.s. marking for this pair. Could consider keeping the connecting line between the linked conditions and removing the connecting lines that span different conditions. 

      Indeed, we were only comparing conditions with the same color. We have changed the connecting lines to reflect this.

      Page 26 line 612: I think this is the first mention that the remindings account is referred to as the memory-for-change framework, consider mentioning this in the introduction. 

      Thank you – we have now mentioned this in the introduction.

      Lines 627-630. Is this sentence referring to the fan effect? If so it could help the reader to name it explicitly. 

      We have now named this explicitly.

      Reviewer #2 (Recommendations For The Authors): 

      This is a matter of personal preference, but I would prefer PI and PF spelled out instead of the abbreviations. This was also true for RI and RF which are defined early but then not used for 20 pages before being re-used again. In contrast, the naming of the within-subject conditions was very intuitive. 

We appreciate this perspective. However, we prefer to keep the terms PI and PF for the sake of brevity. We now re-introduce abbreviations when they reappear later in the manuscript.

      Osgood surface in Figure 1A could be easier to read if slightly reformatted. For example, target and cue relatedness sides are very disproportional and I kept wondering if that was intentional. The z-axis could be slightly more exaggerated so it's easier to see the critical messages in that figure (e.g., flip from + to - effect along the one dimension). The example word pairs were extremely helpful. 

      Figures 1C and 1D were also very helpful. It would be great if they could be a little bigger as the current version is hard to read. 

      Figure 1B took a while to decipher and could use a little more anticipation in the body of the text. Any reason to plot the x-axis from high to low on this figure? It is confusing (and not done in the actual results figures). I believe the supplemental GloVe equivalent in the supplement also has a confusing x-axis. 

We thank the reviewer for this feedback. We have modified Figure 1A to reduce the disproportionality and accentuate the z-axis changes. We have also made the text in C and D larger. Finally, we have flipped around the x-axis in B and in the supplement.

      The description of relatedness values was rather confusing. It is not intuitive to accept that AS values from 0.03-0.96 are "narrow", as that seems to cover almost the whole theoretical range. I do understand that 0.03 is still a value showing relatedness, but more explanation would be helpful. It is also not clear how the GloVe values compare to the AS values. If I am understanding the measures and ranges correctly, the "narrow" condition could also be called "related only" while the "wide" condition could be called "related and unrelated". This is somewhat verbalized but could be clearer. In general, please provide a straightforward way for a reader to explicitly or implicitly compare those conditions, or even plot the "narrow" condition using both AS values and GloVe values so one can really compare narrow and wider conditions comparing apples with apples. 

      We thank the reviewer for this critique. First, we have now sought to clarify this in the Introduction (p. 11-12):

      “Across the first four experiments, we manipulated two factors: range of relatedness among the pairs and retention interval before the final test. The narrower range of relatedness used direct AS between pairs using free association norms, such that all pairs had between 0.03-0.96 association strength. Though this encompasses what appears to be a full range of relatedness values, pairs with even low AS are still related in the context of all possible associations (e.g., pious-holy has AS = 0.03 but would generally be considered related) (Fig 1B). The stimuli using a wider range of relatedness spanned the full range of global vector similarity (Pennington et al., 2014) that included many associations that would truly be considered unrelated (Fig 1-Supp 1A). One can see the range of the wider relatedness values in Fig 1-Supp 1B and comparisons between narrower and wider relatedness values in Fig 1-Supp 1C.”

      Additionally, as noted in the text above, we have added a new subfigure to Fig 1-Supp 1 that compares the relatedness values in the narrower and wider stimulus sets using the common GloVe metric.

      Considering a relationship other than linear may also be beneficial (e.g., the difference between AS of 0.03 and 0.13 may not be equal to AS of .83 and .93; same with GloVe). I am assuming that AS and GloVe are not linear transforms of each other. Thus, it is not clear whether one should expect a linear (rather than curvilinear or another monotonic) relationship with both of them. It could be as simple as considering rank-order correlation rather than linear correlation, but just wanted to put this out for consideration. The linear approach is still clearly fruitful (e.g., interdependence), but limits further the utility of having both narrow and wide conditions without a straightforward way to compare them. 

      We thank the reviewer for this point. Indeed, AS and GloVe are not linear transforms of each other, but metrics derived from different sources (AS comes from human free associations; GloVe comes from a learned vector space language model). (We noted this in the text and in our response to your above comment.) However, we do have the ability to put all the word pairs into the GloVe metric, which we do in the Results section, “Re-assessing proactive memory and interdependence effects using a common metric”. In this analysis, we used a linear correlation that combined data sets with a similar retention interval and replicated our main findings earlier in the paper (p. 5):

      “In the 48-hr delay experiment, correlations between memorability and cue relatedness in the ΔCue condition [r2(44) > 0.29, p < 0.001] and target relatedness in the ΔTarget condition [r2(44) = 0.2, p < 0.001] were significant, whereas cue+target relatedness in the ΔBoth condition was not [r2(44) = 0.01, p = 0.58]. In all three conditions, interdependence increased with relatedness [all r2(44) > 0.16, p < 0.001].”

Following the reviewer's suggestion to test things using rank order, we also re-created the combined analysis using rank order based on GloVe values rather than the raw GloVe values. The ranks now span 1-90 (because there were 45 pairs in each of the narrower and wider stimulus sets). All results qualitatively held.
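For clarity, rank-order (Spearman) correlation is simply a Pearson correlation computed on the ranks; a numpy-only sketch with made-up values (not our relatedness data) is:

```python
import numpy as np

def rank(a):
    """Ranks 1..n (assumes no ties, as with distinct GloVe similarity values)."""
    order = np.argsort(a)
    ranks = np.empty(len(a), dtype=float)
    ranks[order] = np.arange(1, len(a) + 1)
    return ranks

def spearman(x, y):
    """Spearman rho: the Pearson correlation of the ranks."""
    return np.corrcoef(rank(np.asarray(x)), rank(np.asarray(y)))[0, 1]

# Made-up monotone but nonlinear relationship: the rank correlation is
# exactly 1 while a linear (Pearson) correlation would fall slightly below 1.
relatedness = [0.05, 0.10, 0.30, 0.55, 0.90]
outcome = [0.40, 0.45, 0.50, 0.70, 0.95]
rho = spearman(relatedness, outcome)
```

Because ranking discards spacing between values, this check helps rule out the possibility that the linear results depended on the particular scaling of the GloVe metric.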

      Author response image 1.

      Rank order results.

      Author response image 2.

The raw results from Fig 6-Supp 1 are shown here for reference.

      Reviewer #3 (Recommendations For The Authors):

      In regards to my first concern, the authors could potentially test whether the stimulus sets are different by specifically looking at pairs from the wider stimulus set that overlap with the range of relatedness from the narrow set and see if they replicate the results from the narrow stimulus set. If the results do not differ, the authors could simplify their results section by collapsing across stimulus sets (as they did in the analyses presented in Figure 6 - Supplementary Figure 1). If the authors opt to keep the stimulus sets separate, it would be helpful to include a version of Figure 1b/Figure 1 - Supplementary Figure 1 where the coverage of the two stimulus sets are plotted on the same figure using GloVe similarity so it is easier to interpret the results. 

We have conducted this analysis in two ways, though we note that we will eventually settle upon keeping the stimulus sets separate. First, we examined memorability between the data sets by removing one pair at a time from the wider stimulus set until there was no significant difference (p > 0.05). We did this at the long delay because that was more informative for most of our analyses. Even after reducing the wider stimulus set, the narrow stimulus set still had significantly or marginally higher memorability in all three conditions (p < 0.001 for ΔCue; p < 0.001 for ΔTarget; p = 0.08 for ΔBoth). We reasoned that this was likely because the AS values still differed (all p < 0.001), which would present a clear way for participants to associate words that may not be as strongly similar in vector space (perhaps due to polysemy for individual words). When we ran the analysis a different way that equated AS, we no longer found significant memorability differences (p = 0.13 for ΔCue; p = 0.50 for ΔTarget; p = 0.18 for ΔBoth). However, equating the two data sets in this analysis required us to drop so many pairs to equate the wider stimulus set (because only a few had a direct AS connection; there were 3, 5, and 1 pairs kept in the ΔCue, ΔTarget, and ΔBoth conditions) that we would prefer not to report this result.
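The pair-removal procedure can be sketched as a greedy loop; the memorability scores below are hypothetical and the Welch t-test (via scipy) stands in for whatever test is used to compare the sets:

```python
from scipy.stats import ttest_ind

def equate_by_removal(wider, narrower, alpha=0.05):
    """Drop the lowest-scoring pair from `wider`, one at a time, until a
    Welch t-test against `narrower` is no longer significant (or only two
    pairs remain). Returns the retained scores and the final p-value."""
    wider = sorted(wider)
    p = ttest_ind(wider, narrower, equal_var=False).pvalue
    while p <= alpha and len(wider) > 2:
        wider.pop(0)  # remove the pair pulling the group mean down the most
        p = ttest_ind(wider, narrower, equal_var=False).pvalue
    return wider, p

# Hypothetical memorability scores for illustration only
narrow_scores = [0.78, 0.80, 0.82, 0.79, 0.81, 0.80, 0.77, 0.83]
wide_scores = [0.30, 0.45, 0.60, 0.74, 0.78, 0.80, 0.79, 0.81]
kept, p_final = equate_by_removal(wide_scores, narrow_scores)
```

As the loop illustrates, the procedure can terminate either because the sets are equated (p > alpha) or because too few pairs remain — the latter mirrors the problem we encountered when equating on AS.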

      Additionally, we now plot the two stimulus sets on the same plot (Reviewer 2 also suggested this).

      In regards to my second concern, one potential way the authors could disambiguate the effects of change in cue vs change in target might be to run a multiple linear regression with change in Cue, change in Target, and the change in Cue*change in Target interaction (potentially with random effects of subject identity and word pair identity to combine experiments and control for pair memorability/counterbalancing), which has the additional bonus of potentially allowing the authors to include all word pairs in a single model and better describe the Osgood-style spaces in Figure 6.

      This is a very interesting idea. We set this analysis up as the reviewer suggested, using fixed effects for ΔCue, ΔTarget, and ΔCue*ΔTarget, and random effects for subject and word ID. Because we had a binary outcome variable, we used mixed effects logistic regression. For a given pair, if it had the same cue or target, the corresponding change column received a 0, and if it had a different cue or target, it received a graded value (1 - GloVe value between the new and old cue or target). For this analysis, because we designed this analysis to indicate a treatment away from a repeat (as in the No Δ condition, which had no change for either cues and targets), we omitted control items. For items in the ΔBoth condition, we initially used positive values in both the Cue and Target columns too, with the multiplied ΔCue*ΔTarget value in its own column. We focused these analyses on the 48-hr delay experiments. In both experiments, running it this way resulted in highly significant negative effects of ΔCue and ΔTarget (both p < 0.001), but positive effects of ΔCue*ΔTarget (p < 0.001), presumably because after accounting for the negative independent predictions of both ΔCue and ΔTarget, ΔCue*ΔTarget values actually were better than expected.
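To make the first coding scheme concrete, a small sketch follows (the ASCII condition labels and GloVe values are ours, for illustration only):

```python
def code_predictors(condition, cue_glove=None, target_glove=None):
    """Distance predictors for one pair under the first coding scheme:
    an unchanged element is coded 0; a changed element is coded
    1 - GloVe similarity between the new and old word; the interaction
    term is the product, nonzero only when both elements change."""
    d_cue = (1.0 - cue_glove) if condition in ("dCue", "dBoth") else 0.0
    d_target = (1.0 - target_glove) if condition in ("dTarget", "dBoth") else 0.0
    return d_cue, d_target, d_cue * d_target
```

These three columns, together with random effects for subject and word ID, form the fixed-effect design of the mixed-effects logistic regression described above.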

      We thought that those results were a little strange given that generally there did not appear to be interactions with ΔCue*ΔTarget values, and the positive result was simply due to the other predictors in the model. To show that this is the case, we changed the predictors so that items in the ΔBoth condition had 0 in ΔCue and ΔTarget columns alongside their ΔCue*ΔTarget value. In this case, all three factors negatively predicted memory (all p < 0.001).

      We don't necessarily see this second approach as better, partly because it seems clear to us that any direction you go from identity is just hurting memory, and we felt the need to drop the control condition. We next flipped around the analysis to more closely resemble how we ran the other analyses, using similarity instead of distance. Here, identity along any dimension indicated a 1, a change in any part of the pair involved using that pair’s GloVe value (rather than the 1 – the GloVe value from above), and the control condition simply had zeros in all the columns. In this case, if we code the cue and target similarity values as themselves in the ΔBoth condition, in both 48-hr experiments, cue and target similarity significantly positively predicted memory (narrower set: cue similarity had p = 0.006, target similarity had p < 0.001; wider set: both p < 0.001) and the interaction term negatively predicted memory (p < 0.001 in both). If we code cue and target similarity values as 0s in the ΔBoth condition, all three factors tend to be positive (narrower, Cue: p = 0.11, Target and Interaction: p < 0.001; wider, Cue and Target p < 0.001; Interaction: p = 0.07).

      Ultimately, we would prefer to leave this out of the manuscript in the interest of simplicity and because we largely find that these analyses support our prior conclusions. However, we could include them if the reviewer prefers.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

In this study, Alejandro Rosell et al. uncover the immunoregulatory functions of the RAS-p110α pathway in macrophages, including the extravasation of monocytes from the bloodstream and subsequent lysosomal digestion. Disrupting the RAS-p110α pathway with mouse genetic tools or pharmacological intervention hampers the inflammatory response, leading to delayed resolution and more severe acute inflammatory reactions. The authors propose that activating p110α using small molecules could be a promising approach for treating chronic inflammation. This study provides insights into the roles and mechanisms of p110α in macrophage function and the inflammatory response, while some conclusions remain questionable because of several issues described below. 

(1) Fig. 1B showed that disruption of RAS-p110α causes a decrease in the activation of NF-κB, a crucial transcription factor that regulates the expression of proinflammatory genes. However, the authors observed that disruption of the RAS-p110α interaction results in an exacerbated inflammatory state in vivo, in both localized paw inflammation and systemic inflammatory mediator levels. Also, the authors stated in the introduction that "this disruption leads to a change in macrophage polarization, favoring a more proinflammatory M1 state," according to reference 12. The conclusions drawn from the signaling data and from the models seem contradictory and puzzling. Besides, it is not clear why the protein level of p65 was decreased at 10' and 30'. Was this attributable to degradation of p65 or to experimental variation? 

      We thank the reviewer for this insightful comment and apologize for not previously explaining the implications of the observed decrease in NF-κB activation. We found a decrease in NF-κB activation in response to LPS + IFN-γ stimulation in macrophages lacking RAS-PI3K interaction. As the reviewer pointed out, NF-κB is a key transcription factor that regulates the expression of various proinflammatory genes. To better characterize whether the decrease in p-p65 would lead to a reduction in the expression of specific cytokines, we performed a cytokine array using unstimulated and LPS + IFN-γ stimulated macrophages. The results indicated a small number of cytokines with altered expression, validating that RAS-p110α activation of p-p65 regulates the expression of some inflammatory cytokines. These results have been added to the manuscript and to Figure 1 (panels C and D). In brief, the data suggest an impairment in recruitment factors and inflammatory regulators following the disruption of RAS-p110α signaling in macrophages, which aligns with the observed in vivo phenotype. 

      Our findings indicate that the disruption of RAS-p110α signaling has a complex and multifaceted role in BMDMs. Specifically, monocytes lacking RAS-PI3K are unable to reach the inflamed area due to an impaired ability to extravasate, caused by altered actin cytoskeleton dynamics. Consequently, inflammation is sustained over time, continuously releasing inflammatory mediators. Moreover, we have shown that macrophages deficient in RAS-p110α interaction fail to mount a full inflammatory response due to decreased activation of p-p65, leading to reduced production of a set of inflammatory regulators. Additionally, these macrophages are unable to effectively process phagocytosed material and activate the resolutive phase of inflammation. As a result of these defects, an exacerbated and sustained inflammatory response occurs. 

      Our in vivo data, showing an increase in systemic inflammatory mediators, might be a consequence of the accumulation of monocytes produced by bone marrow progenitors in response to sensed inflammatory stimuli, but unable to extravasate.

      Regarding the sentence in the introduction: "this disruption leads to a change in macrophage polarization, favoring a more proinflammatory M1 state" (reference 12), this was observed in an oncogenic context, which might differ from the role of RAS-p110α in a non-oncogenic situation, as analyzed in this work. We introduced these results as an example to establish the role of RAS-p110α in macrophages, demonstrating its participation in macrophage-dependent responses. Together with our study, these findings clearly indicate that p110α signaling is critical when analyzing full immune responses. Previously, little was known about the role of this PI3K isoform in immune responses. Our data, along with those presented by Murillo et al. (ref. 12), demonstrate that p110α plays a significant role in macrophage function in both oncogenic and inflammatory contexts. Additionally, our results suggest that this role is complex and multifaceted, warranting further investigation to fully understand the complexity of p110α signaling in macrophages.

Regarding the decreased levels of p65 at 10' and 30' in RBD cells, we are still uncertain about the molecular mechanism leading to the observed decrease. No changes in p65 mRNA levels were observed after 30 minutes of LPS+IFNγ treatment, as shown in Author response image 1.

      Author response image 1.

      Preliminary data not shown here suggest that treating macrophages with BYL exhibits a similar effect, indicating a potential pathway for investigation. Considering that the decrease in protein levels is not due to lower mRNA expression, we may infer that post-translational mechanisms are leading to early protein degradation in RAS-p110α deficient macrophages. This could explain the observed decrease in protein activation. However, the specific molecular mechanism responsible for this degradation remains unclear, and further research is necessary to elucidate it. 

(2) In Fig 3, the authors used bone-marrow derived macrophages (BMDMs) instead of isolated monocytes to evaluate monocyte transendothelial migration, which is not sufficiently convincing. In Fig. 3B, the authors evaluated migration in Pik3caWT/- BMDMs and in Pik3caWT/WT BMDMs treated with BYL-719. Given the dose effect of gene expression, the best control is Pik3caWT/- BMDMs treated with BYL-719. 

We thank the reviewer for this comment. While we agree that using BMDMs might not be the most conventional approach for studying monocyte migration, there were several reasons why we still considered them a valid model. Although isolated monocytes are the initial cell type involved in transendothelial migration, bone marrow-derived macrophages (BMDMs) provide a relevant and practical model for studying this process. BMDMs are differentiated from the same bone marrow precursors as monocytes and retain the ability to respond to chemotactic signals, adhere to endothelial cells, and migrate through the endothelium. This makes them a suitable tool for examining the cellular and molecular mechanisms underlying monocyte migration and subsequent macrophage infiltration into tissues. Additionally, BMDMs offer experimental consistency and are easier to manipulate in vitro, enabling more controlled and reproducible studies. 

      In response to the comment regarding Fig. 3B, we appreciate the suggestion to use Pik3ca WT/- BMDMs treated with BYL-719 as a control. However, our rationale for using Pik3ca WT/WT BMDMs treated with BYL-719 was based on a conceptual approach rather than a purely experimental control. The BYL-719 treatment in Pik3ca WT/WT cells was intended to simulate the inhibition of p110α in a fully functional, wild-type context. This allows us to directly assess the impact of p110α inhibition under normal physiological conditions, which is more representative of what would occur in an organism where the full dose of Pik3ca is present. Using Pik3ca WT/- BMDMs treated with BYL-719 as a control may not accurately reflect the in vivo scenario, where any therapeutic intervention would likely occur in the context of a fully functional, wild-type background. Our approach aims to provide a clearer understanding of how p110α inhibition affects cell functionality in a wild-type setting, which is relevant for potential therapeutic applications. Therefore, we considered the use of Pik3ca WT/WT BMDMs with BYL-719 treatment to be a more appropriate control for testing the effects of p110α inhibition in normal conditions.

      (3) In Fig. 4E-4G, the authors observed that elevated levels of serine 3 phosphorylated Cofilin in Pik3caRBD/- BMDMs both in unstimulated and in proinflammatory conditions, and phosphorylation of Cofilin at Ser3 increase actin stabilization, it is not clear why disruption of RAS-p110α binding caused a decrease in the F-actin pool in unstimulated BMDMs? 

      We thank the reviewer for this insightful comment. During the review process, we have carefully quantified all the Western blots conducted. While we did observe an increase in phospho-Cofilin (Ser3) levels in RBD BMDMs, this increase did not reach statistical significance. As a result, we cannot confidently attribute the observed increase in F-actin to this proposed mechanism. We apologize for any confusion this may have caused. Consequently, we have removed these data from Figure 4G and the associated discussion.

      Unfortunately, we have not yet identified the underlying mechanism responsible for this phenotype. Future experiments will focus on exploring potential alterations in other actin-nucleating, regulating, and stabilizing proteins that could account for the observed changes in F-actin levels.

      Reviewer #2 (Public Review): 

      Summary: 

      Cell intrinsic signaling pathways controlling the function of macrophages in inflammatory processes, including in response to infection, injury or in the resolution of inflammation are incompletely understood. In this study, Rosell et al. investigate the contribution of RAS-p110α signaling to macrophage activity. p110α is a ubiquitously expressed catalytic subunit of PI3K with previously described roles in multiple biological processes including in epithelial cell growth and survival, and carcinogenesis. While previous studies have already suggested a role for RAS-p110α signaling in macrophages function, the cell intrinsic impact of disrupting the interaction between RAS and p110α in this central myeloid cell subset is not known. 

      Strengths: 

      Exploiting a sound previously described genetically mouse model that allows tamoxifen-inducible disruption of the RAS-p110α pathway and using different readouts of macrophage activity in vitro and in vivo, the authors provide data consistent with their conclusion that alteration in RAS-p110α signaling impairs the function of macrophages in a cell intrinsic manner. The study is well designed, clearly written with overall high-quality figures. 

      Weaknesses: 

      My main concern is that for many of the readouts, the difference between wild-type and mutant macrophages in vitro or between wild-type and Pik3caRBD mice in vivo is rather modest, even if statistically significant (e.g. Figure 1A, 1C, 2A, 2F, 3B, 4B, 4C). In other cases, such as for the analysis of the H&E images (Figure 1D-E, S1E), the images are not quantified, and it is hard to appreciate what the phenotype in samples from Pik3caRBD mice is or whether this is consistently observed across different animals. Also, the authors claim there is a 'notable decrease' in Akt activation but 'no discernible chance' in ERK activation based on the western blot data presented in Figure 1A. I do not think the data shown supports this conclusion. 

We appreciate the reviewer's careful examination of our data and their observation regarding the modest differences between wild-type and mutant macrophages in vitro, as well as between wild-type and Pik3caRBD mice in vivo. Although the differences observed in Figures 1A, 1C, 2A, 2F, 3B, 4B, and 4C are modest, they are statistically significant, and we believe they are biologically relevant when interpreted within the specific nature of our model. Our study focuses on the disruption of the RAS-p110α interaction, but it should be noted that alternative pathways for p110α activation, independent of RAS, remain functional in this model. Additionally, the model retains the expression of other p110 isoforms, such as p110β, p110γ, and p110δ, which are known to have significant roles in immune responses. Given the overlapping functions of these p110 isoforms, and the fact that our model involves a subtle modification that specifically affects the RAS-p110α interaction without completely abrogating p110α activity, it is understandable that only modest effects are observed in some readouts. The redundancy and compensation by other p110 isoforms likely mitigate the impact of disrupting RAS-mediated p110α activation.

However, despite these modest in vitro differences, it is crucial to highlight that the in vivo effects on inflammation are both clear and consistent. The persistence of inflammation in our model suggests that the RAS-p110α interaction plays a specific, non-redundant role in resolving inflammation, which cannot be fully compensated by other signaling pathways or p110 isoforms. These findings underscore the importance of RAS-p110α signaling in immune homeostasis and suggest that even subtle disruptions in this pathway can lead to significant physiological consequences over time, particularly in the context of inflammation. The modest differences observed may represent early or subtle alterations that could lead to more pronounced phenotypes under specific stress or stimulation conditions. This could be tested across all the figures mentioned. For instance, in Fig. 1A, the Western blot for AKT has been quantified, demonstrating a significant decrease in AKT levels; in Fig. 1C, although the difference in paw inflammation was only a few millimeters in thickness, relative to the size of a mouse paw this difference was readily visible by eye. Moreover, pathological examination of the tissue consistently showed an increase in inflammation in RBD mice. Finally, the consistency of the observed differences across different readouts and experimental setups reinforces the reliability and robustness of our findings. Even modest changes that are consistently observed across different assays and conditions are indicative of genuine biological effects. The statistical significance of the differences indicates that they are unlikely to be due to random variation. This statistical rigor supports the conclusion that the observed effects, albeit modest, are real and warrant further exploration.

      Regarding the analysis of H&E images, we have now quantified the changes with the assistance of the pathologist, Mª Carmen García Macías, who has been added to the author list. We removed the colored arrows from the images and instead quantified fibrin and chromatin remnants as markers of inflammation staging. Loose chromatin, which increases as a consequence of cell death, is higher in the early phases of inflammation and decreases as macrophages phagocytose cell debris to initiate tissue healing. Chromatin content was scored on a scale from 1 to 3, where 1 represents the lowest amount and 3 the highest. The scoring was based on the area within the acute inflammatory abscess where chromatin could be found: 3 for less than 30%, 2 for 30-60%, and 1 for over 60%. Graphs corresponding to this quantification have now been added to Figure 1 and an explanation of the scale has been added to Material and Methods. 

      To further substantiate the extent of macrophage function alteration upon disruption of RAS-p110α signaling, the manuscript would benefit from testing macrophage activity in vitro and in vivo across other key macrophage activities such as bacteria phagocytosis, cytokine/chemokine production in response to titrating amounts of different PAMPs, inflammasome function, etc. This would be generally important overall but also useful to determine whether the defects in monocyte motility or macrophage lysosomal function are selectively controlled downstream of RAS-p110α signaling.  

We thank Reviewer #2 for this comment. To better address the role of RAS-PI3K in macrophage function, we performed additional experiments, some of which have been added to the revised version of the manuscript.

(1) We have performed cytokine microarrays of RAS-p110α deficient macrophages, unstimulated and stimulated with LPS+IFN-γ. The results have been added to the manuscript and to Supplementary Figures S1E and S1F. In brief, the data suggest an impairment in recruitment factors, as well as in inflammatory regulators, after disruption of RAS-p110α signaling in macrophages, which aligns with the observed in vivo phenotype.

      (2) We also conducted phagocytosis assays to analyze the ability of RAS-p110α deficient macrophages to phagocytose 1 µm Sepharose beads, Borrelia burgdorferi, and apoptotic cells. The data reveal varied behavior of RAS-p110α deficient bone marrow-derived macrophages (BMDMs) depending on the target: 

      • Engulfment of Non-biological Particles: RAS-p110α deficient macrophages showed a decreased ability to engulf 1 µm Sepharose beads. This suggests that RAS-p110α signaling is important for the effective phagocytosis of non-biological particles. These findings have now been added to the text and figures have been added to supplementary Fig. S4A

• Response to Bacterial Pathogens: When exposed to Borrelia burgdorferi, RAS-p110α deficient macrophages did not exhibit a change in bacterial uptake. This indicates that RAS-p110α may not play a critical role in the initial phagocytosis of this bacterial pathogen. The observed increase in the phagocytic index, although not statistically significant, might imply a compensatory mechanism or a more complex interaction that warrants further investigation. These findings have now been added to the text, and figures have been added to Supplementary Fig. S4B. These experiments were performed in collaboration with Dr. Anguita, from CIC bioGUNE (Bilbao, Spain), who has consequently been added as an author on the paper.

      • Phagocytosis of Apoptotic Cells: There were no differences in the phagocytosis rate of apoptotic cells between RAS-p110α deficient and control macrophages at early time points. However, the accumulation of engulfed material at later time points suggests a possible delay in the processing and degradation of apoptotic cells in the absence of RAS-p110α signaling.

      These findings highlight the complexity of RAS-p110α's involvement in phagocytic processes and suggest that its role may vary with different types of phagocytic targets. 

      Furthermore, given the key role of other myeloid cells besides macrophages in inflammation and immunity it remains unclear whether the phenotype observed in vivo can be attributed to impaired macrophage function. Is the function of neutrophils, dendritic cells or other key innate immune cells not affected? 

      Thank you for this insightful comment. We understand the key role of other myeloid cells in inflammation and immunity. However, our study specifically focuses on the role of macrophages. Our data show that disruption of RAS-PI3K leads to a clear defect in macrophage extravasation, and our in vitro data demonstrate issues in macrophage cytoskeleton and phagocytosis, aligning with the in vivo phenotype.

      Experiments investigating the role of RAS-PI3K in neutrophils, dendritic cells, or other innate immune cells are beyond the scope of this study. Understanding these interactions would indeed require separate, comprehensive studies and the generation of new mouse models to disrupt RAS-PI3K exclusively in specific cell types.

      Furthermore, during paw inflammation experiments, polymorphonuclear cells were present from the initial phases of the inflammatory response. What caught our attention was the prolonged presence of these cells. In conversation with our in-house pathologist, she mentioned the lack of macrophages to remove dead polymorphonuclear cells in our RAS-PI3K mutant mice. Specific staining for macrophages confirmed the absence of macrophages in the inflamed node of mutant mice.

      We acknowledge that further research is necessary to elucidate the effects on other myeloid cells. However, our current findings provide clear evidence of a decrease in inflammatory monocytes and defective macrophage responses to inflammation, both in vivo and in vitro. We believe these results significantly contribute to understanding the role of RAS-PI3K in macrophage function during inflammation.

      Compelling proof of concept data that targeting RAS-p110α signalling constitutes indeed a putative approach for modulation of chronic inflammation is lacking. Addressing this further would increase the conceptual advance of the manuscript and provide extra support to the authors' suggestion that p110α inhibition or activation constitute promising approaches to manage inflammation. 

      We thank Reviewer #2 for this insightful comment. In our manuscript, we have demonstrated through multiple experiments that the inhibition of p110α, either by disrupting RAS-p110α signaling or through the use of Alpelisib (BYL-719), has a modulatory effect on inflammatory responses. However, we acknowledge that we have not activated the pathway due to the unavailability of a suitable p110α activator until the concluding phase of our study.

We recognize the importance of this point and are eager to investigate both the inhibition and activation of p110α as potential approaches to managing inflammation in well-established inflammatory disease models. We believe that such comprehensive studies would significantly enhance the conceptual advance and translational relevance of our findings.

However, it is essential to note that the primary aim of our current work was to demonstrate the role of RAS-p110α in the inflammatory responses of macrophages. We have successfully shown that RAS-p110α influences macrophage behavior and inflammatory signaling. Expanding the scope to include disease models and pathway activation studies would be an extensive project that goes beyond the current objectives of this manuscript. While our present study establishes the foundational role of RAS-p110α in macrophage-mediated inflammatory responses, we agree that further investigation into both p110α inhibition and activation in disease models is crucial. We are keen to pursue this line of research in future studies, which we believe will provide robust evidence supporting the therapeutic potential of targeting RAS-p110α signaling in chronic inflammation.

      Finally, the analysis by FACS should also include information about the total number of cells, not just the percentage, which is affected by the relative change in other populations. On this point, Figure S2B shows a substantial, albeit not significant (with less number of mice analysed), increase in the percentage of CD3+ cells. Is there an increase in the absolute number of T cells or does this apparent relative increase reflect a reduction in myeloid cells? 

      We thank the reviewer for this comment, which we have addressed in the revised version of the manuscript. Regarding the total number of cells analyzed, we have added to the Materials and Methods section that in all our studies, a total of 50,000 cells were analyzed (line 749). The percentages of cells are related to these 50,000 events. Additionally, we have increased the number of mice analyzed by including new mice for CD3+ cell analysis. Despite this, the results remain not significant.

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):   

      (1) It is recommended to provide a graphical abstract to summarize the multiple functions of RAS-p110α pathway in monocyte/macrophages that the authors proposed 

We thank the reviewer for this useful recommendation. A graphical abstract has now been added to the study.

      (2) Western blots in this paper need quantification and a measure of reproducibility 

      We have now added a graph with the quantification of the western blots performed in this work as a measure of reproducibility. 

      (3) Representative flow data and gating strategy should be included

We have now added a description of the gating strategy to the Materials and Methods section.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      (1) Peptides were synthesized with fluorescein isothiocyanate (FITC) and Tat tag, and then PEGylated with methoxy PEG Succinimidyl Succinate.

      I have two concerns about the peptide design. First, FTIC was intended "for monitoring" (line 129), but was never used in the manuscript. Second, PEGylation targets the two lysine sidechains on the Tat, which would alter its penetration property.

      We conducted an analysis of the cellular trafficking of FITC-tagged peptides following their permeabilization into cells.

      Author response image 1.

      However, we did not include it in the main text because it is a basic result.

As can be seen in the figure above, after PEGylation and cell penetration, FITC staining was observed in the cells, indicating that PEGylation does not affect the peptide's ability to penetrate cells.

      (2) "Superdex 200 increase 10/300 GL column" (line 437) was used to isolate mono/di PEGylated PDZ and separate them from the residual PEG and PDZ peptide. "m-PEG-succinimidyl succinate with an average molecular weight of 5000 Da" (lines 133 and 134).

      To my knowledge, the Superdex 200 increase 10/300 GL column is not suitable and is unlikely to produce traces shown in Figure 1B.

As Superdex 200 Increase 10/300 GL features a fractionation range of 10,000 to 600,000 Da, we used it to fractionate the PEGylated products, including di-PEGylated PDZ (approx. 15 kDa) and mono-PEGylated PDZ (approx. 10 kDa), from residuals (PDZ and PEG), demonstrating successful isolation of the PEGylated products (Figure 1C). Considering that the molecular weights of PDZ and PEG are approximately 4.1 kDa and 5.0 kDa, respectively, the late-eluting peaks from SEC likely represent a mixed absorbance of PDZ and PEG at 215 nm.

      However, as the reviewer pointed out, it could be unreasonable to annotate peaks representing PDZ and PEG, respectively, from mixed absorbance detected in a region (11-12 min) beyond the fractionation range.

      In our revised manuscript, therefore, multiple peaks in the late eluting volume (11-12 min) were labeled as 'Residuals' all together. As a reference, the revised figure 1B includes a chromatogram of pure PDZ-WT under the same analytic condition.

      Therefore, we changed Fig.1B to new results as followed:

      (3) "the in vivo survival effect of LPS and PDZ co-administration was examined in mice. The pretreatment with WT PDZ peptide significantly increased survival and rescued compared to LPS only; these effects were not observed with the mut PDZ peptide (Figure 2a)." (lines 159-160).

      Fig 2a is the weight curve only. The data is missing in the manuscript.

We added the survival curve to Fig. 2A as follows:

      (4) Table 1, peptide treatment on ALT and AST appears minor.

In mice treated with LPS, levels of ALT and AST in the blood are elevated, but these levels decrease upon treatment with WT PDZ. However, the use of mut PDZ does not result in significant changes. Figure 3A shows inflammatory cells within the central vein, yet no substantial hepatotoxicity is observed during the 5-day treatment with LPS. Normally, the ranges of ALT and AST in C57BL/6 mice are 16–200 U/L and 46–221 U/L, respectively, according to UCLA Diagnostic Labs. Therefore, the values in all experiments fall within these normal ranges. In summary, a 5-day treatment with LPS induces inflammation in the liver but is too short a duration to induce hepatotoxicity, resulting in lower values.

(5) MitoTracker Green FM shouldn't produce red images in Figure 6.

We replaced Figs. 6A and 6B with the new (green-channel) results as follows:

      (6) Figure 5. Comparison of mRNA expression in PDZ-treated BEAS-2B cells. Needs a clearer and more detailed description both in the main text and figure legend. The current version is very hard to read.

We revised Fig. 5A to make it easier to understand, and added more detailed results and an expanded figure legend, as follows:

      Results Section in Figure 5:

“…we performed RNA sequencing analysis. The RNA-seq analysis captured the expression patterns of 24,424 genes across the comparison combinations; among these, 51 overlapping genes in four gene categories showed similar behavior across the comparisons (Figure 5a). Compared to the control group, LPS alone, WT PDZ+LPS, and mut PDZ+LPS were all upregulated above the average value for each gene; when LPS treatment alone was compared with WT PDZ+LPS, genes were at or below the average (downregulated). When comparing LPS treatment alone with mut PDZ+LPS, about half of the genes were upregulated. Regarding the similarity between comparison combinations, the comparison combination with LPS…”

      Figure 5 Legend Section:

      “Figure 5. Comparison of mRNA expression in PDZ-treated BEAS-2B cells.

      BEAS-2B cells were treated with wild-type PDZ or mutant PDZ peptide for 24 h and then incubated with LPS for 2 h, after which RNA sequencing analysis was performed. (a) The heat map shows the general regulation pattern of about 51 inflammation-related genes that are differentially expressed when WT PDZ and mut PDZ are treated with LPS, an inflammatory substance. All samples are RED = upregulated and BLUE = downregulated relative to the gene average. Each row represents a gene, and the columns represent the values of the control group treated only with LPS and the WT PDZ and mut PDZ groups with LPS. This was used by converting each log value into a fold change value. All genes were adjusted to have the same mean and standard deviation, the unit of change is the standard deviation from the mean, and the color value range of each row is the same. (b) Significant genes were selected using Gene category chat (Fold change value of 2.00 and normalized data (log2) value of 4.00). The above pie chart shows the distribution of four gene categories when comparing LPS versus control, WT PDZ+LPS/LPS, and mut PDZ+LPS/LPS. The bar graph below shows RED=upregulated, GREEN=downregulated for each gene category, and shows the number of upregulated and downregulated genes in each gene category. (c) The protein-protein interaction network constructed by the STRING database differentially displays commonly occurring genes by comparing WT PDZ+LPS/LPS, mut PDZ+LPS/LPS, and LPS. These nodes represent proteins associated with inflammation, and these connecting lines denote interactions between two proteins. Different line thicknesses indicate types of evidence used in predicting the associations.”

      Reviewer 2:

      (1) In this paper, the authors demonstrated the anti-inflammatory effect of PDZ peptide by inhibition of NF-kB signaling. Are there any results on the PDZ peptide-binding proteins (directly or indirectly) that can regulate LPS-induced inflammatory signaling pathway? Elucidation of the PDZ peptide-its binding partner protein and regulatory mechanisms will strengthen the author's hypothesis about the anti-inflammatory effects of PDZ peptide

As mentioned in the Discussion section, we believe it is crucial to identify proteins that directly interact with PDZ and regulate it. Because such direct interactions can modulate intracellular signaling pathways, we plan to express GST-PDZ, pull down binding partners from cell lysates, and characterize them by LC-MS/MS. We intend to pursue these experiments further and submit the findings for publication.

      (2) The authors presented interesting insights into the therapeutic role of the PDZ motif peptide of ZO-1. PDZ domains are protein-protein interaction modules found in a variety of species. It has been thought that many cellular and biological functions, especially those involving signal transduction complexes, are affected by PDZ-mediated interactions. What is the rationale for selecting the core sequence that regulates inflammation among the PDZ motifs of ZO-1 shown in Figure 1A?

      The rationale for selecting the core sequence that regulates inflammation among the PDZ motifs of ZO-1, as shown in Figure 1A, is grounded in the specific roles these motifs play in signal transduction pathways that are crucial for inflammatory processes. PDZ domains are recognized for their ability to function as scaffolding proteins that organize signal transduction complexes, crucial for modulating cellular and biological functions. The chosen core sequence is particularly important because it is conserved across ZO-1, ZO-2, and ZO-3, indicating a fundamental role in maintaining cellular integrity and signaling pathways. This conservation suggests that the sequence’s involvement in inflammatory regulation is not only significant in ZO-1 but also reflects a broader biological function across the ZO family.

      (3) In Figure 3, the authors showed the representative images of IHC, please add the quantification analysis of Iba1 expression and PAS-positive cells using Image J or other software. To help understand the figure, an indication is needed to distinguish specifically stained cells (for example, a dotted line or an arrow).

We added the semi-quantitative results to Figs. 4d, e, and f as follows:

Result section: “The specific physiological mechanism by which WT PDZ peptide decreases LPS-induced systemic inflammation in mice and the signal molecules involved remain unclear. These were confirmed by a semi-quantitative analysis of Iba-1 immunoreactivity and PAS staining in liver, kidney, and lung, respectively (Figures 4d, e, and f). To examine whether WT PDZ peptide can alter LPS-induced tissue damage in the kidney, a cell toxicity assay was performed (Figure 3g). LPS induced cell damage in the kidney; however, WT PDZ peptide could significantly alleviate the toxicity, whereas mut PDZ peptide could not. Because cytotoxicity caused by LPS is frequently due to ROS production in the kidney (Su et al., 2023; Qiongyue et al., 2022), ROS production in the mitochondria was investigated in renal mitochondrial cells harvested from kidney tissue (Figure 3h)....”

Figure legend section: “Indicated scale bars were 20 μm. (d,e,f) Semi-quantitative analysis of areas positive for Iba-1 in liver and kidney, and of PAS-positive cells in lung, respectively. (g) After the kidneys were harvested, tissue lysates were used for the MTT assay. (h) After...”

      (4) In Figure 6G, H, the authors confirmed the change in expression of the M2 markers by PDZ peptide using the mouse monocyte cell line Raw264.7. It would be good to add an experiment on changes in M1 and M2 markers caused by PDZ peptides in human monocyte cells (for example, THP-1).

We thank you for your comments. To determine whether the PDZ peptide regulates M1/M2 polarization in human monocytes, we examined changes in M1 and M2 gene expression in THP-1 cells. Wild-type PDZ significantly suppressed the expression of M1 marker genes (hIL-1β, hIL-6, hIL-8, hTNF-α), while increasing the expression of M2 marker genes (hIL-4, hIL-10, hMRC-1). However, mutant PDZ did not affect M1/M2 polarization. These results suggest that the PDZ peptide can suppress inflammation by regulating M1/M2 polarization of human monocytes. These results are for the reviewer's reference only and will not be included in the main content.

      Author response image 2.

      Author response image 3.

      Minor point:

      The use of language is appropriate, with good writing skills. Nevertheless, a thorough proofread would eliminate small mistakes such as:

      - line 254, " mut PDZ+LPS/LPS (45.75%) " → " mut PDZ+LPS/LPS (47.75%) "

      - line 296, " Figure 6f " → " Figure 6h "

      We changed these points into the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      The authors show that corticotropin-releasing factor (CRF) neurons in the central amygdala (CeA) and bed nucleus of the stria terminalis (BNST) monosynaptically target cholinergic interneurons (CINs) in the dorsal striatum of rodents. Functionally, activation of CRFR1 receptors increases CIN firing rate, and this modulation was reduced by pre-exposure to ethanol. This is an interesting finding, with potential significance for alcohol use disorders, but some conclusions could use additional support.

      Strengths:

      Well-conceived circuit mapping experiments identify a novel pathway by which the CeA and BNST can modulate dorsal striatal function by controlling cholinergic tone. Important insight into how CRF, a neuropeptide that is important in mediating aspects of stress, affective/motivational processes, and drug-seeking, modulates dorsal striatal function.

      Weaknesses:

      (1) Tracing and expression experiments were performed both in mice and rats (in a mostly non-overlapping way). While these species are similar in many ways, some conclusions are based on assumptions of similarities that the presented data do not directly show. In most cases, this should be addressed in the text (but see point number 2).

      (2) Experiments in rats show that CRFR1 expression is largely confined to a subpopulation of striatal CINs. Is this true in mice, too? Since most electrophysiological experiments are done in various synaptic antagonists and/or TTX, it does not affect the interpretation of those data, but non-CIN expression of CRFR1 could potentially have a large impact on bath CRF-induced acetylcholine release.

      (3) Experiments in rats show that about 30% of CINs express CRFR1 in rats. Did only a similar percentage of CINs in mice respond to bath application of CRF? The effect sizes and error bars in Figure 5 imply that the majority of recorded CINs likely responded. Were exclusion criteria used in these experiments?

      (4) The conclusion that prior acute alcohol exposure reduces the ability of subsequent alcohol exposure to suppress CIN activity in the presence of CRF may be a bit overstated. In Figure 6D (no ethanol pre-exposure), ethanol does not fully suppress CIN firing rate to baseline after CRF exposure. The attenuated effect of CRF on CIN firing rate after ethanol pre-treatment (6E) may just reduce the maximum potential effect that ethanol can have on firing rate after CRF, due to a lowered starting point. It is possible that the lack of significant effect of ethanol after CRF in pre-treated mice is an issue of experimental sensitivity. Related to this point, does pre-treatment with ethanol reduce the later CIN response to acute ethanol application (in the absence of CRF)?

      (5) More details about the area of the dorsal striatum being examined would be helpful (i.e., a-p axis).

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      The aim of this paper is to develop a simple method to quantify fluctuations in the partitioning of cellular elements. In particular, they propose a flow-cytometry-based method coupled with a simple mathematical theory as an alternative to conventional imaging-based approaches.

      Strengths:

      The approach they develop is simple to understand and its use with flow-cytometry measurements is clearly explained. Understanding how the fluctuations in the cytoplasm partition vary for different kinds of cells is particularly interesting.

      Weaknesses:

      The theory only considers fluctuations due to cellular division events. This seems a large weakness because it is well known that fluctuations in cellular components are largely affected by various intrinsic and extrinsic sources of noise and only under particular conditions does partitioning noise become the dominant source of noise.

      We thank the Reviewer for her/his evaluation of our manuscript. The point raised is indeed a crucial one. In a cell division cycle, there are at least three distinct sources of noise that affect component numbers [1]:

      (1) Gene expression and degradation, which determine component numbers fluctuations during cell growth.

      (2) Variability in cell division time, which, depending on the underlying model, may or may not be a function of protein level and gene expression.

      (3) Noise in the partitioning/inheritance of components between mother and daughter cells.

      Our approach specifically addresses the latter, with the goal of providing a quantitative measure of this noise source. For this reason, in the present work we consider homogeneous cancer cell populations that can be regarded as stationary from a population point of view. By tracking the time evolution of the distribution of tagged components via live fluorescent markers, we aim to isolate the effects of partitioning noise. However, as noted by the Reviewer, other sources of noise are present, and depending on the system under consideration the relative contributions of the different sources may change. Thus, we agree that quantifying the effect of the various noise sources on the accuracy of our measurements will improve the reliability of our method.

      In this respect, assuming independence between noise sources, we reasoned that variability in cell cycle length would affect the timing of population emergence but not the intrinsic properties of those populations (e.g., Gaussian variance). To test this hypothesis, we conducted a preliminary set of simulations in which cell division times were drawn from an Erlang distribution (mean = 18 h, k = 4). The results, showing the behavior of the mean and variance of the component distributions across generations, are presented in Author response image 1. Under the assumption of independence between different noise sources, no significant effects were observed. Next, we plan to quantify the accuracy of our measurements in the presence of cross-talk between the various noise sources. As suggested, we will update the manuscript to include a more complete discussion on this topic and an evaluation of our model’s stability.

      Author response image 1.

      Variance and mean of the distribution of fluorescence intensity as a function of generation, for a time-course dynamic with cell-cycle length variability. We repeated the same simulations as in Figure 1 of the manuscript, but introduced a variable division time for each cell. The division time of each cell is drawn from an Erlang distribution (mean = 18 h and k = 4). As can be observed in the plots, the results of our theoretical framework are not affected by the introduction of this variability. Hence, the Gaussian Mixture Model is still able to give the correct results even in a noisy environment.
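For concreteness, a simulation of this kind can be sketched in a few lines of Python. This is a minimal toy model, not the exact code behind Author response image 1: the Gaussian form of the partition fractions, the 5% initial CV, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(T=72.0, n_cells=200, mean_div=18.0, k=4, cv_part=0.1):
    """Grow a population until time T; division times are Erlang(k) with
    mean `mean_div`, and at each division one daughter inherits a Gaussian
    fraction (mean 0.5, CV cv_part) of the mother's fluorescence."""
    scale = mean_div / k                       # Erlang = Gamma(shape=k, scale)
    cells = [(rng.normal(1.0, 0.05),           # narrow initial distribution
              rng.gamma(k, scale), 0) for _ in range(n_cells)]
    population = []
    while cells:
        x, t_div, gen = cells.pop()
        if t_div > T:                          # cell has not divided by time T
            population.append((x, gen))
            continue
        f = np.clip(rng.normal(0.5, 0.5 * cv_part), 0.0, 1.0)
        for frac in (f, 1.0 - f):              # fractions sum to 1 (conservation)
            cells.append((x * frac, t_div + rng.gamma(k, scale), gen + 1))
    return population

pop = simulate()
flu = np.array([x for x, _ in pop])
gen = np.array([g for _, g in pop])
for g in np.unique(gen):
    print(g, flu[gen == g].mean())             # mean roughly halves per generation
```

Because the partition fractions are drawn independently of the division times, the per-generation means and variances are unaffected by the Erlang timing noise; only the mixture weights (how many cells sit in each generation at the observation time) change.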

      (1) Soltani, Mohammad, et al. "Intercellular variability in protein levels from stochastic expression and noisy cell cycle processes." PLoS computational biology 12.8 (2016): e1004972.

      Reviewer #2 (Public review):

      Summary:

      The authors present a combined experimental and theoretical workflow to study partitioning noise arising during cell division. Such quantifications usually require time-lapse experiments, which are limited in throughput. To bypass these limitations, the authors propose to use flow-cytometry measurements instead and analyse them using a theoretical model of partitioning noise. The problem considered by the authors is relevant and the idea to use statistical models in combination with flow cytometry to boost statistical power is elegant. The authors demonstrate their approach using experimental flow cytometry measurements and validate their results using time-lapse microscopy. However, while I appreciate the overall goal and motivation of this work, I was not entirely convinced by the strength of this contribution. The approach focuses on a quite specific case, where the dynamics of the labelled component depend purely on partitioning. As such it seems incompatible with studying the partitioning noise of endogenous components that exhibit production/turnover. The description of the methods was partly hard to follow and should be improved. In addition, I have several technical comments, which I hope will be helpful to the authors.

      We are grateful to the Reviewer for her/his comments. Indeed, both partitioning noise and production/turnover noise are fundamental processes. At present, the only way to consider them together is through time-consuming and costly transfection/microscopy/tracking experiments. In this work, we aimed to develop a method that effectively pinpoints the first component, i.e., partitioning noise; we therefore opted to separate the two noise sources.

      Below, we provide a point-by-point response that we hope will clarify all raised concerns.

      Comments:

      (1) In the theoretical model, copy numbers are considered to be conserved across generations. As a consequence, concentrations will decrease over generations due to dilution. While this consideration seems plausible for the considered experimental system, it seems incompatible with components that exhibit production and turnover dynamics. I am therefore wondering about the applicability/scope of the presented approach and to what extent it can be used to study partitioning noise for endogenous components. As presented, the approach seems to be limited to a fairly small class of experiments/situations.

      We see the Reviewer's point. Indeed, we are proposing a high-throughput and robust procedure to measure the partitioning/inheritance noise of cell components through flow cytometry time courses. By using live-cell staining of cellular compounds, we can track the effect of partitioning noise on fluorescence intensity distribution across successive generations. This specific procedure is purposely optimized to isolate partitioning noise from other sources and, as it is, cannot track endogenous components or dyes that require fixation. While this certainly poses limits to the proposed approach, there are numerous contexts in which our methodology could be used to explore the role of asymmetric inheritance. Among others, (i) investigating how specific organelles are differentially partitioned and how this influences cellular behavior could provide deeper insights into fundamental biological processes: asymmetric segregation of organelles is a key factor in cell differentiation, aging, and stress response. During cell division, organelles such as mitochondria, the endoplasmic reticulum, lysosomes, peroxisomes, and centrosomes can be unequally distributed between daughter cells, leading to functional differences that influence their fate. For instance, Katajisto et al. [1] proposed that asymmetric division of mitochondria in stem cells is associated with the retention of stemness traits in one daughter cell and differentiation in the other. As organisms age, stem cells accumulate damage, and to prevent exhaustion and compromised tissue function, cells may use asymmetric inheritance to segregate older or damaged subcellular components into one daughter cell. (ii) Asymmetric division has also been linked to therapeutic resistance in Cancer Stem Cells [2]. Although the functional consequences are not yet fully determined, the asymmetric inheritance of mitochondria is recognized as playing a pivotal role [3].
      Another potential application of our methodology may be (iii) the inheritance of lysosomes, which, together with mitochondria, appears to play a crucial role in determining the fate of human blood stem cells [4]. Furthermore, similar to studies conducted on liquid tumors [5][6], our approach could be extended to investigate cell growth dynamics and the origins of cell size homeostasis in adherent cells [7][8][9]. The aforementioned case studies can be readily addressed with our approach, which is applicable whenever live-cell dyes can be used. We will add a discussion of the strengths and limitations of the method in the Discussion section of the revised version of the manuscript.

      (1) Katajisto, Pekka, et al. "Asymmetric apportioning of aged mitochondria between daughter cells is required for stemness." Science 348.6232 (2015): 340-343.

      (2) Hitomi, Masahiro, et al. "Asymmetric cell division promotes therapeutic resistance in glioblastoma stem cells." JCI insight 6.3 (2021): e130510.

      (3) García-Heredia, José Manuel, and Amancio Carnero. "Role of mitochondria in cancer stem cell resistance." Cells 9.7 (2020): 1693.

      (4) Loeffler, Dirk, et al. "Asymmetric organelle inheritance predicts human blood stem cell fate." Blood, The Journal of the American Society of Hematology 139.13 (2022): 2011-2023.

      (5) Miotto, Mattia, et al. "Determining cancer cells division strategy." arXiv preprint arXiv:2306.10905 (2023).

      (6) Miotto, Mattia, et al. "A size-dependent division strategy accounts for leukemia cell size heterogeneity." Communications Physics 7.1 (2024): 248.

      (7) Kussell, Edo, and Stanislas Leibler. "Phenotypic diversity, population growth, and information in fluctuating environments." Science 309.5743 (2005): 2075-2078.

      (8) McGranahan, Nicholas, and Charles Swanton. "Clonal heterogeneity and tumor evolution: past, present, and the future." Cell 168.4 (2017): 613-628.

      (9) De Martino, Andrea, Thomas Gueudré, and Mattia Miotto. "Exploration-exploitation tradeoffs dictate the optimal distributions of phenotypes for populations subject to fitness fluctuations." Physical Review E 99.1 (2019): 012417.

      (2) Similar to the previous comment, I am wondering what would happen in situations where the generations could not be as clearly identified as in the presented experimental system (e.g., due to variability in cell-cycle length/stage). In this case, it seems to be challenging to identify generations using a Gaussian Mixture Model. Can the authors comment on how to deal with such situations? In the abstract, the authors motivate their work by arguing that detecting cell divisions from microscopy is difficult, but doesn't their flow cytometry-based approach have a similar problem?

      The point raised is an important one, as it highlights the fundamental role of the gating strategy. The ability to identify the distribution of different generations using the Gaussian Mixture Model (GMM) strongly depends on the degree of overlap between distributions. The more the distributions overlap, the less capable we are of accurately separating them.

      The extent of overlap is influenced by the coefficients of variation (CV) of both the partitioning distribution function and the initial component distribution. Specifically, the component distribution at time t results from the convolution of the component distribution itself at time t−1 and the partitioning distribution function. Therefore, starting with a narrow initial component distribution allows for better separation of the generation peaks. The balance between partitioning asymmetry and the width of the initial component distribution is thus crucial.
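A minimal numerical illustration of this convolution recursion is sketched below. The multiplicative Gaussian fraction (mean 0.5) and all parameter values are assumptions made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def generation_samples(g, n=100_000, cv0=0.05, cv_part=0.1):
    """Monte Carlo version of the recursion: the generation-g intensity is
    X_g = X_{g-1} * F, with F the inherited fraction (mean 0.5, CV cv_part)
    and X_0 the initial stain intensity (mean 1, CV cv0)."""
    x = rng.normal(1.0, cv0, size=n)
    for _ in range(g):
        x *= rng.normal(0.5, 0.5 * cv_part, size=n)
    return x

# In the small-CV limit, CV_g^2 ~ CV_0^2 + g * CV_part^2, so the peaks of
# successive generations broaden and eventually overlap.
for g in range(4):
    x = generation_samples(g)
    print(g, round(x.std() / x.mean(), 3))
```

The printed CVs grow with generation, which is the quantitative reason why a narrow initial distribution (small `cv0`) keeps the generation peaks separable for longer.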

      As shown in Author response image 2, increasing the CV of either distribution reduces the ability to distinguish between different generations.

      Author response image 2.

      Component distributions at varying CVs of the initial component and partitioning distributions. Starting from a condition in which both the division asymmetry and the width of the initial component distribution are low and different generations are clearly separable, increasing either CV leads to distribution mixing and greater reconstruction difficulty.

      However, the variance of the initial distribution cannot be reduced arbitrarily. While selecting a narrow distribution facilitates a better reconstruction of the distributions, it simultaneously limits the number of cells available for the experiment. Therefore, for components exhibiting a high level of asymmetry, further narrowing of the initial distribution becomes experimentally impractical.

      In such cases, an approach previously tested on liquid tumors [1] involves applying the Gaussian Mixture Model (GMM) in two dimensions by co-staining another cellular component with lower division asymmetry.

      Regarding time-lapse fluorescence microscopy, the main challenge lies not in disentangling the interplay of different noise sources, but rather in obtaining sufficient statistical power from experimental data. While microscopy provides detailed insights into the division process and component partitioning, its low throughput limits large-scale statistical analyses. Current segmentation algorithms still perform poorly in crowded environments and with complex cell shapes, requiring a substantial portion of the image analysis pipeline to be performed manually, a process that is time-consuming and difficult to scale. In contrast, our cytometry-based approach bypasses this analysis bottleneck, as it enables a direct population-wide measurement of the system's evolution. We will provide a detailed discussion on these aspects in the revised version of the manuscript.

      (1) Peruzzi, Giovanna, et al. "Asymmetric binomial statistics explains organelle partitioning variance in cancer cell proliferation." Communications Physics 4.1 (2021): 188.

      (3) I could not find any formal definition of division asymmetry. Since this is the most important quantity of this paper, it should be defined clearly.

      We thank the Reviewer for the note. By division asymmetry we refer to a quantity that reflects how similar two daughter cells are likely to be in terms of inherited components after a division. We opted to measure it via the coefficient of variation (the standard deviation, i.e., the square root of the variance, divided by the mean) of the partitioning-fraction distribution. We will amend this missing definition in the revised version of the manuscript.
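Operationally, this definition amounts to the following trivial sketch (the function name is ours, not from the manuscript):

```python
import numpy as np

def division_asymmetry(fractions):
    """CV (standard deviation / mean) of the partitioning-fraction
    distribution; `fractions` holds, for each division, the fraction of
    the tagged component inherited by one daughter cell."""
    f = np.asarray(fractions, dtype=float)
    return f.std(ddof=1) / f.mean()

print(division_asymmetry([0.5, 0.5, 0.5, 0.5]))             # perfectly symmetric -> 0.0
print(round(division_asymmetry([0.4, 0.6, 0.4, 0.6]), 3))   # mild asymmetry -> 0.231
```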

      (4) The description of the model is unclear/imprecise in several parts. For instance, it seems to me that the index "i" does not really refer to a cell in the population, but rather a subpopulation of cells that has undergone a certain number of divisions. Furthermore, why is the argument of Equation 11 suddenly the fraction f as opposed to the component number? I strongly recommend carefully rewriting and streamlining the model description and clearly defining all quantities and how they relate to each other.

      We are carefully amending the text to avoid double naming of variables and to clarify each step of the computation. In Equation 11, the variable f refers to the fluorescence intensity; the notation will be changed to increase clarity.

      (5) Similarly, I was not able to follow the logic of Section D. I recommend carefully rewriting this section to make the rationale, logic, and conclusions clear to the reader.

      We will update the manuscript to clarify the scope of Section D and its results. In brief, Section A presents a general model to derive the variance of the partitioning distribution from flow cytometry time-course data without making any assumptions about the shape of the distribution itself. In Section D, our goal is to interpret the origin of asymmetry and propose a possible form for the partitioning distribution. Since the dyes used bind non-specifically to cytoplasmic amines, the tagged proteins are expected to be uniformly distributed throughout the cytoplasm and present in large numbers. Given these assumptions, the least complex model for division follows the binomial distribution, with a parameter that measures the bias in the process. Therefore, we performed a computation similar to that in Section A, which allows us to estimate not only the variance but also the degree of biased asymmetry. Finally, we fitted the data to this new model and proposed an experimental interpretation of the results.
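The biased-binomial picture can be sketched as follows (illustrative numbers only; `p` plays the role of the bias parameter mentioned above):

```python
import numpy as np

rng = np.random.default_rng(2)

def binomial_fractions(n_molecules=10_000, p=0.5, n_divisions=5_000):
    """Each of n_molecules independent copies goes to daughter 1 with
    probability p; returns the fraction inherited by daughter 1 for each
    of n_divisions simulated divisions."""
    return rng.binomial(n_molecules, p, size=n_divisions) / n_molecules

# Unbiased binomial partitioning of many molecules gives Var(f) = p(1-p)/N,
# which is tiny for large N; a sizeable measured asymmetry therefore points
# to a bias (p != 1/2) and/or correlated segregation of the molecules.
f = binomial_fractions(p=0.6)
print(round(f.mean(), 3), round(f.std(), 4))
```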

      (6) Much theoretical work has been done recently to couple cell-cycle variability to intracellular dynamics. While the authors neglect the latter for simplicity, it would be important to further discuss these approaches and why their simplified model is suitable for their particular experiments.

      We agree with the Reviewer, we will discuss this aspect in the revised version of the manuscript.

      (7) In the discussion the authors note that the microscopy-based estimates may lead to an overestimation of the fluctuations due to limited statistics. I could not follow that reasoning. Due to the gating in the flow cytometry measurements, I could imagine that the resulting populations are more stringently selected as compared to microscopy. Could that also be an explanation? More generally, it would be interesting to see how robust the results are in terms of different gating diameters.

      The Reviewer is right on the importance of the sorting procedure. As already discussed in a previous point, the gating strategy we employed plays a fundamental role: it reduces the overlap of fluorescence distributions as generations progress, enables the selection of an initial distribution distinct from the fluorescence background (allowing for longer tracking of proliferation), and synchronizes the initial population. The narrower the initial distribution, the more separated the peaks of different generations will be. However, this also results in a smaller number of cells available for the experiment, requiring a careful balance between precision and experimental feasibility. A similar procedure, although it would certainly limit the estimation error, would be impracticable in the case of microscopy. Indeed, the primary limitation and source of error is the number of recorded events. Our pipeline allowed us to track on the order of hundreds of division dynamics, but the analysis time scales non-linearly with the number of events. Significantly increasing the dataset would have been extremely time-consuming. Restricting the analysis to cells with similar fluorescence, although theoretically sound, would have reduced the statistics to a level where the sampling error drastically dominates the measurement. Moreover, different experiments would have been hardly comparable, since different fluorescence values could map onto equally sized cells. In light of these factors, we expect a higher CV for the microscopy measurements than for the flow cytometry ones. In the plots below, we show the behaviour of the mean and the standard deviation of N numbers sampled from a Gaussian distribution N(0,1) as a function of the number of samples N. The higher N is, the closer the sampled distribution will be to the true one. The region in the hundreds of samples is still very noisy; to do much better, we would have to reach the order of thousands.
      We will add a discussion of these aspects in the revised version of the manuscript.

      Author response image 3.

      Standard deviation and mean value of a distribution of points sampled from a Gaussian distribution with mean 0 and standard deviation 1, versus the number of samples, N. Increasing N leads to a closer approximation of the expected values. In orange is highlighted the Microscopy Working Region (Microscopy WR), which corresponds to the number of samples we are able to reach with microscopy experiments. In yellow is the region we would have to reach to lower the estimation error, which, however, is very expensive in terms of analysis time.
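The scaling shown in Author response image 3 can be reproduced in a few lines (a sketch; for a standard normal, the sampling error of the sample standard deviation is approximately 1/sqrt(2N)):

```python
import numpy as np

rng = np.random.default_rng(3)

def std_estimate_spread(n_samples, n_repeats=1_000):
    """Spread, across repeats, of the sample std of n_samples N(0,1) draws,
    i.e. the sampling error of the standard-deviation estimate."""
    x = rng.standard_normal((n_repeats, n_samples))
    return x.std(axis=1, ddof=1).std()

for n in (100, 1_000, 10_000):
    print(n, round(std_estimate_spread(n), 4))   # shrinks roughly as 1/sqrt(2n)
```

With a few hundred tracked divisions (the microscopy regime), the estimate of the partitioning spread is still visibly noisy; reaching the flow cytometry regime of thousands to millions of events shrinks this sampling error by an order of magnitude or more.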

      (8) It would be helpful to show flow cytometry plots including the identified subpopulations for all cell lines, currently, they are shown only for HCT116 cells. More generally, very little raw data is shown.

      We will provide the requested plots for the other cell lines together with additional raw data coming from simulations in the Supplementary Material. 

      (9) The title of the manuscript could be tailored more to the considered problem. At the moment it is very generic.

      We see the Reviewer's point. The proposed title aims at conveying the wide applicability of the presented approach, which ultimately allows for the assessment of fluctuations in the levels of cellular components at division. This, in turn, reflects the asymmetry of the division.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This work provides a new dataset of 71,688 images of different ape species across a variety of environmental and behavioral conditions, along with pose annotations per image. The authors demonstrate the value of their dataset by training pose estimation networks (HRNet-W48) on both their own dataset and other primate datasets (OpenMonkeyPose for monkeys, COCO for humans), ultimately showing that the model trained on their dataset had the best performance (performance measured by PCK and AUC). In addition to their ablation studies where they train pose estimation models with either specific species removed or a certain percentage of the images removed, they provide solid evidence that their large, specialized dataset is uniquely positioned to aid in the task of pose estimation for ape species.

      The diversity and size of the dataset make it particularly useful, as it covers a wide range of ape species and poses, making it particularly suitable for training off-the-shelf pose estimation networks or for contributing to the training of a large foundational pose estimation model. In conjunction with new tools focused on extracting behavioral dynamics from pose, this dataset can be especially useful in understanding the basis of ape behaviors using pose.

      We thank the reviewer for the kind comments.

      Since the dataset provided is the first large, public dataset of its kind exclusively for ape species, more details should be provided on how the data were annotated, as well as summaries of the dataset statistics. In addition, the authors should provide the full list of hyperparameters for each model that was used for evaluation (e.g., mmpose config files, textual descriptions of augmentation/optimization parameters).

      We have added more details on the annotation process and have included the list of instructions sent to the annotators. We have also included mmpose configs with the code provided. The following files include the relevant details:

      File including the list of instructions sent to the annotators: OpenMonkeyWild Photograph Rubric.pdf

      Mmpose configs:

      i) TopDownOAPDataset.py

      ii) animal_oap_dataset.py

      iii) __init__.py

      iv) hrnet_w48_oap_256x192_full.py

      Anaconda environment files:

      i) OpenApePose.yml

      ii) requirements.txt

      Overall this work is a terrific contribution to the field and is likely to have a significant impact on both computer vision and animal behavior.

      Strengths:

      • Open source dataset with excellent annotations on the format, as well as example code provided for working with it.

      • Properties of the dataset are mostly well described.

      • Comparison to pose estimation models trained on humans vs monkeys, finding that models trained on human data generalized better to apes than the ones trained on monkeys, in accordance with phylogenetic similarity. This provides evidence for an important consideration in the field: how well can we expect pose estimation models to generalize to new species when using data from closely or distantly related ones?

      • Sample efficiency experiments reflect an important property of pose estimation systems, which indicates how much data would be necessary to generate similar datasets in other species, as well as how much data may be required for fine-tuning these types of models (also characterized via ablation experiments where some species are left out).

      • The sample efficiency experiments also reveal important insights about scaling properties of different model architectures, finding that HRNet saturates in performance improvements as a function of dataset size sooner than other architectures like CPMs (even though HRNets still perform better overall).

      We thank the reviewer for the kind comments.

      Weaknesses:

      • More details on training hyperparameters used (preferably full config if trained via mmpose).

      We have now included mmpose configs and anaconda environment files that allow researchers to use the dataset with specific versions of mmpose and other packages we trained our models with. The list of files is provided above.

      • Should include dataset datasheet, as described in Gebru et al 2021 (arXiv:1803.09010).

      We have included a datasheet for our dataset in the appendix lines 621-764.

      • Should include crowdsourced annotation datasheet, as described in Diaz et al 2022 (arXiv:2206.08931). Alternatively, the specific instructions that were provided to Hive/annotators would be highly relevant to convey what annotation protocols were employed here.

      We have included the list of instructions sent to the Hive annotators in the supplementary materials. File: OpenMonkeyWild Photograph Rubric.pdf

      • Should include model cards, as described in Mitchell et al (arXiv:1810.03993).

      We have included a model card for the included model in the results section line 359. See Author response image 1.

      Author response image 1.

      • It would be useful to include more information on the source of the data as they are collected from many different sites and from many different individuals, some of which may introduce structural biases such as lighting conditions due to geography and time of year.

      We agree that the source could introduce structural biases. This is why we included images from many different sources and captured images at different times from the same source, in the hope that a wide variety of background and lighting conditions is represented. However, doing so limits our ability to document each source's background and lighting conditions separately.

      • Is there a reason not to use OKS? This incorporates several factors such as landmark visibility, scale, and landmark type-specific annotation variability as in Ronchi & Perona 2017 (arXiv:1707.05388). The latter (variability) could use the human pose values (for landmarks types that are shared), the least variable keypoint class in humans (eyes) as a conservative estimate of accuracy, or leverage a unique aspect of this work (crowdsourced annotations) which affords the ability to estimate these values empirically.

      The focus of this work is on overall keypoint localization accuracy, and hence we wanted a metric that is easy to interpret and implement; in this case, we used PCK (Percentage of Correct Keypoints). PCK is a simple and widely used metric that measures the percentage of correctly localized keypoints within a certain distance threshold of their corresponding ground-truth keypoints.
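For reference, a generic PCK computation can be sketched as follows (assuming a bounding-box-normalized threshold, one common convention; the exact normalization used in our evaluation may differ):

```python
import numpy as np

def pck(pred, gt, bbox_sizes, alpha=0.2):
    """Fraction of keypoints whose prediction lies within alpha * bbox_size
    of the ground truth. pred, gt: (n_instances, n_keypoints, 2) arrays of
    (x, y) coordinates; bbox_sizes: (n_instances,) normalization lengths."""
    dists = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=-1)
    thresh = alpha * np.asarray(bbox_sizes, dtype=float)[:, None]
    return float((dists <= thresh).mean())

gt = np.zeros((1, 3, 2))
pred = np.array([[[0.0, 0.0], [5.0, 0.0], [30.0, 0.0]]])
print(pck(pred, gt, bbox_sizes=[100.0]))  # 2 of 3 keypoints within 20 px
```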

      • A reporting of the scales present in the dataset would be useful (e.g., histogram of unnormalized bounding boxes) and would align well with existing pose dataset papers such as MS-COCO (arXiv:1405.0312) which reports the distribution of instance sizes and instance density per image.

      We have now included a histogram of unnormalized bounding boxes in the manuscript (Author response image 2).

      Author response image 2.

      Reviewer #2 (Public Review):

      The authors present the OpenApePose database constituting a collection of over 70000 ape images which will be important for many applications within primatology and the behavioural sciences. The authors have also rigorously tested the utility of this database in comparison to available Pose image databases for monkeys and humans to clearly demonstrate its solid potential.

      We thank the reviewer for the kind comments.

      However, the variation in the database with regards to individuals, background, source/setting is not clearly articulated and would be beneficial information for those wishing to make use of this resource in the future. At present, there is also a lack of clarity as to how this image database can be extrapolated to aid video data analyses which would be highly beneficial as well.

      I have two major concerns with regard to the manuscript as it currently stands which I think if addressed would aid the clarity and utility of this database for readers.

      1) Human annotators are mentioned as doing the 16 landmarks manually for all images but there is no assessment of inter-observer reliability or the such. I think something to this end is currently missing, along with how many annotators there were. This will be essential for others to know who may want to use this database in the future.

      We thank the reviewer for pointing this out. Inter-observer reliability is important for ensuring the quality of the annotations. We first used Amazon MTurk to crowdsource annotations and found that the inter-observer reliability and the annotation quality were poor. This was the reason for choosing a commercial service such as Hive AI. As the crowdsourcing and quality control are managed by Hive through their internal procedures, we do not have access to data that would allow us to assess inter-observer reliability. However, the annotation quality was assessed by first author ND through manual inspection of the annotations visualized on all of the images in the database. Additionally, our ablation experiments with high out-of-sample performance further validate the quality of the annotations.

      Relevant to this comment, in your description of the database, a table or such could be included, providing the number of images from each source/setting per species and/or number of individuals. Something to give a brief overview of the variation beyond species. (subspecies would also be of benefit for example).

      Our goal was to obtain as many images as possible from the most commonly studied ape species. In order to ensure a large enough database, we focused only on the species and combined images from as many sources as possible to reach our goal of ~10,000 images per species. With the wide range of people involved in obtaining the images, we could not ensure that all the photographers had the necessary expertise to differentiate individuals and subspecies of the subjects they were photographing. We could only ensure that the right species was being photographed. Hence, we cannot include more detailed information.

      2) You mention around line 195 that you used a specific function for splitting up the dataset into training, validation, and test but there is no information given as to whether this was simply random or if an attempt to balance across species, individuals, background/source was made. I would actually think that a balanced approach would be more appropriate/useful here so whether or not this was done, and the reasoning behind that must be justified.

      This is especially relevant given that in one test you report balancing across species (for the sample size subsampling procedure).

      We created the training set to reflect the species composition of the whole dataset, but used test sets balanced by species. This was done to give a sense of the performance of a model that could be trained with the entire dataset, which does not have the species fully balanced. We believe that researchers interested in training models using this dataset for behavior tracking applications would use the entire dataset to fully leverage the variation in the dataset. However, for those interested in training models with balanced species, we provide an annotation file with all the images included, which would allow researchers to create their own training and test sets that meet their specific needs. We have added this justification in the manuscript to guide other users with different needs. Lines 530-534: “We did not balance our training set for the species as we wanted to utilize the full variation in the dataset and assess models trained with the proportion of species as reflected in the dataset. We provide annotations including the entire dataset to allow others to create their own training/validation/test sets that suit their needs.”

      And another perhaps major concern that I think should also be addressed somewhere is the fact that this is an image database tested on images while the abstract and manuscript mention the importance of pose estimation for video datasets, yet the current manuscript does not provide any clear test of video datasets nor engage with the practicalities associated with using this image-based database for applications to video datasets. Somewhere this needs to be added to clarify its practical utility.

We thank the reviewer for this important suggestion. Since a video can be separated into its constituent frames, the provided model (or other models trained using this dataset) can indeed be used for inference on those frames, enabling video-tracking applications. We now include, in the supplementary materials, a short video clip of a chimpanzee with inferences from the provided model visualized.
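In practice, applying an image-based model to video reduces to iterating over frames (e.g. extracted with OpenCV's VideoCapture). A minimal sketch, where `predict` is a hypothetical stand-in for any pose model trained on the image dataset:

```python
import numpy as np

def track_video(frames, predict, smooth=3):
    """Run a per-image pose estimator over video frames, then smooth the
    keypoint trajectories with a short moving average to reduce jitter.
    `predict` is a stand-in for a model trained on the image dataset;
    it should map one frame to a (K, 2) array of keypoint coordinates."""
    kps = np.stack([predict(f) for f in frames])        # shape (T, K, 2)
    if smooth > 1:
        kernel = np.ones(smooth) / smooth
        kps = np.apply_along_axis(
            lambda x: np.convolve(x, kernel, mode="same"), 0, kps)
    return kps
```

Note that the `mode="same"` smoothing attenuates the first and last frames of each trajectory; interior frames are a plain moving average.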

      Reviewer #1 (Recommendations For The Authors):

      • Please provide a more thorough description of the annotation procedure (i.e., the instructions given to crowd workers)! See public review for reference on dataset annotation reporting cards.

      We have included the list of instructions for Hive annotators in the supplementary materials.

      • An estimate of the crowd worker accuracy and variability would be super valuable!

      While we agree that this is useful, we do not have access to Hive internal data on crowd worker IDs that could allow us to estimate these metrics. Furthermore, we assessed each image manually to ensure good annotation quality.

      • In the methods section it is reported that images were discarded because they were either too blurry, small, or highly occluded. Further quantification could be provided. How many images were discarded per species?

It’s not really clear to us why this is interesting or important. We used a large number of photographers and annotators, some of whom provided a high proportion of usable images and some a poor one. But it is not clear what those proportions would tell us.

      • Placing the numerical values at the end of the bars would make the graphs more readable in Figures 4 and 5.

We thank the reviewer for this suggestion. While we agree that this could help, we do not have space to include the numbers in a font size that would be readable for all readers. We have included the numerical values in the results section of the main text for those interested, and hope that the figures give readers a qualitative sense of the results.

    1. Author response:

      eLife Assessment

      This valuable short paper is an ingenious use of clinical patient data to address an issue in imaging neuroscience. The authors clarify the role of face-selectivity in human fusiform gyrus by measuring both BOLD fMRI and depth electrode recordings in the same individuals; furthermore, by comparing responses in different brain regions in the two patients, they suggested that the suppression of blood oxygenation is associated with a decrease in local neural activity. While the methods are compelling and provide a rare dataset of potentially general importance, the presentation of the data in its current form is incomplete.

We thank the Reviewing editor and Senior editor at eLife for their positive assessment of our paper. After reading the reviewers’ comments (to which we reply below), we agree that the presentation of the data could be more complete. We provide additional presentations of the data in the responses below and will slightly modify Figure 2 of the paper. However, to keep the short format of the paper, the revised version will have the same number of figures, which support the claims made in the paper.

      Reviewer #1 (Public review):

      Summary:

      Measurement of BOLD MR imaging has regularly found regions of the brain that show reliable suppression of BOLD responses during specific experimental testing conditions. These observations are to some degree unexplained, in comparison with more usual association between activation of the BOLD response and excitatory activation of the neurons (most tightly linked to synaptic activity) in the same brain location. This paper finds two patients whose brains were tested with both non-invasive functional MRI and with invasive insertion of electrodes, which allowed the direct recording of neuronal activity. The electrode insertions were made within the fusiform gyrus, which is known to process information about faces, in a clinical search for the sites of intractable epilepsy in each patient. The simple observation is that the electrode location in one patient showed activation of the BOLD response and activation of neuronal firing in response to face stimuli. This is the classical association. The other patient showed an informative and different pattern of responses. In this person, the electrode location showed a suppression of the BOLD response to face stimuli and, most interestingly, an associated suppression of neuronal activity at the electrode site.

      Strengths:

      Whilst these results are not by themselves definitive, they add an important piece of evidence to a long-standing discussion about the origins of the BOLD response. The observation of decreased neuronal activation associated with negative BOLD is interesting because, at various times, exactly the opposite association has been predicted. It has been previously argued that if synaptic mechanisms of neuronal inhibition are responsible for the suppression of neuronal firing, then it would be reasonable

      Weaknesses:

      The chief weakness of the paper is that the results may be unique in a slightly awkward way. The observation of positive BOLD and neuronal activation is made at one brain site in one patient, while the complementary observation of negative BOLD and neuronal suppression actually derives from the other patient. Showing both effects in both patients would make a much stronger paper.

      We thank reviewer #1 for their positive evaluation of our paper. Obviously, we agree with the reviewer that the paper would be much stronger if BOTH effects – spike increase and decrease – would be found in BOTH patients in their corresponding fMRI regions (lateral and medial fusiform gyrus) (also in the same hemisphere). Nevertheless, we clearly acknowledge this limitation in the (revised) version of the manuscript (p.8: Material and Methods section).

      In the current paper, one could think that P1 shows only increases to faces, and P2 would show only decreases (irrespective of the region). However, that is not the case since 11% of P1’s face-selective units are decreases (89% are increases) and 4% of P2’s face-selective units are increases. This has now been made clearer in the manuscript (p.5).

As the reviewer is certainly aware, the number and position of the electrodes are based on strict clinical criteria, and we will probably never encounter a situation with two neighboring (macro-micro hybrid) electrodes in the same patient, one with microelectrodes ending up in the lateral MidFG and the other in the medial MidFG. If there is no clinical value for the patient, this cannot be done.

      The only thing we can do is to strengthen these results in the future by collecting data on additional patients with an electrode either in the lateral or the medial FG, together with fMRI. But these are the only two patients we have been able to record so far with electrodes falling unambiguously in such contrasted regions and with large (and comparable) measures.

      While we acknowledge that the results may be unique because of the use of 2 contrasted patients only (and this is why the paper is a short report), the data is compelling in these 2 cases, and we are confident that it will be replicated in larger cohorts in the future.

      Reviewer #2 (Public review):

      Summary:

      This is a short and straightforward paper describing BOLD fMRI and depth electrode measurements from two regions of the fusiform gyrus that show either higher or lower BOLD responses to faces vs. objects (which I will call face-positive and facenegative regions). In these regions, which were studied separately in two patients undergoing epilepsy surgery, spiking activity increased for faces relative to objects in the face-positive region and decreased for faces relative to objects in the face-negative region. Interestingly, about 30% of neurons in the face-negative region did not respond to objects and decreased their responses below baseline in response to faces (absolute suppression).

      Strengths:

      These patient data are valuable, with many recording sessions and neurons from human face-selective regions, and the methods used for comparing face and object responses in both fMRI and electrode recordings were robust and well-established. The finding of absolute suppression could clarify the nature of face selectivity in human fusiform gyrus since previous fMRI studies of the face-negative region could not distinguish whether face < object responses came from absolute suppression, or just relatively lower but still positive responses to faces vs. objects.

      Weaknesses:

      The authors claim that the results tell us about both 1) face-selectivity in the fusiform gyrus, and 2) the physiological basis of the BOLD signal. However, I would like to see more of the data that supports the first claim, and I am not sure the second claim is supported.

      (1) The authors report that ~30% of neurons showed absolute suppression, but those data are not shown separately from the neurons that only show relative reductions. It is difficult to evaluate the absolute suppression claim from the short assertion in the text alone (lines 105-106), although this is a critical claim in the paper.

      We thank reviewer #2 for their positive evaluation of our paper. We understand the reviewer’s point, and we partly agree. Where we respectfully disagree is that the finding of absolute suppression is critical for the claim of the paper: finding an identical contrast between the two regions in terms of RELATIVE increase/decrease of face-selective activity in fMRI and spiking activity is already novel and informative. Where we agree with the reviewer is that the absolute suppression could be more documented: it wasn’t, due to space constraints (brief report). We provide below an example of a neuron showing absolute suppression to faces. In the frequency domain, there is only a face-selective response (1.2 Hz and harmonics) but no significant response at 6 Hz (common general visual response). In the time-domain, relative to face onset, the response drops below baseline level. It means that this neuron has baseline (non-periodic) spontaneous spiking activity that is actively suppressed when a face appears.

      Author response image 1.

      (2) I am not sure how much light the results shed on the physiological basis of the BOLD signal. The authors write that the results reveal "that BOLD decreases can be due to relative, but also absolute, spike suppression in the human brain" (line 120). But I think to make this claim, you would need a region that exclusively had neurons showing absolute suppression, not a region with a mix of neurons, some showing absolute suppression and some showing relative suppression, as here. The responses of both groups of neurons contribute to the measured BOLD signal, so it seems impossible to tell from these data how absolute suppression per se drives the BOLD response.

      It is a fact that we find both kinds of responses in the same region.  We cannot tell with this technique if neurons showing relative vs. absolute suppression of responses are spatially segregated for instance (e.g., forming two separate sub-regions) or are intermingled. And we cannot tell from our data how absolute suppression per se drives the BOLD response. In our view, this does not diminish the interest and originality of the study, but the statement "that BOLD decreases can be due to relative, but also absolute, spike suppression in the human brain” will be rephrased in the revised manuscript, in the following way: "that BOLD decreases can be due to relative, or absolute (or a combination of both), spike suppression in the human brain”.

      Reviewer #3 (Public review):

      In this paper the authors conduct two experiments an fMRI experiment and intracranial recordings of neurons in two patients P1 and P2. In both experiments, they employ a SSVEP paradigm in which they show images at a fast rate (e.g. 6Hz) and then they show face images at a slower rate (e.g. 1.2Hz), where the rest of the images are a variety of object images. In the first patient, they record from neurons over a region in the mid fusiform gyrus that is face-selective and in the second patient, they record neurons from a region more medially that is not face selective (it responds more strongly to objects than faces). Results find similar selectivity between the electrophysiology data and the fMRI data in that the location which shows higher fMRI to faces also finds face-selective neurons and the location which finds preference to non faces also shows non face preferring neurons.

      Strengths:

      The data is important in that it shows that there is a relationship between category selectivity measured from electrophysiology data and category-selective from fMRI. The data is unique as it contains a lot of single and multiunit recordings (245 units) from the human fusiform gyrus - which the authors point out - is a humanoid specific gyrus.

      Weaknesses:

      My major concerns are two-fold:

      (i) There is a paucity of data; Thus, more information (results and methods) is warranted; and in particular there is no comparison between the fMRI data and the SEEG data.

      We thank reviewer #3 for their positive evaluation of our paper. If the reviewer means paucity of data presentation, we agree and we provide more presentation below, although the methods and results information appear as complete to us. The comparison between fMRI and SEEG is there, but can only be indirect (i.e., collected at different times and not related on a trial-by-trial basis for instance). In addition, our manuscript aims at providing a short empirical contribution to further our understanding of the relationship between neural responses and BOLD signal, not to provide a model of neurovascular coupling.

      (ii) One main claim of the paper is that there is evidence for suppressed responses to faces in the non-face selective region. That is, the reduction in activation to faces in the non-face selective region is interpreted as a suppression in the neural response and consequently the reduction in fMRI signal is interpreted as suppression. However, the SSVEP paradigm has no baseline (it alternates between faces and objects) and therefore it cannot distinguish between lower firing rate to faces vs suppression of response to faces.

      We understand the concern of the reviewer, but we respectfully disagree that our paradigm cannot distinguish between lower firing rate to faces vs. suppression of response to faces. Indeed, since the stimuli are presented periodically (6 Hz), we can objectively distinguish stimulus-related activity from spontaneous neuronal firing. The baseline corresponds to spikes that are non-periodic, i.e., unrelated to the (common face and object) stimulation. For a subset of neurons, even this non-periodic baseline activity is suppressed, above and beyond the suppression of the 6 Hz response illustrated on Figure 2. We mention it in the manuscript, but we agree that we do not present illustrations of such decrease in the time-domain for SU, which we did not consider as being necessary initially (please see below for such presentation).

      (1) Additional data: the paper has 2 figures: figure 1 which shows the experimental design and figure 2 which presents data, the latter shows one example neuron raster plot from each patient and group average neural data from each patient. In this reader's opinion this is insufficient data to support the conclusions of the paper. The paper will be more impactful if the researchers would report the data more comprehensively.

      We answer to more specific requests for additional evidence below, but the reviewer should be aware that this is a short report, which reaches the word limit. In our view, the group average neural data should be sufficient to support the conclusions, and the example neurons are there for illustration. And while we cannot provide the raster plots for a large number of neurons, the anonymized data will be made available upon publication of the final version of the paper.

      (a) There is no direct comparison between the fMRI data and the SEEG data, except for a comparison of the location of the electrodes relative to the statistical parametric map generated from a contrast (Fig 2a,d). It will be helpful to build a model linking between the neural responses to the voxel response in the same location - i.e., estimate from the electrophysiology data the fMRI data (e.g., Logothetis & Wandell, 2004).

      As mentioned above the comparison between fMRI and SEEG is indirect (i.e., collected at different times and not related on a trial-by-trial basis for instance) and would not allow to make such a model.

      (b) More comprehensive analyses of the SSVEP neural data: It will be helpful to show the results of the frequency analyses of the SSVEP data for all neurons to show that there are significant visual responses and significant face responses. It will be also useful to compare and quantify the magnitude of the face responses compared to the visual responses.

      The data has been analyzed comprehensively, but we would not be able to show all neurons with such significant visual responses and face-selective responses.

      (c) The neuron shown in E shows cyclical responses tied to the onset of the stimuli, is this the visual response?

      Correct, it’s the visual response at 6 Hz.

      If so, why is there an increase in the firing rate of the neuron before the face stimulus is shown in time 0?

      Because the stimulation is continuous. What is displayed at 0 is the onset of the face stimulus, with each face stimulus being preceded by 4 images of nonface objects.

      The neuron's data seems different than the average response across neurons; This raises a concern about interpreting the average response across neurons in panel F which seems different than the single neuron responses

      The reviewer is correct, and we apologize for the confusion. This is because the average data on panel F has been notch-filtered for the 6 Hz (and harmonic responses), as indicated in the methods (p.11):  ‘a FFT notch filter (filter width = 0.05 Hz) was then applied on the 70 s single or multi-units time-series to remove the general visual response at 6 Hz and two additional harmonics (i.e., 12 and 18 Hz)’.

      Here is the same data without the notch-filter (the 6Hz periodic response is clearly visible):

      Author response image 2.

      For sake of clarity, we prefer presenting the notch-filtered data in the paper, but the revised version will make it clear in the figure caption that the average data has been notch-filtered.
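For readers unfamiliar with this kind of filtering, the FFT notch described in the Methods (zeroing narrow bands around 6, 12, and 18 Hz) can be sketched as follows. This is an illustrative reimplementation, not the analysis code used in the paper:

```python
import numpy as np

def fft_notch(signal, fs, notch_freqs, width=0.05):
    """Zero the FFT bins within `width` Hz of each target frequency,
    then invert: a sketch of the 6/12/18 Hz notch from the Methods."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    for f0 in notch_freqs:
        spec[np.abs(freqs - f0) <= width] = 0
    return np.fft.irfft(spec, n=len(signal))
```

Calling `fft_notch(x, fs, [6, 12, 18])` removes the general visual response while leaving the 1.2 Hz face-selective response and its other harmonics untouched, which is why the averaged traces in panel F look so different from the raw single-neuron rasters.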

      (d) Related to (c) it would be useful to show raster plots of all neurons and quantify if the neural responses within a region are homogeneous or heterogeneous. This would add data relating the single neuron response to the population responses measured from fMRI. See also Nir 2009.

      We agree with the reviewer that this is interesting, but again we do not think that it is necessary for the point made in the present paper. Responses in these regions appear rather heterogenous, and we are currently working on a longer paper with additional SEEG data (other patients tested for shorter sessions) to define and quantify the face-selective neurons in the MidFusiform gyrus with this approach (without relating it to the fMRI contrast as reported here).

      (e) When reporting group average data (e.g., Fig 2C,F) it is necessary to show standard deviation of the response across neurons.

      We agree with the reviewer and have modified Figure 2 accordingly in the revised manuscript.

      (f) Is it possible to estimate the latency of the neural responses to face and object images from the phase data? If so, this will add important information on the timing of neural responses in the human fusiform gyrus to face and object images.

      The fast periodic paradigm to measure neural face-selectivity has been used in tens of studies since its original reports:

      - in EEG: Rossion et al., 2015: https://doi.org/10.1167/15.1.18

      - in SEEG: Jonas et al., 2016: https://doi.org/10.1073/pnas.1522033113

      In this paradigm, the face-selective response spreads to several harmonics (1.2 Hz, 2.4 Hz, 3.6 Hz, etc.) (which are summed for quantifying the total face-selective amplitude). This is illustrated below by the averaged single units’ SNR spectra across all recording sessions for both participants.

      Author response image 3.

      There is no unique phase-value, each harmonic being associated with a phase-value, so that the timing cannot be unambiguously extracted from phase values. Instead, the onset latency is computed directly from the time-domain responses, which is more straightforward and reliable than using the phase. Note that the present paper is not about the specific time-courses of the different types of neurons, which would require a more comprehensive report, but which is not necessary to support the point made in the present paper about the SEEG-fMRI sign relationship.
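As a toy illustration of the quantification described above (per-bin SNR, and summing the face-selective amplitude over harmonics of 1.2 Hz), here is a short sketch with an illustrative bin layout; it is not the analysis code used in the paper:

```python
import numpy as np

def snr_spectrum(amps, n_neigh=10, skip=1):
    """Per-bin SNR: amplitude divided by the mean of surrounding bins,
    excluding the `skip` bins immediately next to the bin of interest."""
    snr = np.zeros_like(amps, dtype=float)
    for i in range(len(amps)):
        idx = [j for j in range(i - n_neigh, i + n_neigh + 1)
               if 0 <= j < len(amps) and abs(j - i) > skip]
        base = np.mean(amps[idx])
        snr[i] = amps[i] / base if base > 0 else np.inf
    return snr

def summed_harmonics(amps, base_hz, n_harm, df):
    """Total face-selective response: amplitudes summed over harmonics
    of the base frequency (1.2, 2.4, 3.6 Hz, ...); df is the bin width."""
    return sum(amps[int(round(k * base_hz / df))] for k in range(1, n_harm + 1))
```

With a 0.1 Hz bin width, `summed_harmonics(amps, 1.2, n_harm, 0.1)` adds the amplitudes at 1.2, 2.4, 3.6 Hz and so on, which is the quantity reported as the total face-selective amplitude.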

(g) Related to (e): In total the authors recorded data from 245 units (some single units and some multiunits) and found that, in both the face-selective and nonface-selective regions, most of the recorded neurons exhibited face-selectivity, which this reader found confusing. They write: “Among all visually responsive neurons, we found a very high proportion of face-selective neurons (p < 0.05) in both activated and deactivated MidFG regions (P1: 98.1%; N = 51/52; P2: 86.6%; N = 110/127)”. Is the face selectivity in P1 an increase in response to faces and in P2 a reduction in response to faces, or is it in both an increase in response to faces?

      Face-selectivity is defined as a DIFFERENTIAL response to faces compared to objects, not necessarily a larger response to faces. So yes, face-selectivity in P1 is an increase in response to faces and P2 a reduction in response to faces.

(2) Additional methods

      (a) it is unclear if the SSVEP analyses of neural responses were done on the spikes or the raw electrical signal. If the former, how is the SSVEP frequency analysis done on discrete data like action potentials?

      The FFT is applied directly on spike trains using Matlab’s discrete Fourier Transform function. This function is suitable to be applied to spike trains in the same way as to any sampled digital signal (here, the microwires signal was sampled at 30 kHz, see Methods).

In complementary analyses, we also applied the FFT to spike trains that had been temporally smoothed by convolving them with a 20 ms square window (Le Cam et al., 2023, cited in the paper). This did not change the outcome of the frequency analyses in the frequency range we are interested in.
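To illustrate that a discrete Fourier transform applied directly to a binary spike train recovers stimulus-locked periodic firing, here is a small self-contained sketch with a synthetic spike train and illustrative parameters (a 1 kHz stand-in for the 30 kHz recordings, not the recorded data):

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 1000                      # Hz; illustrative stand-in for the 30 kHz sampling
dur = 70                       # s, matching the 70 s sequences
t = np.arange(0, dur, 1 / fs)

# Synthetic binary spike train: baseline firing plus brief bursts
# locked to each cycle of a 6 Hz stimulation
p = 0.002 + 0.05 * (np.sin(2 * np.pi * 6 * t) > 0.95)
spikes = (rng.random(t.size) < p).astype(float)

# FFT applied directly to the discrete spike train, as described above
amps = np.abs(np.fft.rfft(spikes))
freqs = np.fft.rfftfreq(spikes.size, 1 / fs)
peak_6hz = amps[np.argmin(np.abs(freqs - 6))]
noise = amps[(freqs > 5) & (freqs < 7) & (np.abs(freqs - 6) > 0.1)]
baseline = np.median(noise)
```

The 6 Hz bin stands well above the surrounding noise floor, even though the input is a train of discrete 0/1 events rather than a continuous signal.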

      (b) it is unclear why the onset time was shifted by 33ms; one can measure the phase of the response relative to the cycle onset and use that to estimate the delay between the onset of a stimulus and the onset of the response. Adding phase information will be useful.

      The onset time was shifted by 33ms because the stimuli are presented with a sinewave contrast modulation (i.e., at 0ms, the stimulus has 0% contrast). 100% contrast is reached at half a stimulation cycle, which is 83.33ms here, but a response is likely triggered before reaching 100% contrast. To estimate the delay between the start of the sinewave (0% contrast) and the triggering of a neural response, we tested 7 SEEG participants with the same images presented in FPVS sequences either as a sinewave contrast (black line) modulation or as a squarewave (i.e. abrupt) contrast modulation (red line).  The 33ms value is based on these LFP data obtained in response to such sinewave stimulation and squarewave stimulation of the same paradigm. This delay corresponds to 4 screen refresh frames (120 Hz refresh rate = 8.33ms by frame) and 35% of the full contrast, as illustrated below (please see also Retter, T. L., & Rossion, B. (2016). Uncovering the neural magnitude and spatio-temporal dynamics of natural image categorization in a fast visual stream. Neuropsychologia, 91, 9–28).

      Author response image 4.
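The 33 ms / 35% figure can be checked directly from the sinusoidal contrast profile, assuming contrast rises as (1 − cos)/2 over the 166.7 ms stimulation cycle:

```python
import math

def sine_contrast(t, rate=6.0):
    """Sinusoidal contrast modulation: 0% contrast at cycle onset,
    100% contrast at half a cycle (83.33 ms at a 6 Hz rate)."""
    return (1 - math.cos(2 * math.pi * rate * t)) / 2

# Four frames at a 120 Hz refresh rate = 33.3 ms after onset
contrast_at_33ms = sine_contrast(4 / 120)   # about 0.35, i.e. ~35% contrast
```

Full contrast is reached at t = 1/12 s (83.33 ms), consistent with the half-cycle value given above.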

(3) Interpretation of suppression:

      The SSVEP paradigm alternates between 2 conditions: faces and objects and has no baseline; In other words, responses to faces are measured relative to the baseline response to objects so that any region that contains neurons that have a lower firing rate to faces than objects is bound to show a lower response in the SSVEP signal. Therefore, because the experiment does not have a true baseline (e.g. blank screen, with no visual stimulation) this experimental design cannot distinguish between lower firing rate to faces vs suppression of response to faces.

      The strongest evidence put forward for suppression is the response of non-visual neurons that was also reduced when patients looked at faces, but since these are non-visual neurons, it is unclear how to interpret the responses to faces.

      We understand this point, but how does the reviewer know that these are non-visual neurons? Because these neurons are located in the visual cortex, they are likely to be visual neurons that are not responsive to non-face objects. In any case, as the reviewer writes, we think it’s strong evidence for suppression.

      We thank all three reviewers for their positive evaluation of our paper and their constructive comments.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper concerns mechanisms of foraging behavior in C. elegans. Upon removal from food, C. elegans first executes a stereotypical local search behavior in which it explores a small area by executing many random, undirected reversals and turns called "reorientations." If the worm fails to find food, it transitions to a global search in which it explores larger areas by suppressing reorientations and executing long forward runs (Hills et al., 2004). At the population level, the reorientation rate declines gradually. Nevertheless, about 50% of individual worms appear to exhibit an abrupt transition between local and global search, which is evident as a discrete transition from high to low reorientation rate (Lopez-Cruz et al., 2019). This observation has given rise to the hypothesis that local and global search correspond to separate internal states with the possibility of sudden transitions between them (Calhoun et al., 2014). The main conclusion of the paper is that it is not necessary to posit distinct internal states to account for discrete transitions from high to low reorientation rates. On the contrary, discrete transitions can occur simply because of the stochastic nature of the reorientation behavior itself.

      Strengths:

      The strength of the paper is the demonstration that a more parsimonious model explains abrupt transitions in the reorientation rate.

      Weaknesses:

      (1) Use of the Gillespie algorithm is not well justified. A conventional model with a fixed dt and an exponentially decaying reorientation rate would be adequate and far easier to explain. It would also be sufficiently accurate - given the appropriate choice of dt - to support the main claims of the paper, which are merely qualitative. In some respects, the whole point of the paper - that discrete transitions are an epiphenomenon of stochastic behavior - can be made with the authors' version of the model having a constant reorientation rate (Figure 2f).

We apologize, but we are not sure what the reviewer means by “fixed dt”. If the reviewer means taking discrete steps in time (dt) and modeling whether a reorientation occurs at each step, we would argue that the Gillespie algorithm is a better way to do this because it provides floating-point time resolution rather than a resolution limited by dt, as we hope the explanation below makes clear.

      The reviewer is correct that discrete transitions are an epiphenomenon of stochastic behavior as we show in Figure 2f. However, abrupt stochastic jumps that occur with a constant rate do not produce persistent changes in the observed rate because it is by definition, constant. The theory that there are local and global searches is based on the observation that individual worms often abruptly change their rates. But this observation is only true for a fraction of worms. We are trying to argue that the reason why this is not observed for all, or even most worms is because these are the result of stochastic sampling, not a sudden change in search strategy.

      (2) In the manuscript, the Gillespie algorithm is very poorly explained, even for readers who already understand the algorithm; for those who do not it will be essentially impossible to comprehend. To take just a few examples: in Equation (1), omega is defined as reorientations instead of cumulative reorientations; it is unclear how (4) follows from (2) and (3); notation in (5), line 133, and (7) is idiosyncratic. Figure 1a does not help, partly because the notation is unexplained. For example, what do the arrows mean, what does "*" mean?

We apologize for this; you are correct, Ω is cumulative reorientations, and we will edit the text as follows:

      Experimentally, reorientation rate is measured as the number of reorientation events that occurred in an observational window. However, these are discrete stochastic events, so we should describe them in terms of propensity, i.e. the probability of observing a transitional event (in this case, a reorientation) is:

      Here, P(W+1,t) is the probability of observing a reorientation event at time t, and a<sub>1</sub> is the propensity for this event to occur. Observationally, the frequency of reorientations observed decays over time, so we can define the propensity as:

      Where α is the initial propensity at t=0.

      We can model this decay as the reorientation propensity coupled to a decaying factor (M):

      Where the propensity of this event (a<sub>2</sub>) is:

      Since M is a first-order decay process, when integrated, the cumulative M observed is:

      We can couple the probability of observing a reorientation to this decay by redefining (a<sub>1</sub> as:

      So that now:

      A critical detail should be noted. While reorientations are modeled as discrete events, the amount of M at time t\=0 is chosen to be large (M<sub>0</sub>←1,000), so that over the timescale of 40 minutes, the decay in M is practically continuous. This ensures that sudden changes in reorientations are not due to sudden changes in M, but due to the inherent stochasticity of reorientations.

      To model both processes, we can create the master equation:

      Since these are both Poisson processes, the probability density function for a state change i occurring in time t is:

      The probability that an event will not occur in time interval t is:

      The probability that no events will occur for ALL transitions in this time interval is:

      We can draw a random number (r<sub>1</sub> ∈[0,1]) that represents the probability of no events in time interval t, so that this time interval can be assigned by rearranging equation 11:

      where:

This is the time interval for any event (W+1 or M-1) happening at t + τ. The probability of which event occurs is proportional to its propensity:

We can draw a second number (r<sub>2</sub> ∈ [0,1]) that represents this probability, so that which event occurs at time t + τ is determined by the smallest n that satisfies:

      so that:

      The elegant efficiency of the Gillespie algorithm is two-fold. First, it models all transitions simultaneously, not separately. Second, it provides floating-point time resolution. Rather than drawing a random number, and using a cumulative probability distribution of interval-times to decide whether an event occurs at discrete steps in time, the Gillespie algorithm uses this distribution to draw the interval-time itself. The time resolution of the prior approach is limited by step size, whereas the Gillespie algorithm’s time resolution is limited by the floating-point precision of the random number that is drawn.
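As an illustration, the two-reaction scheme described above can be sketched as follows. The parameter values (alpha, gamma, M0) and the specific coupling a<sub>1</sub> = αM/M<sub>0</sub> are illustrative assumptions for this sketch, not the fitted values or exact equations from the manuscript:

```python
import math
import random

def simulate_forager(alpha=2.0, gamma=0.1, m0=1000, t_max=40.0, seed=0):
    """Gillespie simulation of reorientation events (W+1) coupled to a
    decaying factor M (M-1). Parameter values are illustrative only."""
    rng = random.Random(seed)
    t, m = 0.0, m0
    times = []  # times of reorientation events
    while t < t_max and m > 0:
        a1 = alpha * m / m0        # reorientation propensity, scaled by remaining M
        a2 = gamma * m             # first-order decay propensity of M
        a0 = a1 + a2
        r1 = 1.0 - rng.random()    # in (0, 1], avoids log(1/0)
        r2 = rng.random()
        t += math.log(1.0 / r1) / a0   # exponentially distributed waiting time tau
        if t >= t_max:
            break
        if r2 * a0 < a1:           # event choice is proportional to propensity
            times.append(t)        # reorientation: W <- W + 1
        else:
            m -= 1                 # decay: M <- M - 1
    return times
```

Because M starts large, its decay is effectively continuous, so the early part of the run produces frequent reorientations that thin out stochastically over time, as in the data.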

      We are happy to add this text to improve clarity.

      We apologize for the arrow notation confusion. Arrow notation is commonly used in pseudocode to indicate variable assignment, and so we used it to indicate variable assignment updates in the algorithm.

      We added Figure 2a to help explain the Gillespie algorithm for people who are unfamiliar with it, but you are correct, some notation, like probabilities, were left unexplained. We will address this to improve clarity.

      (3) In the model, the reorientation rate dΩ⁄dt declines to zero but the empirical rate clearly does not. This is a major flaw. It would have been easy to fix by adding a constant to the exponentially declining rate in (1). Perhaps fixing this obvious problem would mitigate the discrepancies between the data and the model in Figure 2d.

You are correct that the model deviates slightly at longer times, but this result is consistent with Klein et al., who show a continuous decline of reorientations. However, we could add a constant to the model, since an infinite run length is likely not physiological.

      (4) Evidence that the model fits the data (Figure 2d) is unconvincing. I would like to have seen the proportion of runs in which the model generated one as opposed to multiple or no transitions in reorientation rate; in the real data, the proportion is 50% (Lopez). It is claimed that the "model demonstrated a continuum of switching to non-switching behavior" as seen in the experimental data but no evidence is provided.

We should clarify that the 50% proportion cited by López-Cruz et al. was based on an arbitrary difference in slopes and on a visual assessment of the data. We sought to avoid this subjective assessment by plotting the distribution of slopes and transition times produced by the method used in López-Cruz et al. We should also clarify what we meant by "a continuum of switching and non-switching" behavior. Neither the transition-time distributions nor the slope-difference distributions appear to be drawn from two distinct distributions. This is unlike roaming and dwelling on food, where two distinct distributions of behavioral metrics can be identified based on speed and angular speed (Flavell et al, 2009, Fig S2a). We will add a permutation test to verify that the mean differences in slopes and transition times between the experiment and model are not significant.
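A minimal sketch of the kind of permutation test mentioned above (the data values here are illustrative placeholders, not the actual slope or transition-time measurements):

```python
import random

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                    # relabel under the null hypothesis
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)           # smoothed p-value
```

The same routine can be applied to variance differences by swapping the test statistic.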

      (5) The explanation for the poor fit between the model and data (lines 166-174) is unclear. Why would externally triggered collisions cause a shift in the transition distribution?

      Thank you, we should rewrite the text to clarify this better. There were no externally triggered collisions; 10 animals were used per experiment. They would occasionally collide during the experiment, but these collisions were excluded from the data that were provided. However, worms are also known to increase reorientations when they encounter a pheromone trail, and it is unknown (from this dataset) which orientations may have been a result of this phenomenon.

      (6) The discussion of Levy walks and the accompanying figure are off-topic and should be deleted.

      Thank you, we agree that this topic is tangential, and we will remove it.

      Reviewer #2 (Public review):

      Summary:

      In this study, the authors build a statistical model that stochastically samples from a time-interval distribution of reorientation rates. The form of the distribution is extracted from a large array of behavioral data, and is then used to describe not only the dynamics of individual worms (including the inter-individual variability in behavior), but also the aggregate population behavior. The authors note that the model does not require assumptions about behavioral state transitions, or evidence accumulation, as has been done previously, but rather that the stochastic nature of behavior is "simply the product of stochastic sampling from an exponential function".

      Strengths:

      This model provides a strong juxtaposition to other foraging models in the worm. Rather than evoking a behavioral transition function (that might arise from a change in internal state or the activity of a cell type in the network), or evidence accumulation (which again maps onto a cell type, or the activity of a network) - this model explains behavior via the stochastic sampling of a function of an exponential decay. The underlying model and the dynamics being simulated, as well as the process of stochastic sampling, are well described and the model fits the exponential function (Equation 1) to data on a large array of worms exhibiting diverse behaviors (1600+ worms from Lopez-Cruz et al). The work of this study is able to explain or describe the inter-individual diversity of worm behavior across a large population. The model is also able to capture two aspects of the reorientations, including the dynamics (to switch or not to switch) and the kinetics (slow vs fast reorientations). The authors also work to compare their model to a few others including the Levy walk (whose construction arises from a Markov process) to a simple exponential distribution, all of which have been used to study foraging and search behaviors.

      Weaknesses:

      This manuscript has two weaknesses that dampen the enthusiasm for the results. First, in all of the examples the authors cite where a Gillespie algorithm is used to sample from a distribution, be it the kinetics associated with chemical dynamics, or a Lotka-Volterra Competition Model, there are underlying processes that govern the evolution of the dynamics, and thus the sampling from distributions. In one of their references, for instance, the stochasticity arises from the birth and death rates, thereby influencing the genetic drift in the model. In these examples, the process governing the dynamics (and thus generating the distributions from which one samples) is distinct from the behavior being studied. In this manuscript, the distribution being sampled is the exponential decay function of the reorientation rate (lines 100-102). This appears to be tautological - a decay function fitted to the reorientation data is then sampled to generate the distributions of the reorientation data. That the model performs well and matches the data is commendable, but it is unclear how that could not be the case if the underlying function generating the distribution was fit to the data.

      Thank you, we apologize that this was not clearer. In the Lotka-Volterra model, the density of predators and prey are being modeled, with the underlying assumption that rates of birth and death are inherently stochastic. In our model, the number of reorientations are being modeled, with the assumption (based on the experiments), that the occurrence of reorientations is stochastic, just like the occurrence (birth) of a prey animal is stochastic. However, the decay in M is phenomenological, and we speculate about the nature of M later in the manuscript.

You are absolutely right that the decay function for M was fitted to the population average of reorientations and then sampled to generate the distributions of the reorientation data. This was intentional, to show that parameters chosen to match the population average would produce individual trajectories with stochastic "switching" comparable to that of the experimental data. All we are really trying to show is that observed sudden changes in reorientation that appear persistent can be produced by a stochastic process without resorting to binary state assignments. In Calhoun et al., 2014, it is reported that all animals produced switch-like behavior, but in Klein et al., 2017, it is reported that no animals showed abrupt transitions. López-Cruz et al. seem to show a mix of these results, which can be easily explained by an underlying stochastic process.

      The second weakness is somewhat related to the first, in that absent an underlying mechanism or framework, one is left wondering what insight the model provides. Stochastic sampling a function generated by fitting the data to produce stochastic behavior is where one ends up in this framework, and the authors indeed point this out: "simple stochastic models should be sufficient to explain observably stochastic behaviors." (Line 233-234). But if that is the case, what do we learn about how the foraging is happening? The authors suggest that the decay parameter M can be considered a memory timescale; which offers some suggestion, but then go on to say that the "physical basis of M can come from multiple sources". Here is where one is left for want: The mechanisms suggested, including loss of sensory stimuli, alternations in motor integration, ionotropic glutamate signaling, dopamine, and neuropeptides are all suggested: these are basically all of the possible biological sources that can govern behavior, and one is left not knowing what insight the model provides. The array of biological processes listed is so variable in dynamics and meaning, that their explanation of what governs M is at best unsatisfying. Molecular dynamics models that generate distributions can point to certain properties of the model, such as the binding kinetics (on and off rates, etc.) as explanations for the mechanisms generating the distributions, and therefore point to how a change in the biology affects the stochasticity of the process. It is unclear how this model provides such a connection, especially taken in aggregate with the previous weakness.

      Providing a roadmap of how to think about the processes generating M, the meaning of those processes in search, and potential frameworks that are more constrained and with more precise biological underpinning (beyond the array of possibilities described) would go a long way to assuaging the weaknesses.

      Thank you, these are all excellent points. We should clarify that in López-Cruz et al, they claim that only 50% of the animals fit a local/global search paradigm. We are simply proposing there is no need for designating local and global searches if the data don’t really support it. The underlying behavior is stochastic, so the sudden switches sometimes observed can be explained by a stochastic process where the underlying rate is slowing down, thus producing the persistently slow reorientation rate when an apparent “switch” occurs. What we hope to convey is that foraging doesn’t appear to follow a decision paradigm, but instead a gradual change in reorientations which for individual worms, can occasionally produce reorientation trajectories that appear switch-like.

As for M, you are correct, we should be more explicit. A decay in reorientation rate, rather than a sudden change, is consistent with observations made by López-Cruz et al. They found that the neurons AIA and ADE redundantly suppress reorientations, and that silencing either one was sufficient to restore the large number of reorientations during early foraging. The synaptic output of AIA and ADE was inhibited over long timescales (tens of minutes) by presynaptic glutamate binding to MGL-1, a slow G-protein-coupled receptor expressed in AIA and ADE. Their results support a model where sensory neurons suppress the synaptic output of AIA and ADE, which in turn leads to a large number of reorientations early in foraging. As time passes, glutamatergic input from the sensory neurons decreases, which leads to disinhibition of AIA and ADE and a subsequent suppression of reorientations.

      The sensory inputs into AIA and ADE are sequestered into two separate circuits, with AIA receiving chemosensory input and ADE receiving mechanosensory input. Since the suppression of either AIA or ADE is sufficient to increase reorientations, the decay in reorientations is likely due to the synaptic output of both of these neurons decaying in time. This correlates with an observed decrease in sensory neuron activity as well, so the timescale of reorientation decay could be tied to the timescale of sensory neuron activity, which in turn is influencing the timescale of AIA/ADE reorientation suppression. This implies that our factor “M” is likely the sum of several different sensory inputs decaying in time.

The molecular basis of which sensory neuron signaling factors contribute to decreased AIA and ADE activity is made more complicated by the observation that the glutamatergic input provided by the sensory neurons was not essential, and that additional factors besides glutamate contribute to the signaling to AIA and ADE. In addition, it is not only the sensory neuron activity that decays in time, but also the sensitivity of AIA and ADE to sensory neuron input. Simply depolarizing sensory neurons after the animals had starved for 30 minutes was insufficient to rescue the reorientation rates observed earlier in the foraging assay. This observation could be due to decreased presynaptic vesicle release and/or decreased receptor localization on the postsynaptic side.

      In summary, there are two neuronal properties that appear to be decaying in time. One is sensory neuron activity, and the other is decreased potentiation of presynaptic input onto AIA and ADE. Our factor “M” is a phenomenological manifestation of these numerous decaying factors.

      Reviewer #3 (Public review):

      Summary:

      This intriguing paper addresses a special case of a fundamental statistical question: how to distinguish between stochastic point processes that derive from a single "state" (or single process) and more than one state/process. In the language of the paper, a "state" (perhaps more intuitively called a strategy/process) refers to a set of rules that determine the temporal statistics of the system. The rules give rise to probability distributions (here, the probability for turning events). The difficulty arises when the sampling time is finite, and hence, the empirical data is finite, and affected by the sampling of the underlying distribution(s). The specific problem being tackled is the foraging behavior of C. elegans nematodes, removed from food. Such foraging has been studied for decades, and described by a transition over time from 'local'/'area-restricted' search'(roughly in the initial 10-30 minutes of the experiments, in which animals execute frequent turns) to 'dispersion', or 'global search' (characterized by a low frequency of turns). The authors propose an alternative to this two-state description - a potentially more parsimonious single 'state' with time-changing parameters, which they claim can account for the full-time course of these observations.

      Figure 1a shows the mean rate of turning events as a function of time (averaged across the population). Here, we see a rapid transient, followed by a gradual 4-5 fold decay in the rate, and then levels off. This picture seems consistent with the two-state description. However, the authors demonstrate that individual animals exhibit different "transition" statistics (Figure 1e) and wish to explain this. They do so by fitting this mean with a single function (Equations 1-3).

      Strengths:

      As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.

      Weaknesses:

      (1) The authors claim that only about half the animals tested exhibit discontinuity in turning rates. Can they automatically separate the empirical and model population into these two subpopulations (with the same method), and compare the results?

Thank you, we should clarify that the observation that about half the animals exhibit discontinuity was made not by us, but by López-Cruz et al. The observed fraction of 50% was based on a visual assessment of the dual regression method we described. To make the process more objective, we decided to simply plot the distributions of the metrics they used for this assessment to see if two distinct populations could be observed. However, the distributions of slope differences and transition times do not produce two distinct populations. Our stochastic approach, which does not assume abrupt state transitions, also produces comparable distributions. To quantify this, we will perform permutation tests on the differences in means and variances between experimental and model data.

      (2) The equations consider an exponentially decaying rate of turning events. If so, Figure 2b should be shown on a semi-logarithmic scale.

      We are happy to add this panel as well.
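As a quick numeric illustration (with placeholder values for α and γ, not the fitted parameters), an exponentially decaying rate becomes a straight line once log-transformed, which is exactly what the requested semi-logarithmic panel would show:

```python
import math

# Placeholder parameters for the decaying reorientation rate alpha * exp(-gamma * t)
alpha, gamma = 2.0, 0.1
ts = [0.0, 10.0, 20.0, 30.0, 40.0]

# On a semi-log plot the rate is a straight line:
# log(alpha * exp(-gamma * t)) = log(alpha) - gamma * t
log_rates = [math.log(alpha * math.exp(-gamma * t)) for t in ts]

# Successive differences of the log-rate are constant: slope = -gamma per unit time
diffs = [b - a for a, b in zip(log_rates, log_rates[1:])]
```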

      (3) The variables in Equations 1-3 and the methods for simulating them are not well defined, making the method difficult to follow. Assuming my reading is correct, Omega should be defined as the cumulative number of turning events over time (Omega(t)), not as a "turn" or "reorientation", which has no derivative. The relevant entity in Figure 1a is apparently <Omega (t)>, i.e. the mean number of events across a population which can be modelled by an expectation value. The time derivative would then give the expected rate of turning events as a function of time.

      Thank you, you are correct. Please see response to Reviewer #1.

      (4) Equations 1-3 are cryptic. The authors need to spell out up front that they are using a pair of coupled stochastic processes, sampling a hidden state M (to model the dynamic turning rate) and the actual turn events, Omega(t), separately, as described in Figure 2a. In this case, the model no longer appears more parsimonious than the original 2-state model. What then is its benefit or explanatory power (especially since the process involving M is not observable experimentally)?

      Thank you, yes we see how as written this was confusing. In our response to Reviewer #1, we added an important detail:

While reorientations are modeled as discrete events, which is observationally true, the amount of M at time t = 0 is chosen to be large (M<sub>0</sub> ← 1,000), so that over the timescale of 40 minutes, the decay in M is practically continuous. This ensures that sudden changes in reorientations are not due to sudden changes in M, but to the inherent stochasticity of reorientations.

However, you are correct that if M were given a binary value of 0 or 1, this would indeed be the two-state model. Adding this as an additional model to compare against the experimental data is a good idea, and we are happy to add it.

      (5) Further, as currently stated in the paper, Equations 1-3 are only for the mean rate of events. However, the expectation value is not a complete description of a stochastic system. Instead, the authors need to formulate the equations for the probability of events, from which they can extract any moment (they write something in Figure 2a, but the notation there is unclear, and this needs to be incorporated here).

      Thank you, yes please see our response to Reviewer #1.

      (6) Equations 1-3 have three constants (alpha and gamma which were fit to the data, and M0 which was presumably set to 1000). How does the choice of M0 affect the results?

      Thank you, this is a good question. We will test this down to a binary state of M as mentioned in comment #4.

      (7) M decays to near 0 over 40 minutes, abolishing omega turns by the end of the simulations. Are omega turns entirely abolished in worms after 30-40 minutes off food? How do the authors reconcile this decay with the leveling of the turning rate in Figure 1a?

Yes, Reviewer #1 recommended adding a baseline reorientation rate, which is likely more biologically plausible. However, we should also note that Klein et al. observed a continuous decay over 50 minutes.

      (8) The fit given in Figure 2b does not look convincing. No statistical test was used to compare the two functions (empirical and fit). No error bars were given (to either). These should be added. In the discussion, the authors explain the discrepancy away as experimental limitations. This is not unreasonable, but on the flip side, makes the argument inconclusive. If the authors could model and simulate these limitations, and show that they account for the discrepancies with the data, the model would be much more compelling. To do this, I would imagine that the authors would need to take the output of their model (lists of turning times) and convert them into simulated trajectories over time. These trajectories could be used to detect boundary events (for a given size of arena), collisions between individuals, etc. in their simulations and to see their effects on the turn statistics.

      Thank you, we will add error bars and perform a permutation test on the mean and variance differences between experiment and model over the 40 minute window.

      (9) The other figures similarly lack any statistical tests and by eye, they do not look convincing. The exception is the 6 anecdotal examples in Figure 2e. Those anecdotal examples match remarkably closely, almost suspiciously so. I'm not sure I understood this though - the caption refers to "different" models of M decay (and at least one of the 6 examples clearly shows a much shallower exponential). If different M models are allowed for each animal, this is no longer parsimonious. Are the results in Figure 2d for a single M model? Can Figure 2e explain the data with a single (stochastic) M model?

Thank you, yes, we will perform permutation tests on the mean and variance differences in the observed distributions in Figure 2d. We certainly don't want the panels in Figure 2e to be suspicious! These comparisons were drawn by calculating the correlations between all model traces and all experimental traces, and then choosing the top hits. Every time we run the simulation, we arrive at a different set of examples. Since it was recommended that we add a baseline rate, these examples will be a completely different set when we run the simulation again.

      We apologize for the confusion regarding M. Since the worms do not all start out with identical reorientation rates, we drew the initial M value from a distribution centered on M0 and a variance to match the initial distribution of observed experimental rates.

      (10) The left axes of Figure 2e should be reverted to cumulative counts (without the normalization).

      Thank you, we will add this. We want to clarify that we normalized it because we chose these examples based on correlation to show that the same types of sudden changes in search strategy can occur with a model that doesn’t rely on sudden rate changes.

      (11) The authors give an alternative model of a Levy flight, but do not give the obvious alternative models:

      a) the 1-state model in which P(t) = alpha exp (-gamma t) dt (i.e. a single stochastic process, without a hidden M, collapsing equations 1-3 into a single equation).

      b) the originally proposed 2-state model (with 3 parameters, a high turn rate, a low turn rate, and the local-to-global search transition time, which can be taken from the data, or sampled from the empirical probability distributions). Why not? The former seems necessary to justify the more complicated 2-process model, and the latter seems necessary since it's the model they are trying to replace. Including these two controls would allow them to compare the number of free parameters as well as the model results. I am also surprised by the Levy model since Levy is a family of models. How were the parameters of the Levy walk chosen?

      Thank you, we will remove this section completely, as it is tangential to the main point of the paper.

      (12) One point that is entirely missing in the discussion is the individuality of worms. It is by now well known that individual animals have individual behaviors. Some are slow/fast, and similarly, their turn rates vary. This makes this problem even harder. Combined with the tiny number of events concerned (typically 20-40 per experiment), it seems daunting to determine the underlying model from behavioral statistics alone.

      Thank you, yes we should have been more explicit in the reasoning behind drawing the initial M from a distribution (response to comment #9). We assume that not every worm starts out with the same reorientation rate, but that some start out fast (high M) and some start out slow (low M). However, we do assume M decays with the same kinetics, which seems sufficient to produce the observed phenomena.

      (13) That said, it's well-known which neurons underpin the suppression of turning events (starting already with Gray et al 2005, which, strangely, was not cited here). Some discussion of the neuronal predictions for each of the two (or more) models would be appropriate.

      Thank you, yes we will add Gray et al, but also the more detailed response to Reviewer #2.

      (14) An additional point is the reliance entirely on simulations. A rigorous formulation (of the probability distribution rather than just the mean) should be analytically tractable (at least for the first moment, and possibly higher moments). If higher moments are not obtainable analytically, then the equations should be numerically integrable. It seems strange not to do this.

      Thank you for suggesting this, we will add these analyses.

      In summary, while sample simulations do nicely match the examples in the data (of discontinuous vs continuous turning rates), this is not sufficient to demonstrate that the transition from ARS to dispersion in C. elegans is, in fact, likely to be a single 'state', or this (eq 1-3) single state. Of course, the model can be made more complicated to better match the data, but the approach of the authors, seeking an elegant and parsimonious model, is in principle valid, i.e. avoiding a many-parameter model-fitting exercise.

      As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.

      Thank you, we agree that this is a generic phenomenon, which is partly why we did this. The data from López-Cruz seem to agree in part with Calhoun et al, that claim abrupt transitions occur, and Klein et al, which claim they do not occur. Since the underlying phenomenon is stochastic, we propose the mixed observations of sudden and gradual changes in search strategy are simply the result of a stochastic process, which can produce both phenomena for individual observations.

Author Response

      Reviewer 1:

      Comment 1.1: The distinction of PIGS from nearby OPA, which has also been implied in navigation and ego-motion, is not as clear as it could be.

Response 1.1: The main functional distinction between TOS/OPA and PIGS is that TOS/OPA responds preferentially to moving vs. stationary stimuli (even concentric rings), likely due to its overlap with the retinotopic motion-selective visual area V3A, for which this is a defining functional property (e.g. Tootell et al., 1997, J Neurosci). In comparison, PIGS does not show such motion selectivity. Instead, PIGS responds preferentially to more complex forms of motion within scenes. In this revision, we tried to better highlight this point in the Discussion (see also the response to the first comment from Reviewer #2).

      Reviewer 2:

      Comment 2.1: First, the scene-selective region identified appears to overlap with regions that have previously been identified in terms of their retinotopic properties. In particular, it is unclear whether this region overlaps with V7/IPS0 and/or IPS1. This is particularly important since prior work has shown that OPA often overlaps with v7/IPS0 (Silson et al, 2016, Journal of Vision). The findings would be much stronger if the authors could show how the location of PIGS relates to retinotopic areas (other than V6, which they do currently consider). I wonder if the authors have retinotopic mapping data for any of the participants included in this study. If not, the authors could always show atlas-based definitions of these areas (e.g. Wang et al, 2015, Cerebral Cortex).

Response 2.1: We thank the reviewers for reminding us to more clearly delineate this issue of possible overlap, including the information provided by Silson et al., 2016. The issue of possible overlap between area TOS/OPA and the retinotopic visual areas, both in humans and non-human primates, was also clarified by our team in 2011 (Nasr et al., 2011). As you can see in the enclosed figure, and consistent with those previous studies, TOS/OPA overlaps with visual areas V3A/B and V7, whereas PIGS is located more dorsally, close to IPS2-4. As shown here, there is no overlap between PIGS and TOS/OPA, nor between PIGS and areas V3A/B and V7. To more directly address the reviewer's concern, in the next revision, we will show the relative position of PIGS and the retinotopic areas (at least) in one individual subject.

      Author response image 1.

The relative location of PIGS, TOS/OPA and the retinotopic visual areas. The left panel shows the result of high-resolution (7T; voxel size = 1 mm; no spatial smoothing) polar angle mapping in one individual. The right panel shows the location of scene-selective areas PIGS and TOS/OPA in the same subject (7T; voxel size = 1 mm; no spatial smoothing). While area TOS/OPA shows some overlap with the retinotopic visual areas V3A/B and V7, PIGS shows partial overlap with area IPS2-4. In both panels, the activity maps are overlaid on the subjects' own reconstructed brain surface.

      Comment 2.2: Second, recent studies have reported a region anterior to OPA that seems to be involved in scene memory (Steel et al, 2021, Nature Communications; Steel et al, 2023, The Journal of Neuroscience; Steel et al, 2023, biorXiv). Is this region distinct from PIGS? Based on the figures in those papers, the scene memory-related region is inferior to V7/IPS0, so characterizing the location of PIGS to V7/IPS0 as suggested above would be very helpful here as well. If PIGS overlaps with either of V7/IPS0 or the scene memory-related area described by Steel and colleagues, then arguably it is not a newly defined region (although the characterization provided here still provides new information).

      Response 2.2: The lateral-place memory area (LPMA) is located on the lateral brain surface, anterior relative to the IPS (see Figure 1 from Steel et al., 2021 and Figure 3 from Steel et al., 2023). In contrast, PIGS is located on the posterior brain surface, also posterior relative to the IPS. In other words, they are located on two different sides of a major brain sulcus. In this revision we have clarified this point, including the citations by Steel and colleagues.

      Comments 2.3: Another reason that it would be helpful to relate PIGS to this scene memory area is that this scene memory area has been shown to have activity related to the amount of visuospatial context (Steel et al, 2023, The Journal of Neuroscience). The conditions used to show the sensitivity of PIGS to ego-motion also differ in the visuospatial context that can be accessed from the stimuli. Even if PIGS appears distinct from the scene memory area, the degree of visuospatial context is an alternative account of what might be represented in PIGS.

      Response 2.3: The reviewer raises an interesting point. One minor confusion is that we may be inadvertently referring to two slightly different types of “visuospatial context”. Specifically, the stimuli used in the ego-motion experiment here (i.e. coherently vs. incoherently changing scenes) represent the same scenes, and the only difference between the two conditions is the sequence of images across the experimental blocks. In that sense, the two experimental conditions may be considered to have the same visuospatial context. However, it could also be argued that the coherently changing scenes provide more information about the environmental layout. In that case, considering the previous reports that PPA/TPA and RSC/MPA may also be involved in layout encoding (Epstein and Kanwisher 1998; Wolbers et al. 2011), we expected to see more activity within those regions in response to coherently compared to incoherently changing scenes. These issues are now more explicitly discussed in the revised article.

      Reviewer 3:

      Comment 3.1: There are few weaknesses in this work. If pressed, I might say that the stimuli depicting ego-motion do not, strictly speaking, depict motion, but only apparent motion between 2s apart photographs. However, this choice was made to equate frame rates and motion contrast between the 'ego-motion' and a control condition, which is a useful and valid approach to the problem. Some choices for visualization of the results might be made differently; for example, outlines of the regions might be shown in more plots for easier comparison of activation locations, but this is a minor issue.

      Response 3.1: We thank the reviewer for these constructive suggestions, and we agree with their comment that the ego-motion stimuli are not smooth, even though they were refreshed every 100 ms. However, the stimuli were nevertheless coherent enough to activate areas V6 and MT, two major areas known to respond preferentially to coherent compared to incoherent motion.

      Epstein, R., and N. Kanwisher. 1998. 'A cortical representation of the local visual environment', Nature, 392: 598-601.

      Wolbers, T., R. L. Klatzky, J. M. Loomis, M. G. Wutte, and N. A. Giudice. 2011. 'Modality-independent coding of spatial layout in the human brain', Curr Biol, 21: 984-9.

    1. I liked MyFitnessPal the most and deemed the other ones ineffective for me because I can just easily scan things in and it was just easy to track things, easy to put it in. I can create my own foods. I would say it's probably its accessibility and easiness.

      Old information! MyFitnessPal implemented an update where the barcode scanner is now only available with the paid version. I had the paid version for a while but now that I have the free version, I'll usually start typing in what I ate, find the closest match, and adjust the serving to make sure calories are the most accurate measurement I get; however, this method creates high inaccuracy for micronutrients.

    1. We can likewise do the same in reverse by grabbing all indices up to the first index. In other words, the item in index 0.

      Unclear: "all indices up to the first index" can only refer to the single index 0, which makes the phrasing nonsensical.
      - If it means we can grab all indices from the beginning of a list up to a certain index, then the example should be e.g. print(first_list[:2]) and the description should be changed accordingly.
      - If it means we can grab indices from behind (right to left) down to the first index (0), the example/syntax given is wrong; it should be print(first_list[::-1]). This seems more likely, given that the description would then make sense as it stands and the mistake would only concern two characters in the example. But slice notation should be better introduced in that case.
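Both readings of the tutorial sentence can be checked directly in standard Python (first_list is assumed here to be a short example list, since the tutorial's actual list isn't shown):

```python
first_list = ["a", "b", "c", "d"]  # stand-in for the tutorial's example list

# Reading 1: slice from the start up to (but not including) a given index
print(first_list[:2])    # ['a', 'b']

# Reading 2: reverse the list with a negative step
# ("grabbing indices from right to left")
print(first_list[::-1])  # ['d', 'c', 'b', 'a']

# "The item in index 0" is plain indexing, not a slice
print(first_list[0])     # a
```

Either way, the example and the prose should describe the same operation.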

    1. Driven by Dealers. Powered by Process. How ReconCash Was Born “It started with a $175 PSI — and a $3,200 mistake that wasn’t supposedly covered.” After years working for the nation’s largest auto auction company, I kept seeing the same thing: dealers paying for post-sale inspections that did little to fully protect them. Claims were denied. Cosmetic issues were missed. And good dealers — honest buyers — were left eating losses they shouldn’t have had to absorb. I watched smart operators lose money not because they didn’t do their homework — but because the system wasn’t built to protect them. It was built to protect the auction and consignor. They are motivated by one thing, to keep the deal together to collect buy and sell fees. That’s when I knew something had to change. We Flipped the Model I left the corporate auction world and started building the system I wish dealers had from day one: One that actually worked for the buyer, The buyer pays by far the highest auction fees including PSI and Simulcast fees. One that covered cosmetic and mechanical issues, One that included end-to-end arbitration handling — so dealers didn’t have to waste hours chasing cases or sitting on hold. Arbitrations are handled inconsistently by arbitrators and we hold them accountable to treat the little guy like the big guy. That system became ReconCash. Built for Scale. Designed for Independence. From the beginning, we knew this had to scale nationally for consistency. We created a licensing model that allows independent Inspection operators to run ReconCash territories, using our platform, branding, training, and support. Every territory is exclusive, local, and dealer-focused — because real trust is built locally. Today, ReconCash helps dealers recover thousands per month, and we’re just getting started.
Our mission is simple: To protect dealers and buyers with fair, accurate, and independent arbitration inspections—delivered through a trusted national network of licensed operators.

      Some of the origin story was confusing and I suggested some improvements so this keeps the focus on how ReconCash addresses the problem with a solution from the perspective of the end users instead of ReconCash.

      Driven by Dealers. Powered by Process. How ReconCash Was Born

      ReconCash was born from a $175 post-sale inspection (PSI) and a $3,200 cost for damage that the auction failed to disclose.

      In the traditional model, auctions, as the sellers, are responsible for documenting all damage in a PSI. Cosmetic and mechanical issues are often overlooked, leaving dealers who buy cars to review, file claims, and navigate a limited arbitration period to recover losses. When claims are denied or missed, dealer margins are unfairly impacted.

      Dealers weren’t losing money due to poor diligence—they were losing because the system wasn’t built to protect them. Auctions are motivated to complete deals and collect fees, not necessarily to ensure every issue is disclosed.

      The problem was clear: dealers needed a system that ensures fair transactions and simplifies the arbitration process. That system became ReconCash.

      A Dealer-First Approach

      ReconCash flips the traditional model:

      Protecting dealers who buy — covering both cosmetic and mechanical issues.

      Simplifying arbitration — we guide the dealer through the claim period, helping them identify overlooked damage and recover funds efficiently.

      Creating fair transactions — ensuring auctions fulfill their responsibility while safeguarding dealer margins.

      Built for Scale. Designed for Independence

      ReconCash was designed to scale nationally while keeping operations local. Our licensing model empowers independent inspection operators to run exclusive ReconCash territories using our platform, branding, training, and support. Every territory is dealer-focused because trust is built locally.

      Today, ReconCash helps dealers recover thousands per month—and we’re just getting started.

      Our Mission

      To protect dealers and buyers with fair, accurate, and independent arbitration inspections, delivered through a trusted national network of licensed operators.

    1. In what ways have you found social media bad for your mental health and good for your mental health?

      Some ways I have found social media to be bad for my mental health is when I began to have it consume my daily life. When I happen to doom scroll all day instead of doing my homework, that leads to me being mentally stressed out because I didn't do my homework. But a way that social media is good for my mental health is that it encourages and shows me how to do self care, whether that's through recommending movies, showing new ways to cook, listening to music, etc.

    2. In what ways have you found social media bad for your mental health and good for your mental health?

      Like chapter 13.2 says, social media is bad for people's mental health when trauma dumping happens. People express negative emotions on the internet as a way to vent, but this behavior can bring negative emotions to others. And there are also some negative communities that are bad for people's mental health. These two are what I can truly feel on the internet. There are also good communities on the internet. These communities can encourage people and bring positive emotions. When I use social media, some information also makes me feel relaxed.

    3. I would admit I'm hugely addicted to social media. It had both positive and negative influences on me. I would be super anxious after posting my photos on the account. I can't help but check how many people liked or commented on my post every minute. The results might reflect how popular I am, and I would associate the likes and comments with my self-validation. I figured out that this idea was a bit ridiculous, so I regarded it as a tool just to make contact with my friends. I feel more comfortable with social media right now and consider it a way to bond with my long-distance friends, which is much better for my mental health.

    1. Requirements Strong work ethic and accountability Commitment to excellence and taking responsibility for results Basic inspection or mechanical knowledge (or ability to hire) Technical foundation or willingness to bring on qualified staff Commitment to brand standards and customer service Dedication to maintaining our quality standards and customer experience

      Right now, the “Requirements” section mostly lists soft skills and general attributes, but it’s missing the practical, operator-specific requirements that are critical for success in this model. It should reflect both personal qualities and business readiness.

      Here’s a rewritten version that includes the missing elements while keeping it concise and scannable:

      Territory Owner Requirements

      Strong Work Ethic & Accountability Take ownership of your territory, follow our proven systems, and deliver consistent results.

      Relevant Experience or Team Background in auto inspections, mechanics, or the ability to hire qualified staff.

      Financial Readiness Access to sufficient capital to operate your territory until revenue ramps up.

      Local Market Knowledge & Relationships Familiarity with your territory and connections to dealers to quickly grow your inspection pipeline.

      Commitment to Brand Standards & Customer Service Dedication to maintaining our quality standards, professionalism, and a positive customer experience.

    2. Behaviors in Action Integrity: Always document findings with evidence; never cut corners. Dealer-Centric: Explain inspection results in plain language. Ownership: Treat dealer disputes with urgency, as if your name is on it Consistency: Follow national SOPs for inspections and communication. Growth: Share best practices with peers; mentor new inspectors. Why Culture Matters to Licensees Protects the brand you are investing in. Creates repeat dealer trust, which drives recurring revenue. Makes your business scalable — inspectors and admins know “the ReconCash way.” Positions you as part of a respected national standard. Cultural Pillars Integrity First Every inspection, every claim, every conversation is built on honesty. We protect both dealers and buyers by standing neutral and consistent. Dealer-Centric Service Deliver inspections that are clear, professional, and transparent. Make it easy for dealers to trust the process and return to ReconCash. Ownership Mindset Territory owners run their businesses with pride and accountability. We treat every inspection as if our own reputation is on the line. Consistency Nationwide Local independence, but one standard of fairness across all 272 territories. Reports, processes, and customer experience feel seamless nationwide. People Growth Train, mentor, and develop inspectors and admins for long-term success. Provide opportunity to grow into leadership and territory ownership.

      This section reads more like internal culture and values for employees or inspectors than a page aimed at prospective territory owners. On the operator page, the focus should be on their business opportunity, support, and what they gain, not the internal culture in so much detail.

      Here’s how to improve it:

      Operate with Integrity, Build Dealer Trust, and Grow Your Business

      Running a successful ReconCash territory isn’t just about inspections — it’s about building a reputation, creating repeat business, and scaling your operations.

      Key Principles for Operators:

      Integrity & Reliability Deliver honest, accurate inspections every time. Consistency builds dealer trust and strengthens your reputation.

      Dealer-Centric Service Communicate inspection results clearly and professionally. Happy dealers return for repeat inspections, driving recurring revenue.

      Operational Consistency Use our proven systems, SOPs, and branded processes to run your business efficiently. Scalability comes naturally when your team knows “the ReconCash way.”

      Growth & Leadership Leverage training, tools, and support to expand your network, develop your team, and increase your revenue potential.

      Why These Principles Matter to You:

      Protects the value of the territory you’re investing in.

      Encourages repeat business from dealers, which fuels revenue.

      Makes your business scalable and easier to manage.

      Positions you as part of a trusted national standard.

    3. Apply Now to Reserve Your Area →

      We ask for experience in the 'reserve your area' form (but the values are years) and I think you'll want to ask for skills (mechanic, auto body, current inspector).

  2. social-media-ethics-automation.github.io
    1. Digital detox. November 2023. Page Version ID: 1187412856. URL:

      This wikipedia entry explains that a digital detox is a detox from devices such as smartphones, computers, iPads. I think that this would be really good for people, or even just a social media detox. The entry also explained that it is for a certain period of time. So it could just be one day, see how many times you reach for it. I've tried this and it feels terrible but it's why I try a social media detox.

    2. Merriam-Webster. On ‘Doomsurfing’ and ‘Doomscrolling’. 2023. URL: https://www.merriam-webster.com/wordplay/doomsurfing-doomscrolling-words-were-watching (visited on 2023-12-08).

      While the connotations of doom are quite negative, the content we consume while doomscrolling might actually be cheerful in nature. As such, the definition of doomscrolling may not necessarily denote the content that is consumed, but instead the state or potential future of someone who doomscrolls. This has parallels to the label 'doomer' that used to be used online to describe people who are hopeless about their current life circumstances and have instead decided to not do anything about it.

    3. Anya Kamenetz. Facebook's own data is not as conclusive as you think about teens and mental health. NPR, October 2021. URL: https://www.npr.org/2021/10/06/1043138622/facebook-instagram-teens-mental-health (visited on 2023-12-08).

      Platforms like Instagram encourage people to compare their lives to unrealistic images. Even though Facebook is aware of these negative impacts, it hasn't made any real changes. I don't think social media is entirely bad, but the constant pressure to maintain a perfect image and get likes can easily damage a person's self-confidence. Instead of persuading teenagers to abandon social media, we should focus on teaching them how to use it healthily and helping them understand that what they see online isn't always true.

    1. We don’t get paid unless you do. Under our revenue share model, there’s no cost if the claim is denied, but we are highly motivated to escalate and exhaust all options.

      Better? You take no risk. Our proven methods make us highly successful at getting claims reimbursed.

    2. Get paid for missed damages — Without the Runaround.

      Perhaps this works with dealers but it seems that one of these options is better:

      Get reimbursed for missed vehicle damage — without the hassle that keeps most claims unpaid.

      Recover money for overlooked vehicle damage and turn every car into more profit — without the runaround.

      Most missed damage claims go unpaid. We fix that — fast, fair, and frustration-free.

      Stop losing money on overlooked damage. We help you get reimbursed — quickly and cleanly.

      Missed damage shouldn’t mean missed profit. We make reimbursements simple and fast.

    1. Many have anecdotal experiences with their own mental health and those they talk to. For example, cosmetic surgeons have seen how photo manipulation on social media has influenced people’s views of their appearance:

      I think social media can both connect people and hurt them. It helps users express themselves and find communities, but it also increases comparison and self-doubt. Many surveys indicate that social media has a significant impact on people's daily lives, but the overall influence depends on how people use these platforms and whether they can balance their online and offline lives.

    2. 13.1.1. Digital Detox?

      I find the concept of "digital detox" interesting, but also overly idealistic. The article argues that simply viewing social media as harmful oversimplifies the complexities of reality, and I agree with that. As a person who uses social media daily, I find complete abstinence unrealistic. Instead, I believe it's important to recognize how platforms manipulate our attention and emotions. The issue isn't just about the tools themselves, but also how we use them. This perspective is more practical than simply labeling technology as "bad."

    3. You know, it forces kids to not just live their experience but be nostalgic for their experience while they’re living it, watch people watch them, watch people watch them watch them.

      This sentence is really interesting because it reminds me of how many people's first instinct when restaurant food comes is to take a picture of it. Even while having fun and being excited for food, the first instinct is to take a picture to remember it/show it to people. It's almost like having an extra observer in your brain judging everything.

    1. “Incel” [m19] is short for “involuntarily celibate,” meaning they are men who have centered their identity on wanting to have sex with women, but with no women “giving” them sex. Incels objectify women and sex, claiming they have a right to have women want to have sex with them. Incels believe they are being unfairly denied this sex because of the few sexually attractive men (”Chads” [m20]), and because feminism told women they could refuse to have sex. Some incels believe their biology (e.g., skull shape) means no women will “give” them sex. They will be forever alone, without sex, and unhappy. The incel community has produced multiple mass murderers and terrorist attacks [m21].

      The idea of an incel has become more and more widespread with the invention of the internet, giving people who believe such things an echo chamber to reinforce their ideas. It is possible, then, for incels to have existed before the invention of the internet. While the barriers to becoming an incel are low, incel ideology and the community might have been more difficult to come by, as the ideas they held were more widely looked down upon. This might have made it difficult for incels to openly express their ideas when attempting to find similar people without facing ridicule.

    1. Reviewer #1 (Public review):

      Summary:

      Fogel & Ujfalussy report an extension of a visualization tool that was originally designed to enable an understanding of detailed biophysical neuron models. Named "extended currentscape", this new iteration enables visual assessment of individual currents across a neuron's spatially extended dendritic arbor with simultaneous readout of somatic currents and voltage. The overall aim was to permit a visually intuitive understanding for how a model neuron's inputs determine its output. This goal was worthwhile and the authors achieved it. Their manuscript makes two additional contributions of note: (1) a clever algorithmic approach to model the axial propagation of ionic currents (recursively traversing acyclic graph subsections) and (2) interesting, albeit not easily testable, insights into important neurophysiological phenomena such as complex spike generation and place field dynamics. Overall, this study provides a valuable and well-characterized biophysical modeling resource to the neuroscience community.

      Strengths:

      The authors significantly extended a previously published open-source biophysical modeling tool. Beyond providing important new capabilities, the potential impact of "extended currentscape" is boosted by its integration with preexisting resources in the field.

      The code is well-documented and freely available via GitHub.

      The authors' clever partitioning algorithm to relate dendritic/synaptic currents to somatic output yielded multiple intriguing observations regarding when and why CA1 pyramidal neurons fire complex spikes versus single action potentials. This topic carries major implications for how the hippocampus represents and stores information about an animal's environment.

      Weaknesses:

      While extended currentscape is clearly a valuable contribution to the neuroscience community, this reviewer would argue that it is framed in a way that oversells its capabilities. The Abstract, Introduction, Results, and Methods all contain phrases implying that extended currentscape infers dendritic/synaptic currents contributing to somatic output, i.e., backwards inference of unknown inputs from a known output. This is not the case; inputs are simulated and then propagated through the model neuron using a clever partitioning algorithm that essentially traverses a biologically undirected graph structure by treating it like a time series of tiny directed graphs. This is an impressive solution, but it does not infer a neuron's input structure.

      Because a directed acyclic graph architecture is shown in Figure 2, it is unintuitive that the authors can infer bidirectional current flow, e.g. Figure 3 showing current flowing from basal dendrites and axon to soma, and further towards the apical dendrites. This is explained in Methods, but difficult to parse from Results amidst lots of rather abstract jargon (target, reference, collision, compartment). Figure 2 would have presented an opportunity to clearly illustrate the authors' partitioning algorithm by (1) rooting it in the exact morphology of one of their multicompartmental model neurons and (2) illustrating that "target" and "reference" have arbitrary morphological meanings; they describe the direction of current flow, which is reevaluated at each time step.

      Analyses in Figure 7, C and D, are insightfully devised and illuminating. However, they could use some reconciliation with Figure 5 regarding initiation of individual APs versus CSBs within place fields.

      The intriguing observations generated by extended currentscape also point to its main weakness, which the authors openly acknowledge: as of now, no experimental methods exist to conclusively test its predictions.

    2. Reviewer #2 (Public review):

      Summary:

      The electrical activity of neurons and neuronal circuits is dictated by the concerted activity of multiple ionic currents. Because directly investigating these currents experimentally isn't possible with current methods, researchers rely on biophysical models to develop hypotheses and intuitions about their dynamics. Models of neural activity produce large amounts of data that are hard to visualize and interpret. The currentscape technique helps visualize the contributions of currents to membrane potential activity, but it's limited to model neurons without spatial properties. The extended currentscape technique overcomes this limitation by tracking the contributions of the different currents from distant locations. This extension allows tracking not only the types of currents that contribute to the activity in a given location, but also visualizing the spatial region where the currents originate. The method is applied to study the initiation of complex spike bursts in a model hippocampal place cell.

      Strengths:

      The visualization method introduced in this work represents a significant improvement over the original currentscape technique. The extended currentscape method enables investigation of the contributions of currents in spatially extended models of neurons and circuits.

      Weaknesses:

      The case study is interesting and highlights the usefulness of the visualization method. A simpler case study may have been sufficient to exemplify the method, while also allowing readers to compare the visualizations against their own intuitions of how currents should flow in a simpler setting.

    3. Author response:

      We are very pleased to hear the overall positive views and constructive criticisms of eLife Editors and Reviewers on our work. In particular, we appreciate their global assessment that the work offers a valuable tool for neuroscientists to visualize and assess dendritic computations.

      We will clarify in a revised version of the manuscript that we do not infer the synaptic inputs of the neuron. Also, we will add a new simulation with a simpler morphology to illustrate the method under more intuitive conditions. We will also clarify the meaning of the "target" and "reference" compartments. These labels do not depend on the direction of the current flow; rather, we can freely choose any compartment to be the target, and the axial currents will then be evaluated relative to that compartment in each time step.
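The target/reference bookkeeping discussed here can be illustrated with a toy sketch (illustrative only, not the authors' code; the compartment names and current values below are invented): on an acyclic compartment tree, every current generated within a subtree must cross the single edge connecting that subtree to the chosen target compartment, so the signed edge currents partition the target's axial input by subtree at each time step.

```python
# Toy illustration of per-subtree current partitioning (invented compartment
# names and current values; not the authors' implementation).
tree = {  # undirected, acyclic compartment graph: node -> neighbours
    "soma": ["basal", "apical", "axon"],
    "basal": ["soma"],
    "apical": ["soma", "tuft"],
    "tuft": ["apical"],
    "axon": ["soma"],
}
# Axial currents at one time step; axial_current[(a, b)] flows from a to b.
axial_current = {
    ("basal", "soma"): 0.4,
    ("soma", "apical"): 0.7,
    ("apical", "tuft"): 0.2,
    ("axon", "soma"): 0.1,
}

def partition_by_subtree(target):
    """Signed axial current arriving at `target` from each neighbouring subtree.

    Because the graph is acyclic, everything generated deeper in a subtree
    must pass through the single edge linking it to `target`, so that edge's
    signed current summarises the whole subtree's contribution. Choosing a
    different target simply re-roots this bookkeeping, which is why current
    direction can be reevaluated relative to any compartment at each step.
    """
    return {
        ref: axial_current.get((ref, target), 0.0)
             - axial_current.get((target, ref), 0.0)
        for ref in tree[target]
    }

print(partition_by_subtree("soma"))
# {'basal': 0.4, 'apical': -0.7, 'axon': 0.1}: the basal tree and the axon
# feed current into the soma, while net current leaves toward the apical tree
```

Calling partition_by_subtree with a different target (e.g. "tuft") evaluates the same edge currents relative to that compartment, mirroring how the target can be chosen freely.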

    1. Reviewer #2 (Public review):

      Summary:

      This manuscript looks at a wide variety of likely important drivers of arbovirus transmission across municipalities in Brazil. The results are intriguing due to their relevance and breadth, but the approach also brings challenges, which make the results hard to interpret.

      Strengths:

      Important and complex problem, excellent spatiotemporal resolution, collection of important covariates, and holistic analysis.

      Weaknesses:

      There are two key weaknesses. First, it is difficult to understand the actual contributions of each included covariate. The principal fit metric is WAIC, and importance is characterized by rank based on univariate fit. WAIC is a valuable comparison metric, but does not indicate how well the best model (or any other) fits the data. Figures 5B and S2-S4 show what look like good fits, but it also seems possible that most of this fit could be coming from the random effects rather than the covariates. It would be helpful to show the RE-only model as a comparator in these figures and also to consider other metrics that could help show overall fit (e.g., R^2). How much variance is actually being explained by the covariates?

      Relatedly, the mean absolute errors reported are approximately 2-8 across the viruses, which sounds good on the surface. But many of the actual counts are zeros, so it's hard to tell if this is really good. Comparison to the mean and median observed case counts would be helpful.

      Second, some of the results/discussion on specific variables and covariates were confusing. For example, the relationships between relative humidity and temperature vary substantially between pathogens and minimum or maximum temperature values. However, as transmission of three viruses relies on the same mosquito and minimum and maximum temperatures are highly correlated, we would expect these relationships to be very similar. One concern is clarity, and another is that some of the findings may be spurious - potentially related to how much of the variance is accounted for by the random effects alone (see above) and the wide range of covariates assessed (thus increasing the chance of something improving fit).

      Underlying much of this are likely nonlinear relationships. The authors comment on this as a likely reason for some of the specific relationships, but it is not a very strong argument because the variable selection process is completely based on (generalized) linear univariate regressions.

      Lastly, the mischaracterization of arboviral disease is a big challenge, as noted in the discussion. Only a subset of cases in Brazil are laboratory confirmed, but I couldn't find any statement about whether the cases used here were laboratory confirmed or not. I suspect that they are a combination of confirmed and suspect cases. A sensitivity analysis with only confirmed cases would increase confidence in the results.

    1. eLife Assessment

      This important study investigates how the nervous system adapts to changes in body mechanics using a tendon transfer surgery that imposes a mismatch between muscle contraction and mechanical action. Using electromyography (EMG) to track muscle activity in two macaque monkeys, the authors conclude that there is a two-phase recovery process that reflects different underlying strategies. However, neither monkey's data includes a full set of EMG and kinematic measurements, and the two datasets are not sufficiently aligned with each other from a behavioural point of view; as a result, the evidence supporting the conclusions is solid but could be improved.

    2. Reviewer #1 (Public review):

      Summary:

      Many studies have investigated adaptation to altered sensorimotor mappings or to an altered mechanical environment. This paper asks a different but also important question in motor control and neurorehabilitation: how does the brain adapt to changes in the controlled plant? The authors addressed this question by performing a tendon transfer surgery in two monkeys, during which they swapped the tendons flexing and extending the digits. They then monitored changes in task performance, muscle activation and kinematics post-recovery over several months, to assess changes in putative neural strategies.

      Strengths:

      (1) The authors performed complicated tendon transfer experiments to address their question of how the nervous system adapts to changes in the organisation of the neuromusculoskeletal system, and present very interesting data characterising neural (and in one monkey, also behavioural) changes post tendon transfer over several months.

(2) The fact that the authors had to employ two slightly different tasks (one more artificial, the other more naturalistic) in the two monkeys and yet found qualitatively similar changes across them makes the findings more compelling.

      (3) The paper is quite well written, and the analyses are sound, although some analyses could be improved (suggestions below).

      Weaknesses:

(1) I think this is an important paper, but I'm puzzled about a tension in the results. On the one hand, it looks like the behavioural gains post-TT happen rather smoothly over time (Figure 5). On the other, muscle synergy activations change abruptly on specific days (around day ~65 for Monkey A and around day ~45 for Monkey B; e.g., Figure 6). How do the authors reconcile this tension? In other words, how do they think this drastic behavioural transition can arise from what appears to be step-by-step, continuous changes in muscle coordination? Is it "just" subtle changes in movements/posture exploiting the mechanical coupling between wrist and finger movements, combined with subtle changes in synergies, that happen to all kick in at the same time? This feels to me like the core of the paper and should be addressed more directly.

      (2) The muscles synergy analyses, which are an important part of the paper, could be improved. In particular:

      (2a) When measuring the cross-correlation between the activation of synergies, the authors should include error bars, and should also look at the lag between the signals.
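For concreteness, the lag analysis suggested here could look something like the following sketch (the activation traces below are synthetic, and the function name is invented for illustration; this is not the authors' data or code):

```python
import numpy as np

def xcorr_peak(a, b):
    """Peak normalized cross-correlation between two synergy activation
    traces and the lag (in samples) at which it occurs.
    A positive lag means `b` lags `a` by that many samples."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    c = np.correlate(b, a, mode="full")            # lags -(n-1) .. (n-1)
    lags = np.arange(-(len(a) - 1), len(a))
    i = int(np.argmax(c))
    return float(c[i]), int(lags[i])

# Toy check: a transient "burst" of activation, and a copy of it
# delayed by 3 samples, as if one synergy's timing had shifted.
t = np.arange(200)
a = np.exp(-0.5 * ((t - 100) / 10.0) ** 2)
b = np.roll(a, 3)
peak, lag = xcorr_peak(a, b)
```

Reporting the lag alongside the peak correlation (with error bars from, e.g., bootstrapping over trials) would distinguish a pure timing shift from a change in waveform shape.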

      (2b) Figure 7C and related figures, the authors state that the activation of muscle synergies revert to pre-TT patterns toward the end of the experiments. However, there are noticeable differences for both monkeys (at the end of the "task range" for synergy B for monkey A, and around 50 % task range for synergy B for monkey B). The authors should measure this, e.g., by quantifying the per-sample correlation between pre-TT and post-TT activation amplitudes. Same for Figures 8I,J, etc.

      (2c) In Figures 9 and 10, the authors show the cross-correlation of the activation coefficients of different synergies; the authors should also look at the correlation between activation profiles because it provides additional information.

(2d) Figure 11: the authors talk about a key difference in how Synergy B (the finger extensor) evolved between monkeys post-TT. However, to me this figure feels more like a difference in quantity (the time course) than quality, since for both monkeys the aaEMG levels pretty much go back to close to baseline levels, even if there's a statistically significant difference only for Monkey B. What am I missing?

(2e) Lines 408-09 and above: The authors claim that "The development of a compensatory strategy, primarily involving the wrist flexor synergy (Synergy C), appears crucial for enabling the final phase of adaptation", which feels true intuitively and also based on the analysis in Figure 8, but Figure 11 suggests this is only true for Monkey A. How can these statements be reconciled?

(3) Experimental design: at least for the monkey who was trained on the "artificial task" (Monkey A), it would have been good if the authors had also tested him on naturalistic grasping, like the second monkey, to see to what extent the neural changes generalise across behaviours or are task-specific. Do the authors have some data that could be used to assess this, even if less systematically?

(4) Monkey B's behaviour pre-tendon transfer seems more variable than that of Monkey A (e.g., the larger error bars in Figure 5 compared to Monkey A, the fluctuating cross-correlation between FDS pre and EDC post in Figure 6Q); this should be quantified to better ground the results, since this monkey also shows more variability post-TT.

      (5) Minor: Figure 12 is interesting and supports the idea that monkeys may exploit the biomechanical coupling between wrist and fingers as part of their function recovery. It would be interesting to measure whether there is a change in such coupling (tenodesis) over time, e.g., by plotting change in wrist angle vs change in MCP angle as a scatter plot (one dot per trial), and in the same plot show all the days, colour coded by day. Would the relationship remain largely constant or fluctuate slightly early on? I feel this analysis could also help address my point (1) above.
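The tenodesis-coupling analysis suggested here amounts to fitting, for each day, the slope of MCP-angle change against wrist-angle change across trials. A minimal sketch with simulated trials (the coupling value, trial counts, and day labels are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-trial data: change in wrist angle vs change in MCP
# angle, recorded on several post-surgery days. Here the mechanical
# coupling (slope) is fixed at 0.8 to illustrate the analysis itself.
days = [10, 40, 80]
coupling = {}
for day in days:
    d_wrist = rng.normal(0, 5, size=60)                  # deg, per trial
    d_mcp = 0.8 * d_wrist + rng.normal(0, 1, size=60)    # tenodesis coupling
    slope, intercept = np.polyfit(d_wrist, d_mcp, 1)     # least-squares fit
    coupling[day] = slope
```

If the fitted slope stays roughly constant across days (one dot per trial in the suggested scatter plot, colour coded by day), the wrist-finger linkage itself is stable and recovery reflects how it is exploited rather than a change in the coupling.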

    3. Reviewer #2 (Public review):

      Summary:

      This study tackles an important question for both basic science understanding and translational relevance - how does the nervous system learn to control a changing body? Of course, all bodies change slowly over time, including basic parameters like size and weight distribution, but many types of diseases and injuries also alter the body and require neural adaptation to sustain normal control. A dramatic example from the clinic is the use of tendon transfer surgery in patients with near tetraplegia that allows them to use more proximal arm muscles to control the hand. Here, the authors sought to ask what strategies may be used when an animal adapts its motor control in response to tendon transfer. They focus on whether recovered functions leverage fractionated control over each muscle separately or, alternatively, whether there is evidence for modular control in which pre-existing synergies are recruited differently after the surgery. Overall, this work is very promising and advances the use of tendon transfer in animal models as a powerful way to study motor control flexibility, but the incomplete data and difficulty comparing between the two subjects mean that evidence is lacking for some of the conclusions.

      Strengths:

      A major strength of this paper is its motivating idea of using tendon transfer between flexor and extensor muscles in non-human primate wrist control to ask what adaptations are possible, how they evolve over time, and what might be the underlying neural control strategies. This is a creative and ambitious approach. Moreover, these surgeries are likely very challenging to do properly, and the authors rigorously documented the effectiveness of the transfer, particularly for Monkey A.

      The results are promising, and there are two very interesting findings suggested by the data. First, when a single muscle out of a related group is manipulated, there is aberrant muscle activity detected across related muscles that are coordinated with each other and impacted as a group. For example, when the main finger extensor muscle now becomes a flexor, the timing of its activation is changed, and this is accompanied by similar changes in a more minor finger extensor as well as in wrist extensor muscles. This finding was observed in both monkeys and likely reflects a modular adaptive response. Second, there is a biphasic response in the weeks following injury, with an early phase in which the magnitude of an extensor synergy was increased and the timing of flexor and extensor recruitment was altered, followed by a later phase in which the timing and overall magnitude are restored.

      Weaknesses:

The most notable weakness of the study is the incompleteness of the data. Monkey A has excellent quality EMG in all relevant muscles, but no analysis of video data, while Monkey B has some kinematic data from video and moderate-quality EMG, but the signal in the transferred FDS muscle was lost. These issues could be overcome by aligning data between the two monkeys, but the behavior tasks performed by each monkey are different, and so are the resulting muscle synergies detected (e.g., for synergies C and D), and different timepoints were analyzed in each monkey. As a result, it is difficult to make general conclusions from the study, and it awaits further analysis or the addition of another subject.

      A second weakness is the insufficient analysis of the movements themselves, particularly for Monkey A. The main metrics analyzed were the time from task engagement (touch) to action onset and the time spent in an off-target location - neither of these measures can be related directly to muscle activity or the movement. Since the authors have video data for both monkeys, it is surprising that it was not used to extract landmarks for kinematic analysis, or at least hand/endpoint trajectory, and how it is adjusted over time. Adding more behavior data and aligning it with the EMG data would be very helpful for characterizing motor recovery and is needed to support conclusions about underlying neural control strategies for functional improvement.

      Considering specific conclusions, the statement that the monkeys learned to use "tenodesis" over time by increasing activation of a wrist flexor muscle synergy does not seem to be fully supported by the data. Monkey A data includes EMG for two wrist flexors and a clear wrist flexor synergy, but it seems that, when comparing baseline and the final post-surgery timepoints, the main change is decreased activity around grasp after tendon transfer (at 0% of the task range if I understand this correctly) (Figure 8E and Figure S2-H vs R and -I vs S). It is clear that Monkey B increases the flexion of the wrist joint over time from the kinematic data, but the activity pattern in the only recorded wrist flexor (PL) doesn't change much with time (Figure S2-AN) and this monkey does not have a clear wrist flexor synergy (PL is active in the flexor synergy A while synergy C mainly reflects deltoid activity). Given these issues, it is not clear how to align the EMG and kinematic data and interpret these findings.

A more minor point regarding conclusions: statements about poor task performance and high energy expenditure being the costs that drive exploration for a new strategy are speculative and should be presented as such. Although the monkeys did take longer to complete the tasks after the surgery, they were still able to perform them successfully and in less than a second, and no measurements of energy expenditure were taken.

      A small concern is whether the tendon transfer effect may fail over time, either due to scar tissue formation or tendon tearing, and it would be ideal if the integrity of the intervention were re-assessed at the end of the study.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, Philipp et al. investigate how a monkey learns to compensate for a large, chronic biomechanical perturbation - a tendon transfer surgery, swapping the actions of two muscles that flex and extend the fingers. After performing the surgery and confirming that the muscle actions are swapped, the authors follow the monkeys' performance on grasping tasks over several months. There are several main findings:

      (1) There is an initial stage of learning (around 60 days), where monkeys simply swap the activation timing of their flexors and extensors during the grasp task to compensate for the two swapped muscles.

      (2) This is (seemingly paradoxically) followed by a stage where muscle activation timing returns almost to what it was pre-surgery, suggesting that monkeys suddenly swap to a new strategy that is better than the simple swap.

      (3) Muscle synergies seem remarkably stable through the entire learning course, indicating that monkeys do not fractionate their muscle control to swap the activations of only the two transferred muscles.

      (4) Muscle synergy activation shows a similar learning course, where the flexion synergy and extension synergy activations are temporarily swapped in the first learning stage and then revert to pre-surgery timing in the second learning stage.

      (5) The second phase of learning seems to arise from making new, compensatory movements (supported by other muscle synergies) that get around the problem of swapped tendons.

      Strengths:

      This study is quite remarkable in scope, studying two monkeys over a period of months after a difficult tendon-transfer surgery. As the authors point out, this kind of perturbation is an excellent testbed for the kind of long-term learning that one might observe in a patient after stroke or injury, and provides unique benefits over more temporary perturbations like visuomotor transformations and studying learning through development. Moreover, while the two-stage learning course makes sense, I found the details to be genuinely surprising--specifically the fact that: (1) muscle synergies continue to be stable for months after the surgery, despite now being maladaptive; and (2) muscle activation timing reverts to pre-surgery levels by the end of the learning course. These two facts together initially make it seem like the monkey simply ignores the new biomechanics by the end of the learning course, but the authors do well to explain that this is mainly because the monkeys develop a new kind of movement to circumvent the surgical manipulation.

      I found these results fascinating, especially in comparison to some recent work in motor cortex, showing that a monkey may be able to break correlations between the activities of motor cortical neurons, but only after several sessions of coaching and training (Oby et al. PNAS 2019). Even then, it seemed like the monkey was not fully breaking correlations but rather pushing existing correlations harder to succeed at the virtual task (a brain-computer interface with perturbed control).

      Weaknesses:

      I found the analysis to be reasonably well considered and relatively thorough. However, I do have a few suggestions that I think may elevate the work, should the authors choose to pursue them.

      First, I find myself wondering about the physical healing process from the tendon transfer surgery and how it might contribute to the learning. Specifically, how long does it take for the tendons to heal and bear forces? If this itself takes a few months, it would be nice to see some discussion of this.

      Second, I see that there are some changes in the muscle loadings for each synergy over the days, though they are relatively small. The authors mention that the cosine distances are very small for the conserved synergies compared to distances across synergies, but it would be good to get a sense for how variable this measure is within synergy. For example, what is the cosine similarity for a conserved synergy across different pre-surgery days? This might help inform whether the changes post-surgery are within a normal variation or whether they reflect important changes in how the muscles are being used over time.
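The within- versus across-synergy comparison asked for here boils down to cosine distances between muscle-loading vectors. A minimal sketch (the loadings below are invented for illustration, not taken from the paper):

```python
import numpy as np

def cosine_distance(w1, w2):
    """1 - cosine similarity between two muscle-loading vectors
    (one weight per recorded muscle)."""
    return 1.0 - np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2))

# Hypothetical loadings over 8 muscles: the same flexor synergy
# extracted on two different pre-surgery days, versus an extensor synergy.
flexor_day1 = np.array([0.90, 0.80, 0.10, 0.00, 0.10, 0.00, 0.20, 0.10])
flexor_day2 = np.array([0.85, 0.82, 0.12, 0.05, 0.08, 0.00, 0.15, 0.10])
extensor    = np.array([0.10, 0.00, 0.90, 0.85, 0.10, 0.10, 0.00, 0.20])

within = cosine_distance(flexor_day1, flexor_day2)  # small: same synergy
across = cosine_distance(flexor_day1, extensor)     # large: different synergy
```

Computing `within` over all pairs of pre-surgery days would give exactly the null distribution proposed above, against which post-surgery distances could be judged.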

      Last, and maybe most difficult (and possibly out of scope for this work): I would have ideally liked to see some theoretical modeling of the biomechanics so I could more easily understand what the tendon transfer did or how specific synergies affect hand kinematics before and after the surgery. Especially given that the synergies remained consistent, such an analysis could be highly instructive for a reader or to suggest future perturbations to further probe the effects of tendon transfer on long-term learning.

  3. social-media-ethics-automation.github.io social-media-ethics-automation.github.io
    1. Spamming. December 2023. Page Version ID: 1187995774. URL: https://en.wikipedia.org/w/index.php?title=Spamming&oldid=1187995774 (visited on 2023-12-08).

Spamming is when a mass of people is sent multiple unsolicited messages from someone they don't know. People do fall into the trap of these messages, and it's a problem when they are potentially hacked, etc. An example of this I can think of is when my phone got multiple messages saying my SSN had been compromised, but I knew it was spam because it was sent by a Gmail user on iMessage.

    1. Have you ever faced consequences for breaking social media rules (or for being accused of it)?

      I have been accused of breaking a social media rule, but it was rather small. I posted a photo with a song, which was later removed from social media due to the artist being controversial in the news. This wasn't specifically my fault, but they did have to restrict my post.

    1. As colonial economies grew, they quickly became an important market for British manufacturing exports. Colonists with disposable income and access to British markets attempted to mimic British culture. By the middle of the eighteenth century, middling-class colonists could also afford items previously thought of as luxuries like British fashions, dining wares, and more.

why: Economic growth in the American colonies expanded opportunities for trade and wealth, especially among the middling classes, including merchants and farmers. As incomes rose, these colonists sought to demonstrate their social status by purchasing luxury British goods such as fine clothing, silverware, and fashionable furnishings. This consumer behavior was not just about material comfort but reflected a deeper cultural affirmation and identification with British customs and values. At the same time, many colonists saw themselves as British subjects entitled to the same political rights and protections as those living in Britain. By adopting British styles and goods, they reinforced their sense of belonging within the British Empire. However, this connection created tensions when British policies, such as taxation without representation, threatened those rights.

  4. drive.google.com drive.google.com
    1. insensitive to the problems of the masses

Problems being ignored by the masses of people who are unaffected for now, but could potentially be affected later, seems like one of his main topics.

    2. As the weeks and months went by, we realized that we were thevictims of a broken promise.

They waited weeks and months after they made the agreement and nothing but the bare minimum happened; it just shows that all they wanted to do was small things to quiet them, instead of actually making a change.

    1. for - article - LinkedIn - Has Language trapped humanity? - pre linguistic reality

Summary - a very interesting exploration of our pre-linguistic life.
- We modern humans spend most of our lives in the symbolosphere.
- It is so ubiquitous that we don't even know it's relative and not absolute, like fish that don't know there's such a thing as water - until they are pulled out of it.
- Feral children are the ones who have been pulled out of the ocean of language, but they suffer a fate that none of us, from our conditioned language perspective, would want to suffer.
- So how do we, who are deeply conditioned into language, look at our situation of being so deeply conditioned? Is there life after (and before) language?

In each program, we observed teachers design lessons to engage students in a joint productive activity with instructional conversations as the [...] strategy. They structured both small- and large-group [...] interaction in which students combined and constructed their knowledge. Teachers scaffolded students with questions and supports that [...] their current level of competence to demonstrate more advanced [...]. Each of the programs supports its teacher candidates to [...] build on the social nature of learning in their courses, in the structure and cultures of the programs themselves.

As the vignettes in the previous chapters of this book make clear, learning in productive communities intersects with the other dimensions of deeper learning. It is linked to how learning becomes developmentally [...] and contextualized and how students apply and transfer what they know to a variety of situations in and outside of school. And as we will see in chapter 9, it is very much a part of how learning becomes equitable and oriented toward social justice. In the remainder of this chapter we provide examples of teacher candidates facilitating learning as a social process in their clinical work at school sites and then describe the strategies the teacher preparation programs use to help the candidates learn to teach that way.

DEEPER LEARNING THROUGH JOINT PRODUCTIVE ACTIVITY

Sara, a teacher candidate at CU Denver, and her mentor Kim, a [...] at Laredo Elementary in Aurora, Colorado, use these standards as they design lessons for the classroom of fifth graders they teach together. Laredo, a CU Denver professional development school, enrolls a diverse group of students, 61 percent of whom are Hispanic, 19 percent Black, [...] percent white, 4 percent Asian, and roughly 1 percent Hawaiian/Pacific Islander or Native American; 4 percent of students identify as two or more races. [...] half are English language learners; 11 percent [...] special learning [...] qualify for free or reduced-price meals.

In the lesson highlighted here, Sara and Kim engage students in contextualized learning through social interaction—in this case drawing on personal experiences with one another to generate and use sensory details to enrich their writing.

Learning in Communities of Practice

A crisp wind and intense sun beat down on the carefully manicured lawn that lines the walkway up to Laredo Elementary. Below undulating American and Colorado flags, bold blue letters above the entrance exclaim: "Laredo Lions." At 9:15 a.m. on a Wednesday morning, in a portable classroom at the edge of a grassy courtyard, Kim's class is in full swing. Nineteen fifth-grade students—all of them Hispanic or African American—are sitting on a carpet at the front of the classroom with an easy view of the screen that displays student work projected from a nearby document camera. Kim is standing and enthusiastically walking students through samples of student work that was turned in the day before. Sara, sitting nearby, is very much a part of the conversation.

The lesson is focused on how to infuse writing with sensory details so that readers can see/hear/feel/taste/smell the events that the student-authors are describing. The assignment asks students to pick any memorable moment in their lives that evoked strong emotion from them. One girl writes about breaking her leg during a soccer match; another writes about her first day in an American school after immigrating from Ethiopia; a third writes about being with her sister during her miscarriage.

Kim: Luis has come so far in his writing—everyone give him a hand. [Students enthusiastically clap.] Yesterday Luis shared with us about going to the Lantern Festival but, Luis, instead of just telling us you went, I want you to be able to show everyone. What were the lanterns doing? [Students start to chime in.] Hold on, give him a second. [pause]

Luis: Moving, crackling, flickering.

Kim: Which one do you like best? [pause]

Luis: [shrugs shoulders]

Kim: Okay, try this—close your eyes. Can you imagine it?

Luis: Yes! The lanterns were flickering!

Kim: Great—that word is more specific and now we can see it like you saw it!

This process continues for two more student-authors whose writing needs a bit more specificity. Kim ends her mini-lesson with: "We're going to continue to get better, and when I read what you work on today I'll expect to see this level of sensory detail in all of your stories. I want to be able to really visualize what you're describing—I want this from you today, tomorrow, and in ten years!" At this point Sara launches into the next portion of the lesson wherein small groups of students work together to describe different sensory objects without looking at them first.

Sara: You may notice that there are brown bags on each of your tables. Inside of these bags is a mystery surprise. You know how I love my mysteries! [Students laugh, and some say "yes" and "she does like mysteries!"] The [...] (continues)

      Planning with others and using instructional conversations as the guiding strategy multiplies impact.

their principles corrupted at their very setting out, and as they generally go a good many together, they inflame one another's expectations to such a degree, in the course of the voyage, that they fix upon a period for their return before their arrival

      A good point. There is a culture of greed, but that does not exonerate him. He, as the most powerful there, is chiefly responsible.

    Annotators

    1. While older people were more likely to experience loneliness, pain, and health problems, the relationship between loneliness and pain was consistent across the lifespan.

      It makes sense how older people have it more difficult, because as we age, the more health problems arise, and it's even doubled when we feel lonely. But, the interesting part is that it's not mainly focused on the older generations like we think, but it's all age groups.

    2. The findings show that individuals who reported feeling lonely were more than twice as likely to experience physical pain compared to those who did not.

      Was age a factor? As you grow older do you experience more physical pain than a younger, but also lonely person?

This is something for the category of 'most interesting things found inside of a typewriter'. While inspecting and preparing to test a new-to-me SC 5LE, I opened the ribbon cover and saw this. I wish I had taken a better picture of what it looked like after I got the shell off of the machine, but I was pretty intent on getting it outside and making sure it wasn't inhabited. I ID'd it later as a mud wasp/mud dauber nest.

      via Patty Perkins at https://www.facebook.com/groups/TypewriterCollectors/posts/10162788538859678/

mud wasp/mud dauber nest in a typewriter

    1. Reviewer #1 (Public review):

      Summary:

      The authors propose a new technique which they name "Multi-gradient Permutation Survival Analysis (MEMORY)" that they use to identify "Genes Steadily Associated with Prognosis (GEARs)" using RNA-seq data from the TCGA database. The contribution of this method is one of the key stated aims of the paper. The majority of the paper focuses on various downstream analyses that make use of the specific GEARs identified by MEMORY to derive biological insights, with a particular focus on lung adenocarcinoma (LUAD) and breast invasive carcinoma (BRCA) which are stated to be representative of other cancers and are observed to have enriched mitosis and immune signatures, respectively. Through the lens of these cancers, these signatures are the focus of significant investigation in the paper.

      Strengths:

      The approach for MEMORY is well-defined and clearly presented, albeit briefly. This affords statisticians and bioinformaticians the ability to effectively scrutinize the proposed methodology and may lead to further advancements in this field. The scientific aspects of the paper (e.g., the results based on the use of MEMORY and the downstream bioinformatics workflows) are conveyed effectively and in a way that is digestible to an individual that is not deeply steeped in the cancer biology field.

      Weaknesses:

      Comparatively little of the paper is devoted to the justification of MEMORY (i.e., the authors' method) for identification of genes that are important broadly for the understanding of cancer. The authors' approach is explained in the methods section of the paper, but no comparison or reference is made to any other methods that have been developed for similar purposes, and no results are shown to illustrate the robustness of the proposed method (e.g., is it sensitive to subtle changes in how it is implemented).

      For example, in the first part of the MEMORY algorithm, gene expression values are dichotomized at the sample median, and a log-rank test is performed. This would seemingly result in an unnecessary loss of information for detecting an association between gene expression and survival. Moreover, while dichotomizing gene expressions at the median is optimal from an information theory perspective (i.e., it creates equally sized groups), there is no reason to believe that median-dichotomization is correct vis-à-vis the relationship between gene expression and survival. If a gene really matters and expression only differentiates survival more towards the tail of the empirical gene expression distribution, median-dichotomization could dramatically lower power to detect group-wise differences. Notwithstanding this point, the reviewer acknowledges that dichotomization offers a straightforward approach to model gene expression and is widely used. This approach is nonetheless an example of a limitation of the current version of MEMORY that could be addressed to improve the methodology.

If I understand correctly, for each cancer the authors propose to search for the smallest subsample size (i.e., the smallest value of k_{j}) where there is at least one gene with a survival analysis p-value <0.05 for each of the 1000 sampled datasets. Then, any gene with a p-value <0.05 in 80% of the 1000 sampled datasets would be called a GEAR for that cancer. The 80% value here is arbitrary, but that is a minor point. I acknowledge that something must be chosen.

Presumably the gene with the largest effect for the cancer will define the value of k_{j} and, if the effect is large, this may result in other genes with smaller effects not being defined as GEARs for that cancer by virtue of the 80% threshold. Thus, whether a gene is a GEAR depends on the strength of association for other genes in addition to its own strength of association. One could imagine that a gene with a small-to-moderate effect consistently across many cancers may not show up as a GEAR in any of them (if there are [potentially different] genes with more substantive effects for those cancers). Is this desirable?
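To make the selection rule under discussion concrete, it could be sketched roughly as follows (an illustration with synthetic data, not the authors' implementation; `logrank_p` here is a simple normal-approximation log-rank test, and the subsample and gene counts are reduced for speed):

```python
import numpy as np
from math import erfc, sqrt

def logrank_p(time, event, group):
    """Two-sided log-rank p-value (normal approximation) comparing
    survival between group == False and group == True."""
    obs_minus_exp, var = 0.0, 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n, n1 = at_risk.sum(), (at_risk & group).sum()
        d = ((time == t) & event).sum()
        d1 = ((time == t) & event & group).sum()
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return 1.0 if var == 0 else erfc(abs(obs_minus_exp) / sqrt(2 * var))

def gear_candidates(expr, time, event, k, n_samples=200, alpha=0.05, freq=0.8):
    """At one subsample size k: for each gene, the fraction of random
    subsamples in which median-dichotomized expression separates
    survival at p < alpha; genes clearing `freq` are GEAR candidates."""
    rng = np.random.default_rng(1)
    hits = np.zeros(expr.shape[0])
    for _ in range(n_samples):
        idx = rng.choice(expr.shape[1], size=k, replace=False)
        for g in range(expr.shape[0]):
            x = expr[g, idx]
            high = x > np.median(x)            # median dichotomization
            hits[g] += logrank_p(time[idx], event[idx], high) < alpha
    return hits / n_samples >= freq

# Synthetic example: gene 0 has a real effect (high expression shortens
# mean survival fourfold), gene 1 is pure noise.
rng = np.random.default_rng(0)
n_pat = 120
expr = rng.normal(size=(2, n_pat))
time = rng.exponential(np.where(expr[0] > 0, 1.0, 4.0))
event = np.ones(n_pat, dtype=bool)             # no censoring, for simplicity
candidates = gear_candidates(expr, time, event, k=60)
```

In this toy setting, only gene 0 should clear the 80% frequency threshold; note that, as the review points out, nothing in this rule controls the family-wise error or false discovery rate across genes.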

      The term "steadily associated" implies that a signal holds up across subsample gradients. Effectively this makes the subsampling a type of indirect adjustment to ensure the evidence of association is strong enough. How well this procedure performs in repeated use (i.e., as a statistical procedure) is not clear.

Assuredly subsampling sets the bar higher than requiring a nominal p-value beneath the 0.05 threshold based on analysis of the full data set. The authors note that MEMORY has several methodological limitations, "chief among them is the need for rigorous, large-scale multiple-testing adjustment before any GEAR list can be considered clinically actionable." The reviewer agrees and would add that it may be difficult to address this limitation within the authors' current framework. Moreover, should the authors' method be used before such corrections are available, given their statement? Perhaps clarification of what it means to be clinically actionable could help here. If a researcher uses MEMORY to screen for GEARs based on the current methodology, what do the authors recommend be done to select a subset of GEARs worthy of additional research/investment?

    2. Reviewer #4 (Public review):

      Thank you to the authors for their detailed responses and changes in relation to my questions. They have addressed all my concerns around methodological and inference clarity. I would still recommend against the use of feature/pathway selection techniques where there is no way of applying formal error control. I am pleased to read, however, that the authors are planning to develop this in future work. My edited review reflects these changes:

      The authors apply what I gather is a novel methodology titled "Multi-gradient Permutation Survival Analysis" to identify genes that are robustly associated with prognosis ("GEARs") using tumour expression data from 15 cancer types available in the TCGA. The resulting lists of GEARs are then interrogated for biological insights using a range of techniques including connectivity and gene enrichment analysis.

I reviewed this paper primarily from a statistical perspective. Evidently an impressive amount of work has been conducted and concisely summarised, and great effort has been undertaken to add layers of insight to the findings. I am no stranger to what an undertaking this would have been. My primary concern, however, is that the novel statistical procedure proposed and applied to identify the gene lists offers, as far as I can tell, no statistical error control or quantification. Consequently, we have no sense of what proportion of the highlighted GEAR genes and networks are likely to be just noise.

      Major comments:

      The main methodology used to identify the GEAR genes, "Multi-gradient Permutation Survival Analysis", does not formally account for multiple testing and offers no formal error control, meaning we are left without knowing what the family-wise (type I) error rate is among the GEAR lists, nor the false discovery rate. I appreciate the emphasis on reproducibility, but I would generally recommend against the use of any feature selection methodology which does not provide error quantification because otherwise we do not know if we are encouraging our colleagues and/or readers to put resources into lists of genes that contain more noise than not. I am glad though and appreciative that the authors intend to develop this in future work.

      The authors make a good point that, despite lack of validation in an external independent dataset, it is still compelling work given the functional characterisation and literature validation. I am pleased though that the authors agree validation in an independent dataset is an important next step, and plan to do so in future work.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The authors propose a new technique which they name "Multi-gradient Permutation Survival Analysis (MEMORY)" that they use to identify "Genes Steadily Associated with Prognosis (GEARs)" using RNA-seq data from the TCGA database. The contribution of this method is one of the key stated aims of the paper. The vast majority of the paper focuses on various downstream analyses that make use of the specific GEARs identified by MEMORY to derive biological insights, with a particular focus on lung adenocarcinoma (LUAD) and breast invasive carcinoma (BRCA) which are stated to be representative of other cancers and are observed to have enriched mitosis and immune signatures, respectively. Through the lens of these cancers, these signatures are the focus of significant investigation in the paper.

      Strengths:

      The approach for MEMORY is well-defined and clearly presented, albeit briefly. This affords statisticians and bioinformaticians the ability to effectively scrutinize the proposed methodology and may lead to further advancements in this field.

      The scientific aspects of the paper (e.g., the results based on the use of MEMORY and the downstream bioinformatics workflows) are conveyed effectively and in a way that is digestible to an individual who is not deeply steeped in the cancer biology field.

      Weaknesses:

      I was surprised that comparatively little of the paper is devoted to the justification of MEMORY (i.e., the authors' method) for the identification of genes that are important broadly for the understanding of cancer. The authors' approach is explained in the methods section of the paper, but no rationale is given for why certain aspects of the method are defined as they are. Moreover, no comparison or reference is made to any other methods that have been developed for similar purposes and no results are shown to illustrate the robustness of the proposed method (e.g., is it sensitive to subtle changes in how it is implemented).

      For example, in the first part of the MEMORY algorithm, gene expression values are dichotomized at the sample median and a log-rank test is performed. This would seemingly result in an unnecessary loss of information for detecting an association between gene expression and survival. Moreover, while dichotomizing at the median is optimal from an information theory perspective (i.e., it creates equally sized groups), there is no reason to believe that median-dichotomization is correct vis-à-vis the relationship between gene expression and survival. If a gene really matters and expression only differentiates survival more towards the tail of the empirical gene expression distribution, median-dichotomization could dramatically lower the power to detect group-wise differences.

      Thanks for these valuable comments! We understand the reviewer’s concern regarding the potential loss of information caused by median-based dichotomization. In this study, we adopted the median as the cut-off value to stratify gene expression levels primarily for the purpose of data balancing and computational simplicity. This approach ensures approximately equal group sizes, which is particularly beneficial in the context of limited sample sizes and repeated sampling. While we acknowledge that this method may discard certain expression nuances, it remains a widely used strategy in survival analysis. To further evaluate and potentially enhance sensitivity, alternative strategies such as percentile-based cutoffs or survival models using continuous expression values (e.g., Cox regression) may be explored in future optimization of the MEMORY pipeline. Nevertheless, we believe that this dichotomization approach offers a straightforward and effective solution for the initial screening of survival-associated genes. We have now included this explanation in the revised manuscript (Lines 391–393).
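      As a sketch of the screening step discussed in this exchange: the illustrative Python code below dichotomizes samples at the median expression value and computes a two-group log-rank p-value from first principles. This is our own minimal re-implementation for exposition, not the authors' R pipeline; all function names and data are hypothetical.

```python
import math
from statistics import median

def median_dichotomize(expression):
    """Label each sample 1 (high) if its expression exceeds the cohort median, else 0 (low)."""
    cut = median(expression)
    return [1 if x > cut else 0 for x in expression]

def logrank_p(times, events, groups):
    """Two-sided log-rank p-value (chi-square, 1 df) comparing group 0 vs group 1.

    times: follow-up times; events: 1 = death observed, 0 = censored;
    groups: 0/1 labels, e.g. from median_dichotomize.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_a = exp_a = var = 0.0
    for t in event_times:
        at_risk = [(ti, ei, g) for ti, ei, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)
        n_a = sum(1 for _, _, g in at_risk if g == 0)          # at risk in group 0
        d = sum(1 for ti, ei, _ in at_risk if ti == t and ei)  # deaths at time t
        d_a = sum(1 for ti, ei, g in at_risk if ti == t and ei and g == 0)
        obs_a += d_a
        exp_a += d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    stat = (obs_a - exp_a) ** 2 / var if var > 0 else 0.0
    return math.erfc(math.sqrt(stat / 2))  # survival function of chi-square with 1 df
```

      With this setup, five early deaths in one group against five late deaths in the other give a small p-value, while two identically distributed groups give p = 1. An association concentrated in the expression tail, as the reviewer describes, would indeed be diluted by the median split, which is the trade-off acknowledged in the response.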

      Specifically, the authors' rationale for translating the Significant Probability Matrix into a set of GEARs warrants some discussion in the paper. If I understand correctly, for each cancer the authors propose to search for the smallest sample size (i.e., the smallest value of k_{j}) where there is at least one gene with a survival analysis p-value <0.05 for each of the 1000 sampled datasets. I base my understanding on the statement "We defined the sampling size k_{j} reached saturation when the max value of column j was equal to 1 in a significant-probability matrix. The least value of k_{j} was selected". Then, any gene with a p-value <0.05 in 80% of the 1000 sampled datasets would be called a GEAR for that cancer. The 80% value here seems arbitrary but that is a minor point. I acknowledge that something must be chosen. More importantly, do the authors believe this logic will work effectively in general? Presumably, the gene with the largest effect for a cancer will define the value of k_{j}, and, if the effect is large, this may result in other genes with smaller effects not being selected for that cancer by virtue of the 80% threshold. One could imagine that a gene that has a small-to-moderate effect consistently across many cancers may not show up as a GEAR broadly if there are genes with more substantive effects for most of the cancers investigated. I am taking the term "Steadily Associated" very literally here as I've constructed a hypothetical where the association is consistent across cancers but not extremely strong. If by "Steadily Associated" the authors really mean "Relatively Large Association", my argument would fall apart but then the definition of a GEAR would perhaps be suboptimal. In this latter case, the proposed approach seems like an indirect way to ensure there is a reasonable effect size for a gene's expression on survival.

      Thank you for the comment, and we apologize for the confusion! 𝐴<sub>𝑖𝑗</sub> refers to the value of gene i under gradient j in the significant-probability matrix, primarily used to quantify the statistical probability of association with patient survival for ranking purposes. We believe that GEARs are among the top-ranked genes, but there is no established metric to define the optimal threshold. An 80% threshold has previously been employed as an empirical standard in studies related to survival estimates [1]. In addition, we acknowledge that the determination of the saturation point 𝑘<sub>𝑗</sub> is influenced by the earliest point at which any gene achieves consistent significance across 1000 permutations. We recognize that this may lead to the underrepresentation of genes with moderate but consistent effects, especially in the presence of highly significant genes that dominate the statistical landscape. We therefore empirically used 𝐴<sub>𝑖𝑗</sub> > 0.8 as the threshold to distinguish between GEARs and non-GEARs. Of course, this parameter variation may indeed result in the loss of some GEARs or the inclusion of non-GEARs. We also agree that future studies could investigate alternative metrics and more refined thresholds to improve the application of GEARs.

      Regarding the term ‘Steadily Associated’, we define GEARs based on statistical robustness across subsampled survival analyses within individual cancer types, rather than cross-cancer consistency or pan-cancer moderate effects. Therefore, our operational definition of “steadiness” emphasizes within-cancer reproducibility across sampling gradients, which does not necessarily exclude high-effect-size genes. Nonetheless, we agree that future extensions of MEMORY could incorporate cross-cancer consistency metrics to capture genes with smaller but reproducible pan-cancer effects.

      The paper contains numerous post-hoc hypothesis tests, statements regarding detected associations and correlations, and statements regarding statistically significant findings based on analyses that would naturally only be conducted in light of positive results from analyses upstream in the overall workflow. Due to the number of statistical tests performed and the fact that the tests are sometimes performed using data-driven subgroups (e.g., the mitosis subgroups), it is highly likely that some of the findings in the work will not be replicable. Of course, this is exploratory science, and it is to be expected that some findings won't replicate (the authors even call for further research into key findings). Nonetheless, I would encourage the authors to focus on the quantification of evidence regarding associations or claims (i.e., presenting effect estimates and uncertainty intervals), but to avoid the use of the term statistical significance owing to there being no clear plan to control type I error rates in any systematic way across the diverse analyses that were performed.

      Thank you for the comment! We agree that rigorous control of type-I error is essential once a definitive list of prognostic genes is declared. The current implementation of MEMORY, however, is deliberately positioned as an exploratory screening tool: each gene is evaluated across 10 sampling gradients and 1,000 resamples per gradient, and the only quantity carried forward is its reproducibility probability (𝐴<sub>𝑖𝑗</sub>).

      Because these probabilities are derived from aggregate “votes” rather than single-pass P-values, the influence of any one unadjusted test is inherently diluted. In other words, whether or not a per-iteration BH adjustment is applied does not materially affect the ranking of genes by reproducibility, which is the key output at this stage. However, we also recognize that a clinically actionable GEARs catalogue will require extensive, large-scale multiple-testing adjustments. Accordingly, future versions of MEMORY will embed a dedicated false-positive control framework tailored to the final GEARs list before any translational application. We have added this point in the ‘Discussion’ in the revised manuscript (Lines 350-359).
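      The vote-aggregation idea in this reply can be sketched as follows. This is a hedged Python illustration in which `pvalue_fn` stands in for a per-gene survival test; the function names and scaffolding are ours, not part of MEMORY's published interface.

```python
import random

def reproducibility_probability(pvalue_fn, sample_ids, k, n_resamples=1000,
                                alpha=0.05, seed=0):
    """Fraction of size-k subsamples in which the gene's survival test is significant.

    This corresponds to the quantity called A_ij above: 1,000 resampled
    "votes" are tallied, so no single unadjusted p-value decides the outcome.
    """
    rng = random.Random(seed)
    hits = sum(pvalue_fn(rng.sample(sample_ids, k)) < alpha
               for _ in range(n_resamples))
    return hits / n_resamples

def is_gear(prob, threshold=0.8):
    """A gene is retained as a GEAR when its reproducibility probability exceeds 0.8."""
    return prob > threshold
```

      A gene that is significant in 950 of 1,000 resamples keeps a reproducibility probability of 0.95 regardless of whether any single resample's p-value would survive a per-iteration correction, which is the dilution argument made in the response.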

      A prespecified analysis plan with hypotheses to be tested (to the extent this was already produced) and a document that defines the complete scope of the scientific endeavor (beyond that which is included in the paper) would strengthen the contribution by providing further context on the totality of the substantial work that has been done. For example, the focus on LUAD and BRCA due to their representativeness could be supplemented by additional information on other cancers that may have been investigated similarly but where results were not presented due to lack of space.

      We thank the reviewer for requesting greater clarity on the analytic workflow. The MEMORY pipeline was fully specified before any results were examined and is described in ‘Methods’ (Lines 386–407). By contrast, the pathway-enrichment and downstream network/mutation analyses were deliberately exploratory: their exact content necessarily depended on which functional categories emerged from the unbiased GEAR screen.

      Our screen revealed a pronounced enrichment of mitotic signatures in LUAD and immune signatures in BRCA.

      We then chose these two cancer types for deeper “case-study” analysis because they contained the largest sample sizes among all cancers showing mitotic- or immune-dominated GEAR profiles, and provided the greatest statistical power for follow-up investigations. We have added this explanation to the revised manuscript (Lines 163, 219-220).

      Reviewer #2 (Public review):

      Summary:

      The authors are trying to come up with a list of genes (GEAR genes) that are consistently associated with cancer patient survival based on TCGA database. A method named "Multi-gradient Permutation Survival Analysis" was created based on bootstrapping and gradually increasing the sample size of the analysis. Only the genes with consistent performance in this analysis process are chosen as potential candidates for further analyses.

      Strengths:

      The authors describe in detail their proposed method and the list of the chosen genes from the analysis. The scientific meaning and potential values of their findings are discussed in the context of published results in this field.

      Weaknesses:

      Some steps of the proposed method (especially the definition of survival analysis similarity (SAS)) need further clarification or details, since it would be difficult for anyone trying to reproduce the results. In addition, the multiplicity (a large number of p-values are generated) needs to be discussed and/or the potential inflation of false findings needs to be part of the manuscript.

      Thank you for the reviewer’s insightful comments. Accordingly, in the revised manuscript, we have provided a more detailed explanation of the definition and calculation of Survival-Analysis Similarity (SAS) to ensure methodological clarity and reproducibility (Lines 411-428), and the full code is now publicly available on GitHub (https://github.com/XinleiCai/MEMORY). We have also expanded the ‘Discussion’ to clarify our position on false-positive control: future releases of MEMORY will incorporate a dedicated framework to control false discoveries in the final GEARs catalogue, which itself will be subjected to rigorous, large-scale multiple-testing adjustment.

      If the authors can improve the clarity of the proposed method and there is no major mistake there, the proposed approach can be applied to other diseases (assuming TCGA type of data is available for them) to identify potential gene lists, based on which drug screening can be performed to identify potential target for development.

      Thank you for the suggestion. All source code has now been made publicly available on GitHub for reference and reuse. We agree that the GEAR lists produced by MEMORY hold considerable promise for drug-screening and target-validation efforts, and the framework could be applied to any disease with TCGA-type data. Of course, we also note that the current GEAR catalogue should first undergo rigorous, large-scale multiple-testing correction to further improve its precision before broader deployment.

      Reviewer #3 (Public review):

      Summary:

      The authors describe a valuable method to find gene sets that may correlate with a patient's survival. This method employs iterative tests of significance across randomised samples with a range of proportions of the original dataset. Those genes that show significance across a range of samples are chosen. Based on these gene sets, hub genes are determined from similarity scores.

      Strengths:

      MEMORY allows them to assess the correlation between a gene and patient prognosis using any available transcriptomic dataset. They present several follow-on analyses and compare the gene sets found to previous studies.

      Weaknesses:

      Unfortunately, the authors have not included sufficient details for others to reproduce this work or use the MEMORY algorithm to find future gene sets, nor to take the gene findings presented forward to be validated or used for future hypotheses.

      Thank you for the reviewer’s comments! We apologize for the inconvenience and the lack of details.

      Following the reviewer’s valuable suggestion, we have now made all source code and relevant scripts publicly available on GitHub to ensure full reproducibility and facilitate future use of the MEMORY algorithm for gene discovery and hypothesis generation.

      Reviewer #4 (Public review):

      The authors apply what I gather is a novel methodology titled "Multi-gradient Permutation Survival Analysis" to identify genes that are robustly associated with prognosis ("GEARs") using tumour expression data from 15 cancer types available in the TCGA. The resulting lists of GEARs are then interrogated for biological insights using a range of techniques including connectivity and gene enrichment analysis.

      I reviewed this paper primarily from a statistical perspective. Evidently, an impressive amount of work has been conducted, and concisely summarised, and great effort has been undertaken to add layers of insight to the findings. I am no stranger to what an undertaking this would have been. My primary concern, however, is that the novel statistical procedure proposed, and applied to identify the gene lists, as far as I can tell offers no statistical error control or quantification. Consequently, we have no sense of what proportion of the highlighted GEAR genes and networks are likely to just be noise.

      Major comments:

      (1) The main methodology used to identify the GEAR genes, "Multi-gradient Permutation Survival Analysis", does not formally account for multiple testing and offers no formal error control, meaning we are left with no understanding of what the family-wise (type I) error rate is among the GEAR lists, nor the false discovery rate. I would generally recommend against the use of any feature selection methodology that does not provide some form of error quantification and/or control because otherwise we do not know if we are encouraging our colleagues and/or readers to put resources into lists of genes that contain more noise than not. There are numerous statistical techniques available these days that offer error control, including for lists of p-values from arbitrary sets of tests (see expansion on this and some review references below).

      Thank you for your thoughtful and important comment! We fully agree that controlling type I error is critical when identifying gene sets for downstream interpretation or validation. As an exploratory study, our primary aim was to define and screen for GEARs by using the MEMORY framework; however, we acknowledge that the current implementation of MEMORY does not include a formal procedure for error control. Given that MEMORY relies on repeated sampling and counts the frequency of statistically significant p-values, applying standard p-value–based multiple-testing corrections at the individual test level would not meaningfully reduce the false-positive rate in this framework.

      We believe that error control should instead be applied at the level of the final GEAR catalogue. However, we also recognize that conventional correction methods are not directly applicable. In future versions of MEMORY, we plan to incorporate a dedicated and statistically appropriate false-positive control module tailored specifically to the aggregated outputs of the pipeline. We have clarified this point explicitly in the revised manuscript. (Lines 350-359)

      (2) Similarly, no formal significance measure was used to determine which of the strongest "SAS" connections to include as edges in the "Core Survival Network".

      We agree that the edges in the Core Survival Network (CSN) were selected based on the top-ranked SAS values rather than formal statistical thresholds. This was a deliberate design choice, as the CSN was intended as a heuristic similarity network to prioritize genes for downstream molecular classification and biological exploration, not for formal inference. To address potential concerns, we have clarified this intent in the revised manuscript, and we now explicitly state that the network construction was based on empirical ranking rather than statistical significance (Lines 422-425).

      (3) There is, as far as I could tell, no validation of any identified gene lists using an independent dataset external to the presently analysed TCGA data.

      Thank you for the comment. We acknowledge that no independent external dataset was used in the present study to validate the GEARs lists. However, the primary aim of this work was to systematically identify and characterize genes with robust prognostic associations across cancer types using the MEMORY framework. To assess the biological relevance of the resulting GEARs, we conducted extensive downstream analyses including functional enrichment, mutation profiling, immune infiltration comparison, and drug-response correlation. These analyses were performed across multiple cancer types and further supported by a wide range of published literature.

      We believe that this combination of functional characterization and literature validation provides strong initial support for the robustness and relevance of the GEARs lists. Nonetheless, we agree that validation in independent datasets is an important next step, and we plan to carry this out in future work to further strengthen the clinical application of MEMORY.

      (4) There are quite a few places in the methods section where descriptions were not clear (e.g. elements of matrices referred to without defining what the columns and rows are), and I think it would be quite challenging to re-produce some aspects of the procedures as currently described (more detailed notes below).

      We apologize for the confusion. In the revised manuscript, we have provided a clearer and more detailed description of the computational workflow of MEMORY to improve clarity and reproducibility.

      (5) There is a general lack of statistical inference offered. For example, throughout the gene enrichment section of the results, I never saw it stated whether the pathways highlighted are enriched to a significant degree or not.

      We apologize for not clearly stating this information in the original manuscript. In the revised manuscript, we have updated the figure legends to explicitly report the statistical significance of the enriched pathways (Lines 870, 877, 879-880).

      Reviewer #1 (Recommendations for the authors):

      Overall, the paper reads well but there are numerous small grammatical errors that at times cost me non-trivial amounts of time to understand the authors' key messages.

      We apologize for the grammatical errors that hindered clarity. In response, we have thoroughly revised the manuscript for grammar, spelling, and overall language quality.

      Reviewer #2 (Recommendations for the authors):

      Major comments:

      (1) Line 427: survival analysis similarity (SAS) definition. Any reference on this definition and why it is defined this way? Can the SAS value be negative? Based on line 429 definition, if A and B are exactly the same, SAS ~ 1; completely opposite, SAS =0; otherwise, SAS could be any value, positive or negative. So it is hard to tell what SAS is measuring. It is important to make sure SAS can measure the similarity in a systematic and consistent way since it is used as input in the following network analysis.

      We apologize for the confusion caused by the ambiguity in the original SAS formula. The SAS metric was inspired by the Jaccard index, but we modified the denominator to increase contrast between gene pairs. Specifically, the numerator counts the number of permutations in which both genes are simultaneously significant (i.e., both equal to 1), while the denominator is the sum of the total number of significant events for each gene minus twice the shared significant count. An additional +1 term was included in the denominator to avoid division by zero. This formulation ensures that SAS is always non-negative and bounded between 0 and 1, with higher values indicating greater similarity. We have clarified this definition and updated the formula in the revised manuscript (Lines 405-425). 
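      Read literally, the definition above corresponds to the following sketch. This is our transcription of the prose into Python; the authors' released GitHub code should be treated as authoritative for the exact normalization and edge cases.

```python
def sas(sig_a, sig_b):
    """Survival-analysis similarity between two binary significance vectors.

    sig_a, sig_b: 0/1 lists with one entry per permutation (1 = significant).
    Numerator: permutations in which both genes are significant; denominator:
    the symmetric difference of significant events, plus 1 to avoid division
    by zero, as described in the response above.
    """
    both = sum(1 for a, b in zip(sig_a, sig_b) if a and b)
    n_a, n_b = sum(sig_a), sum(sig_b)
    return both / (n_a + n_b - 2 * both + 1)
```

      Under this reading, genes that are significant in exactly the same permutations score far higher than genes whose hits only partially overlap, which is the contrast the modified denominator is intended to sharpen.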

      (2) For the method with high dimensional data, multiplicity adjustment needs to be discussed, but it is missing in the manuscript. A 5% p-value cutoff was used across the paper, which seems to be too liberal in this type of analysis. The suggestion is to either use a lower cutoff value or use False Discovery Rate (FDR) control methods for such adjustment. This will reduce the length of the gene list and may help with a more focused discussion.

      We appreciate the reviewer’s suggestion regarding multiplicity. MEMORY is intentionally positioned as an exploratory screen: each gene is tested across 10 sampling gradients and 1,000 resamples, and only its reproducibility probability (𝐴<sub>𝑖𝑗</sub>) is retained. Because this metric is an aggregate of 1,000 “votes”, the influence of any single unadjusted P-value is already strongly diluted; adding a per-iteration BH/FDR step therefore has negligible impact on the reproducibility ranking that drives all downstream analyses.

      That said, we recognize that a clinically actionable GEARs catalogue must undergo formal, large-scale multiple-testing correction. Future releases of MEMORY will incorporate an error control module applied to the consolidated GEAR list before any translational use. We have now added a statement to this effect in the revised manuscript (Lines 350-359).

      (3) To allow reproducibility from others, please include as many details as possible (software, parameters, modules etc.) for the analyses performed in different steps.

      All source code is now publicly available on GitHub. We have also added the GitHub address in the Online Content section.

      Minor comments or queries:

      (4) The manuscript needs to be polished to fix grammar, incomplete sentences, and missing figures.

      Thank you for the suggestion. We have thoroughly proofread the manuscript to correct grammar, complete any unfinished sentences, and restore or renumber all missing figure panels. All figures are now properly referenced in the text.

      (5) Line 131: "survival probability of certain genes" seems to be miss-leading. Are you talking about its probability of associating with survival (or prognosis)?

      Sorry for the oversight. What we mean is the probability that a gene is found to be significantly associated with survival across the 1,000 resamples. We have revised the statement to “significant probability of certain genes” (Line 102).

      (6) Lines 132, 133: "remained consistent": the score just needs to stay > 0.8 as the sample increases, or the score needs to be monotonously non-decreasing?

      We mean that the score stays above 0.8. We understand that “remained consistent” is confusing and have now revised it to “remained above 0.8”.

      (7) Lines 168-170 how can supplementary figure 5A-K show "a certain degree of correlation with cancer stages"?

      Sorry for the confusion! We have now revised Supplementary Figure 5A–K to support the visual impression with formal statistics. For each cancer type, we built a contingency table of AJCC stage (I–IV) versus hub-gene subgroup (Low, Mid, High) and applied Pearson’s χ² test (Monte Carlo approximation with 10⁵ replicates when any expected cell count < 5). The χ² statistic and p-value are printed beneath every panel; eight of the eleven cancers show a significant association (p-value < 0.05), while LUSC, THCA and PAAD do not. We have replaced the vague phrase “a certain degree of correlation” with this explicit statistical statement in the revised manuscript (Lines 141-143).
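      The described test can be reproduced with a simple label-permutation scheme; below is an illustrative standard-library Python version. The authors worked in R (chisq.test with Monte Carlo simulation), and the table values in the test are invented, so this is a sketch of the technique rather than their script.

```python
import random
from collections import Counter

def chi2_stat(stages, groups):
    """Pearson chi-square statistic for two paired categorical label vectors."""
    n = len(stages)
    row, col = Counter(stages), Counter(groups)
    cell = Counter(zip(stages, groups))
    return sum((cell.get((r, c), 0) - row[r] * col[c] / n) ** 2 / (row[r] * col[c] / n)
               for r in row for c in col)

def monte_carlo_chi2_p(stages, groups, n_rep=100_000, seed=0):
    """Monte Carlo p-value: shuffle one margin and compare to the observed statistic."""
    rng = random.Random(seed)
    observed = chi2_stat(stages, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_rep):
        rng.shuffle(shuffled)
        hits += chi2_stat(stages, shuffled) >= observed
    return (hits + 1) / (n_rep + 1)
```

      This mirrors chisq.test(simulate.p.value = TRUE, B = 1e5) in R while avoiding the small-expected-count assumption of the asymptotic test, which is the motivation given above.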

      (8) Lines 172-174: since the hub genes are a subset of GEAR genes through CSN construction, it is not a surprise of the consistency. any explanation about PAAD that is shown only in GOEA with GEARs but not with hub genes?

      Thanks for raising this interesting point! In PAAD the Core Survival Network is unusually diffuse: the top-ranked SAS edges are distributed broadly rather than converging on a single dense module. Because of this flat topology, the ten highest-degree nodes (our hub set) do not form a tightly interconnected cluster, nor are they collectively enriched in the mitosis-related pathway that dominates the full GEAR list. This may explain why mitotic enrichment is evident when all PAAD GEARs are analyzed but not when the analysis is confined to the far smaller, and more functionally dispersed, hub-gene subset.

      (9) Lines 191: how the classification was performed? Tool? Cutoff values etc?

      The hub-gene-based molecular classification was performed in R using hierarchical clustering. Briefly, we extracted the log<sub>2</sub>(TPM + 1) expression matrix of hub genes, computed Euclidean distances between samples, and applied Ward’s minimum variance method (hclust, method = "ward.D2"). The resulting dendrogram was then divided into three groups (cutree, k = 3), corresponding to low, mid, and high expression classes. These parameters were selected based on visual inspection of clustering structure across cancer types. We have added this information to the revised ‘Methods’ section (Lines 439-443).
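      A Python analogue of this R workflow, using SciPy's hierarchical clustering (Ward linkage in SciPy corresponds to hclust's "ward.D2" on Euclidean distances). The synthetic matrix in the usage below stands in for a real TPM matrix; this is a sketch of the described procedure, not the authors' code.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def classify_samples(tpm, k=3):
    """Cluster samples (rows) into k expression classes, as described above.

    tpm: samples x genes matrix of TPM values. The matrix is log2(TPM + 1)
    transformed, Ward linkage is applied on Euclidean distances, and the
    tree is cut into k groups (labels 1..k), mirroring hclust + cutree.
    """
    log_expr = np.log2(np.asarray(tpm, dtype=float) + 1.0)
    tree = linkage(log_expr, method="ward")  # Euclidean distance is implied by "ward"
    return fcluster(tree, t=k, criterion="maxclust")
```

      On well-separated synthetic data this recovers the three groups exactly; with real tumour profiles the low/mid/high labels come from inspecting the dendrogram, as stated in the response.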

      (10) Lines 210-212: any statistics to support the conclusion? The bar chart of Figure 3B seems to support that all mutations favor ML & MM.

      We agree that formal statistical support is important for interpreting groupwise comparisons. In this case, however, several of the driver events, such as ROS1 and ERBB2, had very small subgroup counts, which violate the assumptions of Pearson’s χ² test. While we explored χ² and Fisher’s exact tests, the results were unstable due to sparse counts. Therefore, we chose to present these distributions descriptively to illustrate the observed subtype preferences across different driver mutations (Figure 3B). We have revised the manuscript text to clarify this point (Lines 182-188).

      (11) Line 216: should supplementary Figure 6H-J be "6H-I"?

      We apologize for the mistake. We have corrected it in the revised manuscript.

      (12) Line 224: incomplete sentence starting with "To further the functional... ".

      Thanks! We have made the revision, and it now reads: “To further explore the functional implications of these mutations, we enriched them using a pathway system called Nested Systems in Tumors (NeST)”.

      (13) Lines 261-263: it is better to report the median instead of the mean. Use log scale data for analysis or use non-parametric methods due to the long tail of the data.

      Thank you for the very helpful suggestion. In the revised manuscript, we now report the median instead of the mean to better reflect the distribution of the data. In addition, we have applied log-scale transformation where appropriate and replaced the original statistical tests with non-parametric Wilcoxon rank-sum tests to account for the long-tailed distribution. These changes have been implemented in both the main text and figure legends (Lines 234–237, Figure 5F).
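      For illustration, the revised comparison strategy might look like the following in Python, with scipy.stats.ranksums standing in for R's wilcox.test; the lognormal samples used below are synthetic, and the helper name is ours.

```python
import numpy as np
from scipy.stats import ranksums

def compare_long_tailed(x, y):
    """Median of each sample plus a Wilcoxon rank-sum p-value.

    The rank-sum test depends only on ranks, so a monotone log transform
    does not change the p-value; it is applied here to match the reporting
    scale described in the revised figures.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    _, p = ranksums(np.log1p(x), np.log1p(y))
    return float(np.median(x)), float(np.median(y)), p
```

      Reporting the median alongside a rank-based p-value keeps both the summary statistic and the test robust to the heavy right tail the reviewer points out.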

      (14) Line 430: why based on the first sampling gradient, i.e. k_1 instead of the k_j selected? Or do you mean k_j here?

      Thanks for this question! We deliberately based SAS on the vectors from the first sampling gradient (𝑘<sub>1</sub> ≈ 10% of the cohort). At this smallest sample size, the binary significance patterns still contain substantial variation, and many genes are not significant in every permutation. Based on this, we think the measure can meaningfully identify gene pairs that behave concordantly throughout the gradient permutation.

      We have now added a sentence to clarify this in the Methods section (Lines 398–403).

      (15) Need clarification on how the significant survival network was built.

      Thank you for pointing this out. We have now provided a more detailed clarification of how the Survival-Analysis Similarity (SAS) metric was defined and applied in constructing the core survival network (CSN), including the rationale for key parameter choices (Lines 409–430). Additionally, we have made full source code publicly available on GitHub to facilitate transparency and reproducibility (https://github.com/XinleiCai/MEMORY).

      (16) Line 433: what defines the "significant genes" here? Are they the same as GEAR genes? And what are total genes, all the genes?

      We apologize for the inconsistency in terminology, which may have caused confusion. In this context, “significant genes” refers specifically to the GEARs (Genes Steadily Associated with Prognosis). The SAS values were calculated between each GEAR and all genes. We have revised the manuscript to clarify this by consistently using the term “GEARs” throughout.

      (17) Line 433: more detail on how SAS values were used will be helpful. For example, were pairwise SAS values fed into Cytoscape as an additional data attribute (on top of what is available in TCGA) or as the only data attribute for network building?

      The SAS values were used as the sole metric for defining connections (edges) between genes in the construction of the core survival network (CSN). Specifically, we calculated pairwise SAS values between each GEAR and all other genes, then selected the top 1,000 gene pairs with the highest SAS scores to construct the network. No additional data attributes from TCGA (such as expression levels or clinical features) were used in this step. These selected pairs were imported into Cytoscape solely based on their SAS values to visualize the CSN.

      (18) Line 434: what is "ranking" here, by degree? Is it the same as "nodes with top 10 degrees" at line 436?

      The “ranking” refers specifically to the SAS values between gene pairs. The top 1,000 ranked SAS values were selected to define the edges used in constructing the Core Survival Network (CSN).

      Once the CSN was built, we calculated the degree (number of connections) for each node (i.e., each gene). The “top 10 degrees” mentioned on Line 421 refers to the 10 genes with the highest node degrees in the CSN. These were designated as hub genes for downstream analyses.

      We have clarified this distinction in the revised manuscript (Line 398-403).

      (19) Line 435: was the network built in Cytoscape? Or built with other tool first and then visualized in Cytoscape?

      The network was constructed in R by selecting the top 1,000 gene pairs with the highest SAS values to define the edges. This edge list was then imported into Cytoscape solely for visualization purposes. No network construction or filtering was performed within Cytoscape itself. We have clarified this in the revised ‘Methods’ section (Lines 424-425).

      (20) Line 436: the degree of each note was calculated, what does it mean by "degree" here and is it the same as the number of edges? How does it link to the "higher ranked edges" in Line 165?

      The “degree” of a node refers to the number of edges connected to that node—a standard metric in graph theory used to quantify a node’s centrality or connectivity in the network. It is equivalent to the number of edges a gene shares with others in the CSN.

      The “higher-ranked edges” refer to the top 1,000 gene pairs with the highest SAS values, which we used to construct the Core Survival Network (CSN). The degree for each node was computed within this fixed network, and the top 10 nodes with the highest degree were selected as hub genes. Therefore, the node degree is largely determined by this pre-defined edge set.

      (21) Line 439: does it mean only 1000 SAS values were used or SAS values from 1000 genes, which should come up with 1000 choose 2 pairs (~ half million SAS values).

      We computed the SAS values between each GEAR gene and all other genes, resulting in a large number of pairwise similarity scores. Among these, we selected the top 1,000 gene pairs with the highest SAS values—regardless of how many unique genes were involved—to define the edges in the Core Survival Network (CSN). In other words, the network is constructed from the top 1,000 SAS-ranked gene pairs, not from all possible combinations among 1,000 genes (which would result in nearly half a million pairs). This approach yields a sparse network focused on the strongest co-prognostic relationships.

      We have clarified this in the revised ‘Methods’ section (Lines 409–430).
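      Concretely, the edge-selection and hub-ranking procedure described in the answers above can be sketched in a few lines of Python (an illustrative adaptation of the authors’ R pipeline; the function and variable names are ours):

```python
from collections import Counter
import heapq

def build_csn(sas_scores, n_edges=1000, n_hubs=10):
    """sas_scores maps (gene_a, gene_b) -> SAS value. Keep the n_edges
    highest-scoring pairs as edges, then rank genes by degree within
    that fixed edge set to nominate hub genes."""
    # heapq.nlargest over the dict's keys, ordered by their SAS values
    edges = heapq.nlargest(n_edges, sas_scores, key=sas_scores.get)
    # node degree = number of retained edges touching each gene
    degree = Counter(gene for pair in edges for gene in pair)
    hubs = [gene for gene, _ in degree.most_common(n_hubs)]
    return edges, degree, hubs
```

The resulting edge list is what would be exported to Cytoscape purely for visualization, as described above.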

      (22) Line 496: what tool is used and what are the parameters set for hierarchical clustering if someone would like to reproduce the result?

      The hierarchical clustering was performed in R using the hclust function with Ward's minimum variance method (method = "ward.D2"), based on Euclidean distance computed from the log-transformed expression matrix (𝑙𝑜𝑔<sub>2</sub>(𝑇𝑃𝑀 +1)). Cluster assignment was done using the cutree function with k = 3 to define low, mid, and high expression subgroups. These settings have now been explicitly stated in the revised ‘Methods’ section (Lines 439–443) to facilitate reproducibility.

      (23) Lines 901-909: Figure 4 missing panel C. Current panel C seems to be the panel D in the description.

      Sorry for the oversights and we have now made the correction (Line 893).

      (24) Lines 920-928: Figure 6C: considering a higher bar to define "significant".

      We agree that applying a more stringent cutoff (e.g., p < 0.01) may reduce potential false positives. However, given the exploratory nature of this study, we believe the current threshold remains appropriate for the purpose of hypothesis generation.

      Reviewer #3 (Recommendations for the authors):

      (1) The title says the genes that are "steadily" associated are identified, but what you mean by the word "steadily" is not defined in the manuscript. Perhaps this could mean that they are consistently associated in different analyses, but multiple analyses are not compared.

      In our manuscript, “steadily associated” refers to genes that consistently show significant associations with patient prognosis across multiple sample sizes and repeated resampling within the MEMORY framework (Lines 65–66). Specifically, each gene is evaluated across 10 sampling gradients (from ~10% to 100% of the cohort) with 1,000 permutations at each level. A gene is defined as a GEAR if its probability of being significantly associated with survival remains ≥ 0.8 throughout the whole permutation process. This stability in signal under extensive resampling is what we refer to as “steadily associated.”

      (2) I think the word "gradient" is not appropriately used as it usually indicates a slope or a rate of change. It seems to indicate a step in the algorithm associated with a sampling proportion.

      Thank you for pointing out the potential ambiguity in our use of the term “gradient.” In our study, we used “gradient” to refer to stepwise increases in the sample proportion used for resampling and analysis. We have now revised it to “progressive”.

      (3) Make it clear that the name "GEARs" is introduced in this publication.

      Done.

      (4) Sometimes the document is hard to understand, for example, the sentence, "As the number of samples increases, the survival probability of certain genes gradually approaches 1." It does not appear to be calculating "gene survival probability" but rather a gene's association with patient survival. Or is it that as the algorithm progresses genes are discarded and therefore do have a survival probability? It is not clear.

      What we intended to describe is the probability that a gene is judged significant in the 1,000 resamples at a given sample-size step, that is, its reproducibility probability in the MEMORY framework. We have now revised the description (Lines 101-104).

      (5) The article lacks significant details, like the type of test used to generate p-values. I assume it is the log-rank test from the R survival package. This should be explicitly stated. It is not clear why the survminer R package is required or what function it has. Are the p-values corrected for multiple hypothesis testing at each sampling?

      We apologize for the lack of details. In each sampling iteration, we used the log-rank test (implemented via the survdiff function in the R survival package) to evaluate the prognostic association of individual genes. This information has now been explicitly added to the revised manuscript.
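      For reference, the per-iteration two-group comparison can be written as a short pure-Python log-rank test (an illustrative stand-in for R’s survdiff, not the authors’ code; names are ours):

```python
import math

def logrank_test(time, event, group):
    """Two-group log-rank test: time-to-event, event indicator (1 = event),
    and group membership (1 = group of interest, 0 = other)."""
    event_times = sorted({t for t, e in zip(time, event) if e == 1})
    O1 = E1 = V = 0.0
    for t in event_times:
        n  = sum(ti >= t for ti in time)                               # at risk
        n1 = sum(ti >= t and g == 1 for ti, g in zip(time, group))     # at risk, group 1
        d  = sum(ti == t and e == 1 for ti, e in zip(time, event))     # deaths at t
        d1 = sum(ti == t and e == 1 and g == 1
                 for ti, e, g in zip(time, event, group))              # deaths at t, group 1
        O1 += d1
        E1 += d * n1 / n
        if n > 1:
            V += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = (O1 - E1) ** 2 / V
    # chi-square(1 df) upper tail via the complementary error function
    return stat, math.erfc(math.sqrt(stat / 2))
```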

      The survminer package was originally included for visualization purposes, such as plotting illustrative Kaplan–Meier curves. However, since it did not contribute to the core statistical analysis, we have now removed this package from the Methods section to avoid confusion (Lines 386-407).

      As for multiple-testing correction, we did not adjust p-values in each iteration, because the final selection of GEARs is based on the frequency with which a gene is found significant across 1,000 resamples (i.e., its reproducibility probability). Classical FDR corrections at the per-sample level do not meaningfully affect this aggregate metric. That said, we fully acknowledge the importance of multiple-testing control for the final GEARs catalogue. Future versions of the MEMORY framework will incorporate appropriate adjustment procedures at that stage.

      (6) It is not clear what the survival metric is. Is it overall survival (OS) or progression-free survival (PFS), which would be common choices?

      It’s overall survival (OS).

      (7) The treatment of the patients is never considered, nor whether the sequencing was performed pre or posttreatment. The patient's survival will be impacted by the treatment that they receive, and many other factors like commodities, not just the genomics.

      We initially suspected that no genes could be steadily associated with patient survival (GEARs) without accounting for so many different influential factors. This is exactly what motivated us to develop MEMORY. However, this work proved us wrong: it demonstrates the real power of GEARs in determining patient survival. Of course, we totally agree with the reviewer that incorporating therapy variables and other clinical covariates will further improve the power of MEMORY analyses.

      (8) As a paper that introduces a new analysis method, it should contain some comparison with existing state of the art, or perhaps randomised data.

      We regard MEMORY as an exploratory, proof-of-concept framework, so a direct comparison with conventional survival analyses would not be meaningful at this stage. We have added some discussion of this point in the revised manuscript (Lines 350-359).

      (9) In the discussion it reads, "it remains uncertain whether there exists a set of genes steadily associated with cancer prognosis, regardless of sample size and other factors." Of course, there are many other factors that may alter the consistency of important cancer genes, but sample size is not one of them. Sample size merely determines whether your study has sufficient power to detect certain gene effects, it does not effect whether genes are steadily associated with cancer prognosis in different analyses. (Of course, this does depend on what you mean by "steadily".)

      We totally agree with the reviewer that sample size itself does not alter a gene’s biological association with prognosis; it only affects the statistical power to detect that association. Because this study is exploratory and we were initially uncertain whether GEARs existed, we first examined the impact of sample-size variation—a dominant yet experimentally tractable source of heterogeneity—before considering other, less controllable factors.

      Reviewer #4 (Recommendations for the authors):

      Other more detailed comments:

      (1) Introduction

      L93: When listing reasons why genes do not replicate across different cohorts / datasets, there is also the simple fact that some could be false positives

      We totally agree that some genes may simply represent false-positive findings apart from biological heterogeneity and technical differences between cohorts. Although the MEMORY framework reduces this risk by requiring high reproducibility across 1,000 resamples and multiple sample-size tiers, it cannot eliminate false positives completely. We have added some discussion and explicitly note that external validation in independent datasets is essential for confirming any GEAR before clinical application.

      (2) Results Section

      L143: Language like "We also identified the most significant GEARs in individual cancer types" I think is potentially misleading since the "GEAR" lists do not have formal statistical significance attached.

      We removed “significant” and revised it to “top 1” (Line 115).

      L153 onward: The pathway analysis results reported do not include any measures of how statistically significant the enrichment was.

      We have now updated the figure legends to clearly indicate that the displayed pathways represent the top significantly enriched results based on adjusted p-values from GO enrichment analyses (Lines 876-878).

      L168: "A certain degree of correlation with cancer stages (TNM stages) is observed in most cancer types except for COAD, LUSC and PRAD". For statements like this statistical significance should be mentioned in the same sentence or, if these correlations failed to reach significance, that should be explicitly stated.

      In the revised Supplementary Figure 5A–K, we now accompany the visual trends with formal statistical testing. Specifically, for each cancer type, we constructed a contingency table of AJCC stage (I–IV) versus hub-gene subgroup (Low, Mid, High) and applied Pearson’s χ<sup>2</sup> test (using Monte Carlo approximation with 10⁵ replicates if any expected cell count was < 5). The resulting χ<sup>2</sup> statistic and p-value are printed beneath each panel. Of the eleven cancer types analyzed, eight showed statistically significant associations (p < 0.05), while COAD, LUSC, and PRAD did not. Accordingly, we have made the revision in the manuscript (Lines 137–139).
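      The Monte Carlo variant of the χ<sup>2</sup> test described above can be sketched in pure Python (the toy tables, replicate counts, and function names below are ours; the authors used R):

```python
import random

def chi2_stat(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    return sum((table[i][j] - rows[i] * cols[j] / total) ** 2
               / (rows[i] * cols[j] / total)
               for i in range(len(rows)) for j in range(len(cols)))

def chi2_monte_carlo_p(table, n_rep=100_000, seed=0):
    """Monte Carlo p-value: expand the table into individual observations,
    shuffle the column labels (both margins stay fixed), and compare
    each permuted statistic with the observed one."""
    rng = random.Random(seed)
    obs = [(i, j) for i, r in enumerate(table)
                  for j, c in enumerate(r) for _ in range(c)]
    row_lab = [i for i, _ in obs]
    col_lab = [j for _, j in obs]
    observed = chi2_stat(table)
    r, c = len(table), len(table[0])
    hits = 0
    for _ in range(n_rep):
        rng.shuffle(col_lab)
        perm = [[0] * c for _ in range(r)]
        for i, j in zip(row_lab, col_lab):
            perm[i][j] += 1
        hits += chi2_stat(perm) >= observed
    return (hits + 1) / (n_rep + 1)  # add-one so p is never exactly 0
```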

      L171-176: When mentioning which pathways are enriched among the gene lists, please clarify whether these levels of enrichment are statistically significant or not. If the enrichment is significant, please indicate to what degree, and if not I would not mention.

      We agree that the statistical significance of pathway enrichment should be clearly stated and made the revision throughout the manuscript (Line 869, 875, 877).

      (3) Methods Section

      L406 - 418: I did not really understand, nor see it explained, what is the motivation and value of cycling through 10%, 20% bootstrapped proportions of patients in the "gradient" approach? I did not see this justified, or motivated by any pre-existing statistical methodology/results. I do not follow the benefit compared to just doing one analysis of all available samples, and using the statistical inference we get "for free" from the survival analysis p-values to quantify sampling uncertainty.

      The ten step-wise sample fractions (10 % to 100 %) allow us to transform each gene’s single log-rank P-value into a reproducibility probability: at every fraction we repeat the test 1,000 times and record the proportion of permutations in which the gene is significant. This learning-curve-style resampling not only quantifies how consistently a gene associates with survival under different power conditions but also produces the 0/1 vectors required to compute Survival-Analysis Similarity (SAS) and build the Core Survival Network. A single one-off analysis on the full cohort would yield only one P-value per gene, providing no binary vectors at all—hence no basis for calculating SAS or constructing the network. 
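      As a minimal, self-contained sketch of this resampling scheme (the names and the plug-in test function are ours, not the authors’ code), one row of the significant-probability matrix can be estimated like this:

```python
import random

def reproducibility_profile(pval_fn, n_patients, fractions,
                            n_perm=1000, alpha=0.05, seed=0):
    """For each sampling fraction, return the share of resamples in which
    the supplied per-subset test is significant -- one gene's row of the
    significant-probability matrix (the A_ij values).
    pval_fn(idx) must return a p-value for the patient subset idx."""
    rng = random.Random(seed)
    profile = []
    for f in fractions:
        k = max(2, round(f * n_patients))
        hits = sum(pval_fn(rng.sample(range(n_patients), k)) < alpha
                   for _ in range(n_perm))
        profile.append(hits / n_perm)
    return profile
```

A gene would then be called a GEAR when its profile stays at or above 0.8, per the criterion applied within one cancer type.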

      L417: I assume p < 0.05 in the survival analysis means the nominal p-value, unadjusted for multiple testing. Since we are in the context of many tests please explicitly state if so.

      Yes, p < 0.05 refers to the nominal, unadjusted p-value from each log-rank test within a single permutation. In MEMORY these raw p-values are converted immediately into 0/1 “votes” and aggregated over 1,000 permutations and ten sample-size tiers; only the resulting reproducibility probability (𝐴<sub>𝑖𝑗</sub>) is carried forward. No multiple-testing adjustment is applied at the individual-test level, because a per-iteration FDR or BH step would not materially affect the final 𝐴<sub>𝑖𝑗</sub> ranking. We have revised the manuscript (Line 396).

      L419-426: I did not see defined what the rows are and what the columns are in the "significant-probability matrix". Are rows genes, columns cancer types? Consequently I was not really sure what actually makes a "GEAR". Is it achieving a significance probability of 0.8 across all 15 cancer subtypes? Or in just one of the tumour datasets?

      In the significant-probability matrix, each row represents a gene, and each column corresponds to a sampling gradient (i.e., increasing sample-size tiers from ~10% to 100%) within a single cancer type. The matrix is constructed independently for each cancer.

      A GEAR is defined by achieving a significance probability of 0.8 within a single tumor type; it does not need to reach this threshold across all 15 cancer types.

      L426: The significance probability threshold of 0.8 across 1,000 bootstrapped nominal tests --- used to define the GEAR lists --- has, as far as I can tell, no formal justification. Conceptually, the "significance probability" reflects uncertainty in the patients being used (if I follow their procedure correctly), but as mentioned above, a classical p-value is also designed to reflect sampling uncertainty. So why use the bootstrapping at all?

      Moreover, the 0.8 threshold is applied on a per-gene basis, so there is no apparent procedure "built in" to adapt to (and account for) different total numbers of genes being tested. Can the authors quantify the false discovery rate associated with this GEAR selection procedure e.g. by running for data with permuted outcome labels? And why do the gradient / bootstrapping at all --- why not just run the nominal survival p-values through a simple Benjamini-Hochberg procedure, and then apply an FDR threshold to define the GEAR lists? Then you would have both multiplicity and error control for the final lists. As it stands, with no form of error control or quantification of noise rates in the GEAR lists I would not recommend promoting their use. There is a long history of variable selection techniques, and various options the authors could have used that would have provided formal error rates for the final GEAR lists (see seminal reviews by e.g. Heinze et al 2018 Biometrical Journal, or O'Hara and Sillanpaa, 2009, Bayesian Analysis), including, as I say, simple application of a Benjamini-Hochberg to achieve multiplicity adjusted FDR control.

      Thank you. We chose the 10 × 1,000 resampling scheme to ask a different question from a single Benjamini–Hochberg scan: does a gene keep re-appearing as significant when cohort composition and statistical power vary from 10 % to 100 % of the data? Converting the 1,000 nominal p-values at each sample fraction into a reproducibility probability 𝐴<sub>𝑖𝑗</sub> allows us to screen for signals that are stable across wide sampling uncertainty rather than relying on one pass through the full cohort. The 0.8 cut-off is an intentionally strict, empirically accepted robustness threshold (analogous to stability-selection); under the global null the chance of exceeding it in 1,000 draws is effectively zero, so the procedure is already highly conservative even before any gene-wise multiplicity correction [1]. Once MEMORY moves beyond this exploratory stage and a final, clinically actionable GEAR catalogue is required, we will add a formal FDR layer after the robustness screen, but for the present proof-of-concept study, we retain the resampling step specifically to capture stability rather than to serve as definitive error control.
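      To put a number on “effectively zero”: assuming independent permutations under the global null (an approximation, since resamples share patients), the exact binomial tail can be evaluated in log space with the standard library (function name ours):

```python
import math

def log10_binom_tail(n, k, p):
    """log10 of P(X >= k) for X ~ Binomial(n, p), summed exactly in log
    space so the result does not underflow double precision."""
    logs = [math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1 - p)
            for i in range(k, n + 1)]
    m = max(logs)  # log-sum-exp trick for numerical stability
    return (m + math.log(sum(math.exp(x - m) for x in logs))) / math.log(10)

# A null gene significant at alpha = 0.05 in >= 800 of 1,000 permutations:
# log10_binom_tail(1000, 800, 0.05) is far below -300, i.e. the probability
# underflows double precision if computed directly.
```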

      L427-433: I gathered that SAS reflects, for a particular pair of genes, how likely they are to be jointly significant across bootstraps. If so, perhaps this description or similar could be added since I found a "conceptual" description lacking which would have helped when reading through the maths. Does it make sense to also reflect joint significance across multiple cancer types in the SAS? Or did I miss it and this is already reflected?

      SAS is indeed meant to quantify, within a single cancer type, how consistently two genes are jointly significant across the 1,000 bootstrap resamples performed at a given sample-size tier. In other words, SAS is the empirical probability that the two genes “co-light-up” in the same permutation, providing a measure of shared prognostic behavior beyond what either gene shows alone. We have added this plain-language description to the ‘Methods’ (Lines 405-418).
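      Under that description, the SAS for a gene pair reduces to a simple co-significance frequency (one plausible reading of the authors’ definition; the function name is ours):

```python
def sas(sig_a, sig_b):
    """Survival-Analysis Similarity, read as the empirical probability that
    two genes are jointly significant in the same permutation, given their
    0/1 significance vectors over the resamples of one sampling tier."""
    assert len(sig_a) == len(sig_b)
    return sum(a == b == 1 for a, b in zip(sig_a, sig_b)) / len(sig_a)
```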

      In the current implementation SAS is calculated separately for each cancer type; it does not aggregate cosignificance across different cancers. Extending SAS to capture joint reproducibility across multiple tumor types is an interesting idea, especially for identifying pan-cancer gene pairs, and we note this as a potential future enhancement of the MEMORY pipeline.

      L432: "The SAS of significant genes with total genes was calculated, and the significant survival network was constructed" Are the "significant genes" the "GEAR" list extracted above according to the 0.8 threshold? If so, and this is a bit pedantic, I do not think they should be referred to as "significant genes" and that this phrase should be reserved for formal statistical significance.

      We have replaced “significant genes” with “GEAR genes” to avoid any confusion (Lines 421-422).

      L434: "some SAS values at the top of the rankings were extracted, and the SAS was visualized to a network by Cytoscape. The network was named core survival network (CSN)". I did not see it explicitly stated which nodes actually go into the CSN. The entire GEAR list? What threshold is applied to SAS values in order to determine which edges to include? How was that threshold chosen? Was it data driven? For readers not familiar with what Cytoscape is and how it works could you offer more of an explanation in-text please? I gather it is simply a piece of network visualisation/wrangling software and does not annotate additional information (e.g. external experimental data), which I think is an important point to clarify in the article without needing to look up the reference.

      We have now clarified these points in the revised ‘Methods’ section, including how the SAS threshold was selected and which nodes were included in the Core Survival Network (CSN). Specifically, the CSN was constructed using the top 1,000 gene pairs with the highest SAS values. This threshold was not determined by a fixed numerical cutoff, but rather chosen empirically after comparing networks built with varying numbers of edges (250, 500, 1,000, 2,000, 6,000, and 8,000; see Author response image 1). We observed that, while increasing the number of edges led to denser networks, the set of hub genes remained largely stable. Therefore, we selected 1,000 edges as a balanced compromise between capturing sufficient biological information and maintaining computational efficiency and interpretability.

      The resulting node list (i.e., the genes present in those top-ranked pairs) is provided in Supplementary Table 4. Cytoscape was used solely as a network visualization platform, and no external annotations or experimental data were added at this stage. We have added a brief clarification in the main text to help readers understand.

      L437: "The effect of molecular classification by hub genes is indicated that 1000 to 2000 was a range that the result of molecular classification was best." Can you clarify how "best" is assessed here, i.e. by what metric and with which data?

      We apologize for the confusion. When constructing the network, we observed that the number of edges affected both the selection of hub genes and the computational complexity. We analyzed networks with 250, 500, 1,000, 2,000, 6,000 and 8,000 edges, and found that the differences in the selected hub genes were small (Author response image 1). Although networks with fewer edges had lower computational complexity, we chose 1,000 edges as a compromise that balances sufficient biological information against manageable computational cost.

      Author response image 1.

      The intersection of the network constructed by various number of edges.

      References

      (1) Gebski, V., Garès, V., Gibbs, E. & Byth, K. Data maturity and follow-up in time-to-event analyses. International Journal of Epidemiology 47, 850–859 (2018).

    1. Respondent 7 mentions that ChatGPT is a very good language machine. “You can't rely on it for factual information, but it’s incredibly good with grammar and sentence structures.” They sometimes ask ChatGPT to rewrite a paragraph that does not flow right. “I prompt it to give me three alternatives for that paragraph.”

      I think this is a risky practice. Although ChatGPT can flag or correct spelling and grammar errors in a text, rewriting a passage for better flow seems likely to quickly strip the journalist's personal touch from the piece. Moreover, I think this kind of use can also lead to a degree of laziness, where one reaches for automatic rewriting sooner instead of revising the text oneself and preserving one's voice as a writer.

    1. School is more of a war zone-a place to survive.

      Personally speaking, I have never encountered very severe struggles, since my school has a very good environment. However, I did hear a story from my dad. He grew up in an environment that was relatively poor. Once school was dismissed, he was blocked by a group of four high school kids who tried to rob him. In order to save his money, he resisted them. But they had knives in their hands, and stabbed my dad in the back. Fortunately, it only left a scar instead of killing him. Therefore, I just want to say that school is very much a war zone, where people have to try their best to survive.

    1. Reviewer #1 (Public review):

      Summary:

      The neuronal microtubule cytoskeleton is essential for long-range transport in axons and dendrites. The axon-specific plus-end-out microtubule organization versus the dendrite-specific plus-end-in organization allows for selective transport into each neurite, setting up neuronal polarity. In addition, the dendritic microtubule organization is thought to be important for dendritic pruning in Drosophila during metamorphosis. However, the precise mechanisms that organize microtubules in neurons are still incompletely understood.

      In the current manuscript, the authors describe the spectraplakin protein Shot as important in developmental dendritic pruning. They find that shot mutants have dendritic microtubule polarity defects, which, based on their rescues and previous work, are likely the reason for the pruning defect.

      Since Shot is a known actin-microtubule crosslinker, they also investigate the putative role of actin and find that actin is also important for dendritic pruning. Finally, they find that several factors that have been shown to function as a dendritic MTOC in C. elegans also show a defect in Drosophila upon depletion.

      Strengths:

      Overall, this work was technically well-performed, using advanced genetics and imaging. The authors report some interesting findings, identifying new players in dendritic microtubule organization and pruning.

      Weaknesses:

      The evidence for Shot interacting with actin for its functioning is contradictory. The Shot construct lacking the actin-interaction domain did not rescue the mutant; however, it also has a strong toxic effect upon overexpression in wildtype (Figure S3), so a potential rescue may be masked. Moreover, the C-terminus-only construct, which carries the GAS2-like domain, was sufficient to rescue the pruning. This actually suggests that MT bundling/stabilization is the main function of Shot (and no actin binding is needed). On the other hand, actin depolymerization leads to some microtubule defects and subtle changes in Shot localization in young neurons (not old ones). More importantly, it did not enhance the microtubule or pruning defects of the shot mutant, suggesting these act in the same pathway. Interesting to note is that Mical expression led to microtubule defects but not to pruning defects. This argues that MT organization effects alone are not enough to cause pruning defects. This may be good to discuss. For the actin depolymerization, the authors used overexpression of the actin-oxidizing Mical protein. However, Mical may have other targets. It would be good to validate key findings with better-characterized actin-targeting tools.

      In analogy to C. elegans, where RAB-11 functions as a ncMTOC to set up microtubules in dendrites, the authors investigated the role of these components in Drosophila. Interestingly, they find that Rab11 also colocalizes with gamma-tubulin and that its depletion leads to some microtubule defects. Furthermore, they find a genetic interaction between these components and Shot; however, this does not prove that these components act together (if anything, it would suggest the opposite). This should be made more clear. What would be needed to connect these is to address RAB-11 and gamma-tubulin localization upon Shot depletion.

      All components studied in this manuscript lead to a partial reversal of microtubules in the dendrite. However, it is not clear from how the data are represented whether the microtubule defect is subtle in all animals or whether it is a partially penetrant, stronger effect (a few animals/neurons have a strong phenotype). This is relevant as this may suggest that other mechanisms are also required for this organization, and it would make it markedly different from C. elegans. This should be discussed and potentially represented differently.

    2. Reviewer #2 (Public review):

      Summary:

      In their manuscript, the authors reveal that the spectraplakin Shot, which can bind both microtubules and actin, is essential for the proper pruning of dendrites in a developing Drosophila model. A molecular basis for the coordination of these two cytoskeletons during neuronal development has been elusive, and the authors' data point to the role of Shot in regulating microtubule polarity and growth through one of its actin-binding domains. The authors also propose an intriguing new activity for a spectraplakin: functioning as part of a microtubule-organizing center (MTOC).

      Strengths:

      (1) A strength of the manuscript is the authors' data supporting the idea that Shot regulates dendrite pruning via its actin-binding CH1 domain and that this domain is also implicated in Shot's ability to regulate microtubule polarity and growth (although see comments below); these data are consistent with the authors' model that Shot acts through both the actin and microtubule cytoskeletons to regulate neuronal development.

      (2) Another strength of the manuscript is the data in support of Rab11 functioning as an MTOC in young larvae but not older larvae; this is an important finding that may resolve some debates in the literature. The finding that Rab11 and Msps coimmunoprecipitate is nice evidence in support of the idea that Rab11(+) endosomes serve as MTOCs.

      Weaknesses:

      (1) A significant, major concern is that most of the authors' main conclusions are not (well) supported, in particular, the model that Shot functions as part of an MTOC. The story has many interesting components, but lacks the experimental depth to support the authors' claims.

      (2) One of the authors' central claims is that Shot functions as part of a non-centrosomal MTOC, presumably a MTOC anchored on Rab11(+) endosomes. For example, in the Introduction, last paragraph, the authors summarize their model: "Shot localizes to dendrite tips in an actin-dependent manner where it recruits factors cooperating with an early-acting, Rab11-dependent MTOC." This statement is not supported. The authors do not show any data that Shot localizes with Rab11 or that Rab11 localization or its MTOC activity is affected by the loss of Shot (or otherwise manipulating Shot). A genetic interaction between Shot and Rab11 is not sufficient to support this claim, which relies on the proteins functioning together at a certain place and time. On a related note, the claim that Shot localization to dendrite tips is actin-dependent is not well supported: the authors show that the CH1 domain is needed to enrich Shot at dendrite tips, but they do not directly manipulate actin (it would be helpful if the authors showed the overexpression of Mical disrupted actin, as they predict).

      The authors show an image in which Shot colocalizes with the EB1-mScarlet3 comet initiation sites and use this representative image to generate a model in which Shot functions as part of an MTOC. However, this conclusion needs additional support: the authors should quantify the frequency of EB1 comets that originate from Shot-GFP aggregates, report the orientation of EB1 comets that originate from Shot-GFP aggregates (e.g., do the Shot-GFP aggregates correlate with anterogradely or retrogradely moving EB1 comets), and characterize the developmental timing of these events. The genetic interaction tests, revealing the ability of shot dsRNA to enhance the phenotypes caused by loss of microtubule-interacting proteins (Msps, Patronin, EB1) and Rab11, are consistent with the idea that Shot regulates microtubules, but they do not provide any spatial information on where Shot is interacting with these proteins, which is critical to the model that Shot is acting as part of a dendritic MTOC.

      (4) It is unclear whether the authors are proposing that dendrite pruning defects are due to an early function of Shot in regulating microtubule polarity in young neurons (during 1st instar larval stages) or whether Shot is acting in another way to affect dendrite pruning. It would be helpful for the authors to present and discuss a specific model regarding Shot's regulation of dendrite pruning in the Discussion.

      (5) The authors argue that a change in microtubule polarity contributes to dendrite pruning defects. For example, in the Introduction, last paragraph, the authors state: "Loss of Shot causes pruning defects caused by mixed orientation of dendritic microtubules." The authors show a correlative relationship, not a causal one. In Figure 4, C and E, the authors show that overexpression of Mical disrupts microtubule polarity but not dendrite pruning, raising the question of whether disrupting microtubule polarity is sufficient to cause dendrite pruning defects. The lack of an association between a disruption in microtubule polarity and dendrite pruning in neurons overexpressing Mical is an important finding.

      (6) The authors show that a truncated Shot construct with the microtubule-binding domain, but no actin-binding domain (Shot-C-term), can rescue dendrite pruning defects and Khc-lacZ localization, whereas the longer Shot construct that lacks just one actin-binding domain ("delta-CH1") cannot. Have the authors confirmed that both proteins are expressed at equivalent levels? Based on these results and their finding that over-expression of Shot-delta-CH1 disrupts dendrite pruning, it seems possible that Shot-delta-CH1 may function as a dominant-negative rather than a loss-of-function. Regardless, the authors should develop a model that takes into account their findings that Shot, without any actin-binding domains and only a microtubule-binding domain, shows robust rescue.

      (7) The authors state that: "The fact that Shot variants lacking the CH1 domain cannot rescue the pruning defects of shot[3] mutants suggested that dendrite tip localization of Shot was important for its function." (pages 10-11). This statement is not accurate: the Shot C-term construct, which lacks the CH1 domain (as well as other domains), is able to rescue dendrite pruning defects.

      (8) The authors state that: "In further support of non-functionality, overexpression of Shot[deltaCH1] caused strong pruning defects (Fig. S3)." (page 8). Presumably, these results indicate that Shot-delta-CH1 is functioning as a dominant-negative since a loss-of-function protein would have no effect. The authors should revise how they interpret these results. This comment is related to another comment about the ability of Shot constructs to rescue the shot[3] mutant.

    3. Author response:

      We thank the reviewers for their comments. We are paraphrasing their three main criticisms below and provide responses and outlines of how we are going to address them.

      Criticism 1: Actin binding by Shot may not be required for Shot's function in dendritic microtubule organization (Point 1 by Reviewer 1, points 6-8 by reviewer 2).

      This criticism is mainly based on our finding that, while a version of Shot lacking just the high-affinity actin-binding site cannot rescue the pruning and orientation defects of shot[3] mutants, expression of a construct harboring just the microtubule and EB1 binding sites can. The reviewers also point out that a Shot construct lacking one of its actin-binding domains (deltaCH1) causes pruning defects when overexpressed in wild-type cells.

      We thank the reviewers for this comment. We concede that we did not properly explain our reasoning and conclusions regarding the role of actin binding in Shot dendritic function. From the literature, there is evidence that Shot fragments containing the C-terminal microtubule binding domain alone have positive effects on neuronal microtubule stability and organization by a gain-of-function mechanism. This is likely due to two reasons: firstly, the activity of these constructs is unrestrained by localization. For example, in axons, full length Shot localizes adjacent to the membrane and to growth cones, while a Shot C-terminal construct (lacking the actin-binding and spectrin-repeat domains) decorates axonal microtubules [1]. Secondly, the actin binding site appears to inhibit microtubule binding by an intramolecular mechanism that is relieved by actin binding [2]. Overexpression of such a construct also dramatically improves axonal microtubule defects in aged neurons [3]. Thus, actin recruitment may locally activate Shot's microtubule binding activity.

      To address this criticism, we will test if other UAS-Shot transgenes lacking the actin binding or microtubule binding domains can rescue the defects of Shot mutants. We will also try to provide more evidence that the C-terminal Shot construct exerts a gain-of-function effect on microtubules. We will adjust our interpretation accordingly.

      Criticism 2: The relationship between reversal of dendritic microtubule orientation and dendrite pruning defects could be correlative rather than causal (paragraph 1 by Reviewer 1, point 5 by reviewer 2).

      This criticism is based on our finding that Mical overexpression causes a partial reversal of dendritic microtubule orientation but no apparent dendrite pruning defects.

      We thank the reviewers for this comment. In fact, knockdown of EB1, which affects dendritic microtubule organization via kinesin-2 [4], does not cause dendrite pruning defects by itself either, but strongly enhances the pruning defects caused by other microtubule manipulations [5]. This is likely because loss of EB1 destabilizes the dendritic cytoskeleton and thus also promotes dendrite degeneration. All other conditions that cause dendritic microtubule reversal also cause dendrite pruning defects [5-9]. As Mical is a known pruning factor [10], its overexpression may actually also destabilize dendrites, e.g., by severing actin filaments. However, we showed in the current manuscript that Mical overexpression causes a partial reversal of dendritic microtubule polarity and strongly enhances the dendrite pruning defects caused by Shot knockdown.

      To address this criticism, we will rephrase the corresponding section of our manuscript and specify that conditions that cause reversal of dendritic microtubule orientation either cause dendrite pruning defects, or act as genetic enhancers of pruning defects caused by other microtubule regulators. This wording better explains the relationship between dendritic microtubule orientation and dendrite pruning and also includes the Mical overexpression condition.

      Criticism 3: The presented data do not prove that Shot, Rab11 and Patronin act in a common pathway to establish dendritic plus end-in microtubule orientation (paragraphs 2-3 by Reviewer 1, point 1-4 by reviewer 2).

      While these factors genetically interact with each other during dendrite pruning, it is not clear whether (1) they colocalize at the tips of growing dendrites during early growth stages; (2) their respective localizations depend on each other; (3) they act at the same developmental stage in microtubule orientation.  

      We thank the reviewers for this comment. For technical reasons (e.g., incompatible transgenes, GAL4 drivers too weak), we could only partially address these questions at the time. We have now expanded our toolkit with additional drivers and fluorescently tagged transgenes. We will therefore test whether Shot and Rab11 or Patronin and Rab11 colocalize in growing dendrites during the early L1 stage, and whether loss of Shot affects the localization or the activity of Patronin and Rab11 in dendrites. We will adapt our interpretation accordingly, and also add a comprehensive model.

      References

      (1) Alves Silva et al. (2012) J. Neurosci. 32:9143

      (2) Applewhite et al. (2013) Mol. Biol. Cell 24:2885

      (3) Okenve-Ramos et al. (2024) PLoS Biol. 22:e3002504

      (4) Mattie et al. (2010) Curr. Biol. 20:2169

      (5) Herzmann et al. (2018) Development 145:dev156950

      (6) Wang et al. (2019) eLife 8:e39964

      (7) Rui et al. (2020) EMBO Rep. 21:e48843

      (8) Tang et al. (2020) EMBO J. 39:e103549

      (9) Bu et al. (2022) Cell Rep. 39:110887

      (10) Kirilly et al. (2009) Nat. Neurosci. 12:1497

    1. Reviewer #1 (Public review):

      Summary:

      This study aims to address an important and timely question: how does the mesoscale architecture of cortical and subcortical circuits reorganize during sensorimotor learning? By using high-density, chronically implanted ultra-flexible electrode arrays, the authors track spiking activity across ten brain regions as mice learn a visual Go/No-Go task. The results indicate that learning leads to more sequential and temporally compressed patterns of activity during correct rejection trials, alongside changes in functional connectivity ranks that reflect shifts in the relative influence of visual, frontal, and motor areas throughout learning. The emergence of a more task-focused subnetwork is accompanied by broader and faster propagation of stimulus information across recorded regions.

      Strengths:

      A clear strength of this work is its recording approach. The combination of stable, high-throughput multi-region recordings over extended periods represents a significant advance for capturing learning-related network dynamics at the mesoscale. The conceptual framework is well motivated, building on prior evidence that decision-relevant signals are widely distributed across the brain. The analysis approach, combining functional connectivity rankings with information-encoding metrics, is well motivated but needs refinement. These results provide some valuable evidence of how learning can refine both the temporal precision and the structure of interregional communication, offering new insights into circuit reconfiguration during learning.

      Weaknesses:

      The technical approach is strong and the conceptual framing is compelling, but several aspects of the evidence remain incomplete. In particular, it is unclear whether the reported changes in connectivity truly capture causal influences, as the rank metrics remain correlational and show discrepancies with the manipulation results. The absolute response onset latencies also appear slow for sensory-guided behavior in mice, and it is not clear whether this reflects the method used to define onset timing or factors such as task structure or internal state. Furthermore, the small number of animals, combined with extensive repeated measures, raises questions about statistical independence and how multiple comparisons were controlled. The optogenetic experiments, while intended to test the functional relevance of rank-increasing regions, leave it unclear how effectively the targeted circuits were silenced. Without direct evidence of reliable local inhibition, the behavioral effects or lack thereof are difficult to interpret. Details on spike sorting are limited.

    2. Reviewer #2 (Public review):

      Summary:

      Wang et al. record from 10 cortical and subcortical brain areas as mice learn a go/no-go visual discrimination task. They found that during learning there is a reshaping of inter-areal connections, in which a visual-frontal subnetwork emerges as mice gain expertise. In addition, decoding of visual stimuli became more widespread post-learning. They also perform silencing experiments and find that OFC and V2M are important for the learning process. The conclusion is that learning evoked a brain-wide dynamic interplay between different brain areas that together may promote learning.

      Strengths:

      The manuscript is well written and the logic is rather clear. I found the study interesting and of interest to the field. The recording method is innovative and requires exceptional skill to perform. The outcomes of the study are significant, highlighting that learning evokes a widespread and dynamic modulation of interactions between different brain areas, in which specific task-related subnetworks emerge.

      Weaknesses:

      I had several major concerns:

      (1) The number of mice was small for the ephys recordings. Although the authors start with 7 mice in Figure 1, they then reduce to 5 in panel F. And in their main analysis, they minimize their analysis to 6/7 sessions from 3 mice only. I couldn't find a rationale for this reduction, but in the methods they do mention that 2 mice were used for fruitless training, of which I found no mention in the results. Moreover, in the early case, all of the analysis is from 118 CR trials taken from 3 mice. In general, this is a rather low number of mice and trials. I think it is quite essential to add more mice.

      (2) Movement analysis was not sufficient. Mice learning a go/no-go task establish a movement strategy that is developed throughout learning and is also biased towards Hit trials. There is an analysis of movement in Figure S4, but this is rather superficial. I was not even sure that the 3 mice in Figure S4 are the same 3 mice in the main figure. There should also be an analysis of movement as a function of time to see differences, also for Hits and FAs. I give some more details below. In general, most of the results can be explained by the fact that as mice gain expertise, they move more (also in CR trials during specific times), which leads to more activation in frontal cortex and more coordination with visual areas. More needs to be done in terms of analysis, or at least a mention of this in the text.

      (3) Most of the figures are over-detailed, and it is hard to understand the take-home message. Although the text is written succinctly and rather short, the figures are mostly overwhelming, especially Figures 4-7. For example, Figure 4 presents 24 brain plots! For input and output ranks during early and late stim and response periods, for early and expert and their difference. All in the same colormap. No significance shown at all. The Δrank maps for all cases look essentially identical across conditions. The division into early and late time periods is not properly justified. But the main take-home message is positive Δrank in OFC, V2M, V1 and negative Δrank in ThalMD and Str. In my opinion, one trio of maps is enough, and the rest could be bumped to the Supplementary section, if at all. In general, the figures in several cases do not convey the main take-home messages. See more details below.

      (4) The analysis is sometimes not intuitive enough. For example, the rank analysis of input and output rank seemed a bit overly complex. Figure 3 was hard to follow (although a lot of effort was made by the authors to make it clearer). Was there any difference between the output and input analyses? Also, the time periods sometimes seem redundant. Also, there are other network analyses that can be done which are a bit more intuitive. The use of ranks within the 10 areas was not the most intuitive. Even dimensionality reduction along with clustering can be used as an alternative. In my opinion, I don't think the authors should completely redo their analysis, but maybe mention the fact that other analyses exist.

    3. Author response:

      Reviewer #1 (Public review):

      Weaknesses:

      The technical approach is strong and the conceptual framing is compelling, but several aspects of the evidence remain incomplete. In particular, it is unclear whether the reported changes in connectivity truly capture causal influences, as the rank metrics remain correlational and show discrepancies with the manipulation results.

      We agree that our functional connectivity ranking analyses cannot establish causal influences. As discussed in the manuscript, besides learning-related activity changes, the functional connectivity may also be influenced by neuromodulatory systems and internal state fluctuations. In addition, the spatial scope of our recordings is still limited compared to the full network implicated in visual discrimination learning, which may bias the ranking estimates. In future, we aim to achieve broader region coverage and integrate multiple complementary analyses to address the causal contribution of each region.

      The absolute response onset latencies also appear slow for sensory-guided behavior in mice, and it is not clear whether this reflects the method used to define onset timing or factors such as task structure or internal state.

      We believe this may be primarily due to our conservative definition of onset timing. Specifically, we required the firing rate to exceed baseline (t-test, p < 0.05) for at least 3 consecutive 25-ms time windows. This might lead to later estimates than other studies, such as using the latency to the first spike after visual stimulus onset (~50-60 ms, Siegle et al., Nature, 2023) or the time to half-max response (~65 ms, Goldbach et al., eLife, 2021).
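      For concreteness, the onset criterion described here can be sketched as follows. This is an illustrative reconstruction, not our actual analysis code: the paired one-sided t-test and the handling of baseline are assumptions, while the 25-ms bins, the p < 0.05 threshold, and the 3-consecutive-window rule come from the description above.

```python
import numpy as np
from scipy import stats

def response_onset(trials, baseline, bin_ms=25, n_consec=3, alpha=0.05):
    """Return the response onset (ms after stimulus) as the start of the
    first run of `n_consec` consecutive bins whose firing rates exceed
    baseline (paired one-sided t-test, p < alpha); None if no such run.

    trials:   (n_trials, n_bins) array of binned post-stimulus firing rates
    baseline: (n_trials,) array of per-trial baseline firing rates
    """
    n_bins = trials.shape[1]
    run = 0
    for b in range(n_bins):
        t, p = stats.ttest_rel(trials[:, b], baseline)
        above = (t > 0) and (p / 2 < alpha)  # one-sided: rate above baseline
        run = run + 1 if above else 0
        if run == n_consec:
            return (b - n_consec + 1) * bin_ms
    return None
```

      Under this conservative rule, a brief, isolated excursion above baseline is ignored, which tends to push onset estimates later than first-spike or half-max latency measures.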

      Furthermore, the small number of animals, combined with extensive repeated measures, raises questions about statistical independence and how multiple comparisons were controlled.

      We agree that a larger sample size would strengthen the robustness of the findings. However, as noted above, the current dataset has inherent limitations in both the number of recorded regions and the behavioral paradigm. Given the considerable effort required to achieve sufficient unit yields across all targeted regions, we wish to adjust the set of recorded regions, improve behavioral task design, and implement better analyses in future studies. This will allow us to both increase the number of animals and extract more precise insights into mesoscale dynamics during learning.

      The optogenetic experiments, while intended to test the functional relevance of rank-increasing regions, leave it unclear how effectively the targeted circuits were silenced. Without direct evidence of reliable local inhibition, the behavioral effects or lack thereof are difficult to interpret.

      We appreciate this important point. Due to the design of the flexible electrodes and the implantation procedure, bilateral co-implantation of both electrodes and optical fibers was challenging, which prevented us from directly validating the inhibition effect in the same animals used for behavior. In hindsight, we could have conducted parallel validations using conventional electrodes, and we will incorporate such controls in future work to provide direct evidence of manipulation efficacy.

      Details on spike sorting are limited.

      We will provide more details on spike sorting, including the exact parameters used in the automated sorting algorithm and the subsequent manual curation criteria.

      Reviewer #2 (Public review):

      Weaknesses:

      I had several major concerns:

      (1) The number of mice was small for the ephys recordings. Although the authors start with 7 mice in Figure 1, they then reduce to 5 in panel F. And in their main analysis, they minimize their analysis to 6/7 sessions from 3 mice only. I couldn't find a rationale for this reduction, but in the methods they do mention that 2 mice were used for fruitless training, of which I found no mention in the results. Moreover, in the early case, all of the analysis is from 118 CR trials taken from 3 mice. In general, this is a rather low number of mice and trials. I think it is quite essential to add more mice.

      We apologize for the confusion. As described in the Methods section, 7 mice (Figure 1B) were used for behavioral training without electrode array or optical fiber implants to establish learning curves, and an additional 5 mice underwent electrophysiological recordings (3 for visual-based decision-making learning and 2 for fruitless learning).

      As we noted in our response to Reviewer #1, the current dataset has inherent limitations in both the number of recorded regions and the behavioral paradigm. Given the considerable effort required to achieve high-quality unit yields across all targeted regions, we wish to adjust the set of recorded regions, improve behavioral task design, and implement better analyses in future studies. These improvements will enable us to collect data from a larger sample size and extract more precise insights into mesoscale dynamics during learning.

      (2) Movement analysis was not sufficient. Mice learning a go/no-go task establish a movement strategy that is developed throughout learning and is also biased towards Hit trials. There is an analysis of movement in Figure S4, but this is rather superficial. I was not even sure that the 3 mice in Figure S4 are the same 3 mice in the main figure. There should also be an analysis of movement as a function of time to see differences, also for Hits and FAs. I give some more details below. In general, most of the results can be explained by the fact that as mice gain expertise, they move more (also in CR trials during specific times), which leads to more activation in frontal cortex and more coordination with visual areas. More needs to be done in terms of analysis, or at least a mention of this in the text.

      Due to limitations in the experimental design and implementation, movement tracking was not performed during the electrophysiological recordings, and the 3 mice shown in Figure S4 were from a separate group. We have carefully examined the temporal profiles of mouse movements and found that they did not fully match the rank dynamics; we will add these results and related discussion in the revised manuscript. However, we acknowledge that without synchronized movement recordings in the main dataset, we cannot fully disentangle movement-related neural activity from task-related signals. We will make this limitation explicit in the revised manuscript and discuss it as a potential confound, along with possible approaches to address it in future work.

      (3) Most of the figures are over-detailed, and it is hard to understand the take-home message. Although the text is written succinctly and rather short, the figures are mostly overwhelming, especially Figures 4-7. For example, Figure 4 presents 24 brain plots! For input and output ranks during early and late stim and response periods, for early and expert and their difference. All in the same colormap. No significance shown at all. The Δrank maps for all cases look essentially identical across conditions. The division into early and late time periods is not properly justified. But the main take-home message is positive Δrank in OFC, V2M, V1 and negative Δrank in ThalMD and Str. In my opinion, one trio of maps is enough, and the rest could be bumped to the Supplementary section, if at all. In general, the figures in several cases do not convey the main take-home messages. See more details below.

      We thank the reviewer for this valuable critique. The statistical significance corresponding to the brain plots (Figure 4 and Figure 5) was presented in Figure S3 and S5, but we agree that the figure can be simplified to focus on the key results. In the revised manuscript, we will condense these figures to focus on the most important comparisons and relocate secondary plots to the Supplementary section. This will make the visual presentation more concise and the take-home message clearer.

      (4) The analysis is sometimes not intuitive enough. For example, the rank analysis of input and output rank seemed a bit overly complex. Figure 3 was hard to follow (although a lot of effort was made by the authors to make it clearer). Was there any difference between the output and input analyses? Also, the time periods sometimes seem redundant. Also, there are other network analyses that can be done which are a bit more intuitive. The use of ranks within the 10 areas was not the most intuitive. Even dimensionality reduction along with clustering can be used as an alternative. In my opinion, I don't think the authors should completely redo their analysis, but maybe mention the fact that other analyses exist.

      We appreciate the reviewer’s comment. In brief, the input- and output-rank analyses yielded largely similar patterns across regions in CR trials, although some differences were observed in certain areas (e.g., striatum in Hit trials) where the magnitude of rank change was not identical between input and output measures. We agree that the division into multiple time periods sometimes led to redundant results; we will combine overlapping results in the revision to improve clarity.

      We did explore dimensionality reduction applied to the ranking data. However, the results were not intuitive and required additional interpretation, which did not bring more insights. Still, we acknowledge that other analysis approaches might provide complementary insights. While we do not plan to completely reanalyze the dataset at this stage, we will include a discussion of these alternative methods and their potential advantages in the revised manuscript.

      Reviewer #3 (Public review):

      Weaknesses:

      The weakness is also related to the strength provided by the method. It is demonstrated in the original method paper that this approach can in principle track individual units for four months (Luan et al., 2017). The authors have not shown chronically tracked neurons across learning. Without demonstrating that and taking advantage of analyzing chronically tracked neurons, this approach is not different from acute recording across multiple days during learning. Many studies have achieved acute recording across learning using similar tasks. These studies have recorded units from a few brain areas or even across brain-wide areas.

      We appreciate the reviewer’s important point. We did attempt to track the same neurons across learning in this project. However, due to the limited number of electrodes implanted in each brain region, the number of chronically tracked neurons in each region was insufficient to support statistically robust analyses. Concentrating probes in fewer regions would allow us to obtain enough units tracked across learning in future studies to fully exploit the advantages of this method.

      Another weakness is that the major results are based on analyses of functional connectivity that is calculated using the cross-correlation score of spiking activity (TSPE algorithm). Functional connection strength across areas is then ranked 1-10 based on relative strength. Without ground-truth data, it is hard to judge the underlying caveats. I'd strongly advise the authors to use complementary methods to verify the functional connectivity and to evaluate the mesoscale changes in subnetworks. Perhaps the authors can use one key piece of anatomical information, i.e., that the cortex projects to the striatum, while the striatum does not directly affect the other brain structures recorded in this manuscript.

      We agree that the functional connectivity measured in this study relies on statistical correlations rather than direct anatomical connections. We plan to test the functional connection data with shorter cross-correlation delay criteria to see whether the results are consistent with anatomical connections and whether the original findings still hold.
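      As a generic illustration of the two ingredients at issue here, a delay-restricted cross-correlation strength and a 1-10 rank over regions, one could write the following. This is not the TSPE algorithm itself; all function names and conventions are our own assumptions.

```python
import numpy as np

def peak_xcorr_strength(a, b, max_delay_bins):
    """Strength of a putative a -> b connection: the maximum Pearson
    correlation between binned activity of `a` and `b` over positive
    delays 1..max_delay_bins (a leading b). Shortening max_delay_bins
    tightens the delay criterion."""
    return max(np.corrcoef(a[:-d], b[d:])[0, 1]
               for d in range(1, max_delay_bins + 1))

def rank_regions_by_output(strength):
    """Rank the 10 regions by total outgoing connection strength
    (1 = strongest), given a (10, 10) matrix with rows as sources."""
    out = strength.sum(axis=1) - np.diag(strength)  # drop self-connections
    ranks = np.empty(10, dtype=int)
    ranks[np.argsort(-out)] = np.arange(1, 11)
    return ranks
```

      An input rank is obtained the same way from column sums. A consistency check of the kind suggested by the reviewer would then ask, for example, whether inferred striatum-to-cortex edges vanish as the delay window is tightened.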

    1. Reviewer #1 (Public review):

      Summary:

      This study by Akhtar et al. aims to investigate the link between systemic metabolism and respiratory demands, and how sleep and the circadian clock regulate metabolic states and respiratory dynamics. The authors leverage genetic mutants that are defective in sleep and circadian behavior in combination with indirect respirometry and steady-state LC-MS-based metabolomics to address this question in the Drosophila model.

      First, the authors performed respirometry (on groups of 25 flies) to measure oxygen consumption (VO2) and carbon dioxide production (VCO2) to calculate the respiratory quotient (RQ) across the 24-hour day (12h:12h light-dark cycle) and assess metabolic fuel utilization. They observed that among all the genotypes tested, wild-type (WT) flies and per01 flies in LD and WT flies in DD exhibit RQ > 1. They concluded that an RQ > 1 is consistent with active lipogenesis. In contrast, the short-sleep mutants fumin (fmn) and sleepless (sss) showed significantly different RQs: fmn exhibits a slight reduction in RQ values, suggesting increased reliance on carbohydrate metabolism, while sss exhibits an even lower RQ (0.94), consistent with a shift toward lipid and protein catabolism.
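      For reference, the respiratory quotient used here is simply VCO2/VO2. The interpretation thresholds below follow standard indirect-calorimetry conventions; the numerical values are invented for illustration and are not taken from the study.

```python
def respiratory_quotient(vco2, vo2):
    """RQ = CO2 produced / O2 consumed (same units for both, e.g. uL/h)."""
    return vco2 / vo2

# Standard indirect-calorimetry interpretation:
#   RQ ~ 0.7 : lipid oxidation
#   RQ ~ 1.0 : carbohydrate oxidation
#   RQ > 1.0 : net lipogenesis (carbohydrate being converted to fat)
rq = respiratory_quotient(vco2=10.5, vo2=10.0)  # illustrative values only
assert rq > 1.0  # would be read as active lipogenesis
```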

      The authors then proceeded to bin these measurements in 12-hour partitions, ZT0-12 and ZT12-24, to assess diurnal differences in average values of VO2, VCO2, and RQ. They observed significant day-night differences in metabolic rates in WT-LD flies, with higher rates during the day. The diurnal differences remain in the short-sleep mutants, but the overall metabolic rates are higher. WT-DD flies exhibit the lowest respiratory activity, although the day-night differences remain in free-running conditions. Finally, per01 mutants exhibit no significant change in day-night respiratory rates, suggesting that a functional circadian clock is necessary for diurnal differences in metabolic rates.

      They then performed finer-resolution 24-hour rhythmicity analysis (RAIN and JTK) to determine whether VO2, VCO2, and RQ exhibit 24-hour rhythms and whether there are genotype-specific differences. Based on their criteria, VCO2 is rhythmic in all conditions tested, while VO2 is rhythmic in all conditions except fmn-LD. Finally, RQ is rhythmic in all 3 mutants but not in WT-LD or WT-DD. Peak phases for the rhythms were deduced using JTK lag values.

      The authors proceeded to leverage a previously published steady-state metabolite dataset to investigate the potential association of RQ with metabolite profiles. Spearman correlation was performed to identify metabolites that exhibit coupling to respiratory output. Positive- and negative-lag analyses were subsequently performed to further characterize these associations based on the timing of the metabolite peak changes relative to RQ fluctuations. The authors suggest that a positive lag indicates that metabolite changes occur after shifts in RQ, and a negative lag signifies that metabolite changes precede RQ changes. To visualize metabolic pathways that exhibit these temporal relationships, a clustered heatmap and enrichment analysis were performed. Through these analyses, they concluded that both sleep and circadian systems are essential for aligning metabolic substrate selection with energy demands, and different metabolic pathways are misregulated in the different mutants with sleep and circadian defects.
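
      The lag logic described here can be sketched like this (an illustrative reconstruction, not the authors' pipeline; all names and the toy data are hypothetical):

```python
# Illustrative sketch of a lag-shifted Spearman correlation (an assumption
# about the analysis, not the study's actual code). Spearman's rho is
# computed between RQ and circularly shifted copies of a metabolite time
# series; under this sign convention a positive best-fit lag means the
# metabolite changes follow RQ shifts, a negative lag that they precede.
import numpy as np
from scipy.stats import spearmanr

def lagged_spearman(rq_series, metabolite, max_lag):
    """Return {lag: rho}; positive lag = metabolite trails RQ."""
    rhos = {}
    for lag in range(-max_lag, max_lag + 1):
        rho, _ = spearmanr(rq_series, np.roll(metabolite, -lag))
        rhos[lag] = rho
    return rhos

# Toy 24 h cycle sampled at 12 timepoints; metabolite trails RQ by one step
t = np.arange(12)
rq_series = np.sin(2 * np.pi * t / 12)
metabolite = np.roll(rq_series, 1)   # delayed copy of the RQ curve
rhos = lagged_spearman(rq_series, metabolite, max_lag=3)
best_lag = max(rhos, key=rhos.get)   # +1: metabolite follows RQ
```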

      Strength:

      The research questions this study explores are significant, given that metabolism and respiratory demand are central to animal biology. The experimental methods used, including the well-characterized fly genetic mutants, the newly developed method for indirect calorimetry measurements, and LC-MS-based metabolomics, are all appropriate. This study provides insights into the impact of sleep and circadian rhythm disruption on metabolism and respiratory demand and serves as a foundation for future mechanistic investigations.

      Weaknesses:

      There are some conceptual flaws that the authors need to address regarding circadian biology, and some of the conclusions can be better supported by additional analysis to provide a stronger foundation for future functional investigation. At times, the methods, especially the statistical analysis, are not well articulated; they need to be better explained.

    2. Reviewer #2 (Public review):

      This is an innovative and technically strong study that integrates dual-gas respirometry with LC-MS metabolomics to examine how sleep and circadian disruption shape metabolism in Drosophila. The combination of continuous O₂/CO₂ measurements with high-temporal-resolution metabolite profiling is novel and provides fresh insight into how wild-type flies maintain anticipatory fuel alignment, while mutants shift to reactive or misaligned metabolism. The use of lag-shift correlation analysis is particularly clever, as it highlights temporal coordination rather than static associations. Together, the findings advance our understanding of how circadian clocks and sleep contribute to metabolic efficiency and redox balance.

      However, there are several areas where the manuscript could be strengthened. The authors should acknowledge that their findings may be gene-specific. Because sleep deprivation was not performed, it remains uncertain whether the observed metabolic shifts generalize to sleep loss broadly or are restricted to the fmn and sss mutants. This concern also connects to the finding of metabolic misalignment under constant darkness despite an intact clock. The conclusion that external entrainment is essential for maintaining energy homeostasis in flies may not translate to mammals. It would help to reference supporting data for the finding and discuss differences across species. Ideally, complementary circadian (light-dark cycle disruption) or sleep deprivation (for several hours) experiments, or citation of comparable studies, would strengthen the generality of the findings. Figures 1-4 are straightforward and clear, but when the manuscript transitions to the metabolite-respiration correlations, there is little description of the metabolomics methods or datasets, which should be clarified. The Discussion is at times repetitive and could be tightened, with the main message (i.e., wild-type flies align metabolism in advance, while mutants do not) kept front and center. Terms such as "anticipatory" and "reactive" should be defined early and used consistently throughout.

      Overall, this is a strong and novel contribution. With clarification of scope, refinement of presentation, and a more focused Discussion, the paper will make a significant impact.

    3. Reviewer #3 (Public review):

      Summary:

      The authors investigate how sleep loss and circadian disruption affect whole-organism metabolism in Drosophila melanogaster. They used chamber-based flow-through respirometry to measure oxygen consumption and carbon dioxide production in wild-type flies and in mutants with impaired sleep or circadian function. These measurements were then integrated with a previously published metabolomics dataset to explore how respiratory dynamics align with metabolic pathways. The central claim is that wild-type flies display anticipatory coordination of metabolic processes with circadian time, while mutants exhibit reactive shifts in substrate use, redox imbalance, and signs of mitochondrial stress.

      Strengths:

      The study has several strengths. Continuous high-resolution respirometry in flies is challenging, and its application across multiple genotypes provides good comparative insight. The conceptual framework distinguishing anticipatory from reactive metabolic regulation is interesting. The translational framing helps place the work in a broader context of sleep, circadian biology, and metabolic health.

      Weaknesses:

      At the same time, the evidence supporting the conclusions is somewhat limited. The metabolomics data were not newly generated but repurposed from prior work, reducing novelty. The biological replication in the respirometry assays is low, with only a small number of chambers per genotype. Importantly, respiratory parameters in flies are strongly influenced by locomotor activity, yet no direct measurements of activity were included, making it difficult to separate intrinsic metabolic changes from behavioral differences in mutants. In addition, repeated claims of "mitochondrial stress" are not directly substantiated by assays of mitochondrial function. The study also excluded female flies entirely, despite well-documented sex differences in metabolism, which narrows the generality of the findings.

    1. The feminist movement also grew in the 1960s. Women were active in both the civil rights movement and the labor movement, but their increasing awareness of gender inequality led women to begin forming a movement of their own.

      Women’s experiences in other social movements made them recognize their own unequal treatment. What specific events or issues pushed them to start the feminist movement?

    2. Diem’s government, however, lacked popular support and could not contain the communist insurgency seeking the reunification of Vietnam. The U.S. provided weapons and support, but South Vietnam failed to defeat Vietcong insurgents.

      Diem’s weak leadership and lack of public support made South Vietnam unstable. Despite U.S. military aid, the Vietcong’s determination and connection with locals helped them gain the upper hand, showing the limits of American influence in Vietnam.

    1. The impact of AI should also be considered at the more global level of managing organizations and non-medical staff. Areas affected include patient triage in the emergency room and the management and distribution of human resources across different services. This is where organizational ethics comes in, with human resources management and social dialogue figuring as major concerns. Indeed, in the health sector, the layers of the social fabric are particularly thick, diverse, and interwoven: changes in a healthcare institution affect many, if not all, of its workers, with major repercussions in the lives of users and patients too. The care of individuals who interact with medical assistants or diagnostic applications is also shifting. Thus, such “evolutions, introduced in a too radical and drastic way, damage the social fabric of a society” [120]. Moreover, these transformations also blur the boundary between work and private life and alter the link between the company and its employees, both old and new [140].

      AI affects everyone from patients to healthcare workers to society. When new evolutions are introduced too quickly, they can harm the social fabric. It reminds me of how the internet changed society after COVID hit. It became our main way to work and learn, but it also took away a lot of real human connection. Instead of hanging out or talking face-to-face, we started relying on screens and text messages for almost everything.

    2. Healthcare systems, professionals, and administrators will all be impacted by the implantation of AI systems. The first impact consists in the transformation of tasks. The integration of AI is transforming professional tasks, creating new forms of work [131], and forcing a readjustment of jobs (e.g., changing roles and tasks, modifying professional identities, evolving of professional accountability). For the WHO, readjusting to workplace disruption appears to be a necessary consequence of the ethical principle of “sustainability” identified by the committee of experts on the deployment of AI. In particular, governments and companies should consider “potential job losses due to the use of automated systems for routine healthcare functions and administrative tasks” [27]. Image recognition, for example, makes radiology one of the most advanced specialties in AI system integration [132]. AI is now able to “automate part of conventional radiology” [133], reducing the diagnostic tasks usually assigned to the radiologist. The authors of the French strategy report believe that this profession could then “evolve towards increased specialization in interventional radiology for diagnostic purposes (punctures, biopsies, etc.) for complex cases or therapeutic purposes guided by medical imaging” [133]. The practice of electrocardiograms in cardiology [133] or that of dentists in their routine and laborious tasks [134] is already undergoing upheaval. The field of general medicine is also being impacted by applications available to the public, such as “medical assistant” chatbots that can analyze users’ symptoms and direct them to a specialist or pharmacist. In the case of minor ailments, such technologies de facto diminish the role of the general practitioner.

      AI is disrupting healthcare and reshaping what careers in the field look like. The WHO says this adjustment is part of keeping healthcare “sustainable,” but notes that it could also lead to unemployment. Radiology is one of the most affected areas because it isn't as hands-on as most healthcare careers.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      (...) The study describes meticulously conducted and controlled experiments, showing the impressive biochemistry work consistently produced by this group. The statistical analysis and data presentation are appropriate, with the following major comments noted:

      Response: We thank the reviewer for their thoughtful and constructive review of our manuscript. We appreciate the positive comments on our experimentation.

      Major comments

      1. Please clarify why K8ac/K12ac, K5ac/K16ac, K5ac/K12ac are not quantified (Figure 3). If undetected, state explicitly and annotate figures with "n.d." rather than leaving gaps. If detected but excluded, justify the exclusion.

      Response: We restricted ourselves to mapping those diacetylated motifs that can be readily identified by MS2. The characteristic ions of the d3-labeled and endogenous acetylated peptides in the MS2 spectra could not differentiate the diacetylated forms mentioned by the reviewer. Rather than expanding the figure with non-informative rows, we amended the legend of Figure 3 accordingly: "Diacetylated forms K8-K12, K5-K16, K5-K12 could not be distinguished from each other by MS2 and were thus not included in the analysis".

      2. The statement "Nevertheless, combinations of di- and triacetylation were much more frequent if K12ac was included, suggesting that K12 is the primary target." is under-supported because only two non-K12ac combinations are shown, and only one is lower than K12ac-containing combinations. Either soften the claim ("trend toward ... in our dataset") or expand the analysis to all observed di/tri combinations with effect sizes, n, and statistical tests.

      Response: The reviewer is right; our statement does not properly reflect the data. It rather seems that combinations lacking K12ac are considerably less frequent (K5K8K16 tri-ac, K5K8 di-ac). We now modified the sentence as follows: "Peptides lacking K12ac were less frequent, suggesting that K12 is a primary target".

      3. Please provide a more detailed discussion about the known nature of NU9056 inhibition and how it fits or doesn't fit with your data. Are there any structural studies on this?

      Response: Unfortunately, NU9056 is very poorly described, neither the mode of interaction with Tip60 nor the mechanism of inhibition are known. The specificity of the chemical has not really been shown, but nevertheless it is used as a selective Tip60 inhibitor in several papers which is why we picked it in the first place. Our conclusions on the inhibitor are in the last paragraph of the discussion: "The fact that acetylation of individual lysines is inhibited with different kinetics argues against a mechanism involving competition with acetyl-CoA, but for an allosteric distortion of the catalytic center." We think that any further interpretation would likely be considered an overstatement.

      4. Why was the inhibitor experiment MS only performed for H2A.V and not H2A? Given the clear H2A vs H2A.V differences reported in Fig. 2, it would be useful to have the matched data for H2A.

      Response: In these costly mass spec experiments we strive to balance limited resources and most informative output. Because H2A.V and H4 are the major functional targets of Tip60, we considered that documenting the effect of the inhibitor on these substrates would be most appropriate. In hindsight, including H2A would have been nice to have, but would not change our conclusions about the inhibitor.

      5. The inhibitor observations are very interesting as they can highlight systems to study the loss of specific acetyl residues: can the authors perform WB/IF validation in treated cells? I understand it will not be possible with the H2A antibodies, but the difference in H4K5ac vs H4K12ac should be possible to validate in cells.

      Response: We attempted to monitor changes of histone modifications upon treatment of cells with NU9056 by immunoblotting. Probing H4K5 and K12, the results were variable. We also observed occasionally that acetylation of H4K5 and H4K12 was slightly diminished in whole cell extracts, but not in nuclear extracts. This reminded us that diacetylation of H4 at K5 and K12 is a feature of cytoplasmic H4 in complex with chaperones, a mark that is placed by HAT1 (Agudelo Garcia et al., DOI: 10.1021/acs.jproteome.9b00843; Varga et al., DOI: 10.1038/s41598-019-54497-0). The observed proliferation arrest by NU9056 may thus affect chromatin assembly and indirectly K5K12 acetylation. H4K12 is also acetylated by chameau (Chm).

      We observed a reduction of acetylated H4K16 and H2A.V. H4K16 is not a preferred target of Tip60, but Tip60 acetylates MSL1 and MBDR2, two subunits of the NSL1 complex (Apostolou et al. DOI: 10.1101/2025.07.15.664872). We, therefore, consider that effects on H4 acetylation upon NU9056 treatment may at least partially be affected indirectly. Because we are not confident about the data and because our manuscript emphasizes the direct, intrinsic specificity of Tip60, we refrain from showing the corresponding Western blots.

      6. You highlight that H2AK10 (a major TIP60 site here) is not conserved in human canonical H2A. Please expand the discussion of the potential function and physiological relevance. Maybe in relation to H2A.V being a fusion of different human variants?

      Response: The reviewer noted an interesting aspect of the evolution of the histone H2A variants. It turns out that H2A.Z is the more ancient variant, from which H2A derived by mutation. H2A.Z/H2A.V sequences are more conserved than H2A sequences. We summarized these evolutionary notions in Baldi and Becker (DOI: 10.1007/s00412-013-0409-x). In the context of the question, this means that mammalian H2A.Z, Drosophila H2A.V and mammalian H2A still contain the ancient sequence (lacking K10), and Drosophila H2A acquired K10 by mutation. The evolutionary advantage associated with this mutation is unclear. We now added a small paragraph summarizing these ideas on page 13 of the manuscript (changes tracked in red).

      7. To enable direct comparisons between variants and residues, please match y-axis scales where the biology invites comparison (e.g., H2A vs H2A.V; Figs. 2-3).

      Response: We adjusted the Y-axes in Figure 2 and 3 to facilitate direct comparisons, where such comparison is informative.

      Minor comments

      1. Add 1-2 sentences in the abstract on the gap in the field being addressed by the study.

      Response: We are grateful for this suggestion and have expanded the abstract accordingly (changes tracked in red).

      2. Either in the introduction or discussion, comment on your prior Tip60 three-subunit data (Kiss et al.). The three-subunit complex was significantly less active on H4, as indicated in that publication, which is likely due to the absence of Eaf6.

      Response: We thank the reviewer for the opportunity to emphasize this point. Motivated by findings in the yeast and mammalian systems that Eaf6 was important for acetylation, we added this subunit to our previously reconstituted 3-subunit 'piccolo' complex. As can be seen by the comparison of the older data (Kiss et al.) and the new data, the 4-subunit TIP60 core complex is a much more potent HAT. We amended the introduction (see marked text) accordingly. We also added a paragraph on what is known about the properties and function of Eaf6 to the discussion.

      3a. Text references Fig.1E before Fig.1C, please reorder

      Response: We deleted the premature mentioning of Figure 1E and added the following explanation to the relevant panels in Figure 1: "The blot was reprobed with an antibody detecting H3 as an internal standard for nucleosome input."

      3b. Fig.1B/C legend labels appear swapped.

      Response: We thank the reviewer for spotting the swap. We corrected the figure legend.

      3c. Fig.1E, 4A, 4B: add quantification

      Response: We quantified each acetylation level, and added to the relevant panels of Figures 1 and 4 the following phrase: "The quantified levels of each acetylation mark over H3 are shown below each plot." Notably, the difference in acetylation signal strength between the two antibodies highlights the inherent variability of antibody-based detection.

      3d. Fig.2A: Note explicitly that K5-K10 and K8-K10 are unresolvable pairs to explain the shading scheme used.

      Response: The legend of Figure 2A now includes the following sentence: "Peptides that are diacetylated at either K5/K10 or K8/K10 cannot be resolved by MS2. The last row indicates this by the patterning of the boxes and displays the combined values."

      4. Ensure consistent KAT5/TIP60 naming.

      Response: Our naming follows this logic: We use 'Tip60' for the Drosophila protein and 'TIP60' for the Drosophila 'piccolo' or 'core' complexes. The mammalian protein is referred to by the capital acronym TIP60, as is established in the literature. We use KAT5/TIP60 according to the unified nomenclature in the introduction and parts of the discussion, when we refer to the enzymes in more general terms, independent of species. We scrutinized the manuscript again and made a few changes to adhere to the above scheme.

      5. Consider moving the first two Discussion paragraphs (field context and challenges in antibody-based detection) into the Introduction to better frame the significance.

      Response: We thank the reviewer for this suggestion that improved the manuscript a lot. We incorporated the first two paragraphs of the discussion into the introduction.

      Significance

      This is a valuable and timely study for the histone acetylation field. The substrate specificity of many individual HATs remains incompletely understood owing to (i) cross-reactivity and limited selectivity of many anti-acetyl-lysine antibodies, (ii) functional redundancy among KATs, (iii) variability across in-vitro assays (HAT domain vs full-length/complex; free histones vs oligonucleosomes), and (iv) incomplete translation of in-vitro specificity to in-vivo settings. These factors have produced conflicting reports in the literature. By combining quantitative mass spectrometry with carefully engineered oligonucleosomal arrays, the authors make a principal step toward deconvoluting TIP60 biology in a controlled yet close-to-physiologically relevant system. Conceptually, the work delineates intrinsic, site-specific preferences of the TIP60 core on variant versus canonical nucleosomes, consistent with largely distributive behaviour and site-dependent inhibitor sensitivity. The inhibitor-dependent shifts in acetylation patterns are particularly intriguing and could enable dissection of residue-specific functions, with potential translational implications for preclinical cancer research and biomarker development. Overall, this manuscript will be of interest to the chromatin community, and I am supportive of publication pending satisfactory resolution of the points raised above.

      Response: Once more we thank the reviewer for their time and efforts devoted to help us improve the manuscript.


      Reviewer #2

      Major comments

      (...) A central limitation of the study, noted by the authors, is the uncertainty regarding the biological relevance of the findings. While the in vitro system provides a controlled framework for analyzing residue specificity and kinetics, it does not address the functional significance of these results in a cellular or organismal context. This limitation is outside the scope of the current work but indicates potential directions for follow-up studies. Within its defined objectives, the study presents a methodological framework and dataset that contribute to understanding TIP60 activity in a biochemical setting.

      Response: We agree with the referee.

      Minor comments

      While the manuscript is clearly presented overall, there are two minor issues that could be addressed:

      1. In Figure 1, the panels are not ordered according to their appearance in the Results section. In addition, the legends for Figures 1B and 1C appear to be swapped.

      Response: We thank the reviewer for spotting these oversights. We deleted the premature mentioning of Figure 1E and added the following explanation to the relevant panels in Figure 1: "The blot was reprobed with an antibody detecting H3 as an internal standard for nucleosome input." We also swapped the legends.

      2. For the quantitative MS data (N = 2 biological replicates), the phrasing "Error bars represent the two replicate values" could be refined. With N = 2, showing individual data points or the range may convey the information more transparently than conventional error bars, which are typically associated with statistical measures (e.g., SEM) from larger sample sizes. Alternatively, a brief note explaining the choice to use two replicates and represent them with error bars could be added.

      Response: We appreciate the reviewer's comment and have revised the figure to display individual data points for the two biological replicates instead of error bars, providing a clearer representation of the data distribution. We changed the phrasing 'Error bars represent...' to "Bars represent the mean of two biological replicates (each consisting of two TIP60 core complexes and two nucleosome arrays - each analyzed with two technical replicates), with individual replicate values shown as open circles." and hope that this describes the data better.

      Significance

      Krause and colleagues, using a clean in vitro system, define the substrate specificity of the Drosophila TIP60 core complex. They identify the main acetylation sites and their kinetic dynamics on H2A, H2A.V, and H4 tails, and further characterize the inhibitory activity of NU9056. This work addresses a longstanding question in the field and provides compelling evidence to support its conclusions. Future studies will be needed to establish the biological relevance of these findings.

      Response: We thank the reviewer for a thoughtful and constructive review of our manuscript. We appreciate the suggestions that helped to improve the manuscript.


      Reviewer #3

      (...) However, the authors should revisit some additional points:

      Major comments:

      1. The Tip60 core complex is usually described as containing three subunits: Tip60, Ing3 and E(Pc). The authors also included Eaf6 in their analysis, however, their motivation to include Eaf6 specifically remains unclear. They should explain in the manuscript why Eaf6 was included and how this could affect the observed acetylation pattern.

      Response: We thank the reviewer for the opportunity to emphasize this point. Motivated by findings in the yeast and mammalian systems that Eaf6 was important for acetylation, we added this subunit to our previously reconstituted 3-subunit piccolo complex. As can be seen by the comparison of the older data (ref Kiss) and the new data, the 4-subunit Tip60 core complex is a much more potent HAT. We amended the introduction accordingly. We also added a paragraph on what is known about the properties and function of Eaf6 to the discussion. Please see the amended text marked in red.

      2. The authors investigated the effectiveness of two Tip60 inhibitors by testing their effects on H4K12ac using an antibody. They state that "TH1834 had no detectable effect on either complex [Tip60 or Msl], even at very high concentrations." However, the initial publication describing TH1834 also stated that this inhibitor particularly affected H2AX with no direct effect on H4 acetylation. The authors should revisit TH1834 and specifically investigate its effect on H2A and, in particular, on H2Av, as H2Av is the corresponding ortholog of H2AX.

      Response: The case of TH1834 is not very strong in the literature, which is why we discontinued the line of experimentation when we did not see any effect of TH1834 (2 different batches) on the preferred substrate. The reviewer's suggestion is very good, but given our limited resources we decided to remove the data and discussion of TH1834 from the manuscript (old Figure 4A). The deletion of these very minor data does not diminish the overall conclusion and significance of the manuscript.

      3. The authors performed a detailed analysis of NU9056 effects. However, they did not include effects on H2A. H2A is distinct from H4 and H2Av as it is the only one containing K10, and this lysine also showed high levels of acetylation by Tip60. Therefore, a comprehensive analysis of NU9056 effects should include analyzing its effects on H2A acetylation.

      Response: In these costly mass spec experiments, we strive to balance limited resources and most informative output. Because H2A.V and H4 are the major functional targets of Tip60, we considered that documenting the effect of the inhibitor on these substrates would be most appropriate. In hindsight, including H2A would have been nice to have, but would not change our conclusions about the inhibitor.

      4. The authors have previously reported non-histone substrates of Tip60. It would be interesting to test whether the two investigated Tip60 inhibitors affect acetylation of non-histone substrates of Tip60. This analysis would greatly increase the understanding of how selective these inhibitors are. (OPTIONAL)

      Response: We agree with the reviewer that the proposed experiments may be an interesting extension of our current work. However, the Becker lab will be closed down by the end of this year due to retirement, precluding major follow-up studies at this point.

      Minor comments:

      1. Fig. 1 a: instead of "blue residues", would be more accurate to refer to "blue arrows"?

      Response: Yes of course - the text has been revised accordingly.

      2. Fig.1 b-c: it would be helpful to include which staining (silver/Ponceau?) was performed here.

      Response: The legends now contain the relevant information.

      3. Fig. 2a: I did not understand the shading for the K5/K8-K10ac panel from the figure legend. The explanation is present in the main text but would be helpful in the figure legend to allow easy access for readers.

      Response: We agree and revised the text accordingly.

      4. Fig. 4c: bar graphs on the top: the X-values are missing.

      Response: The figure has been revised accordingly.

      5. This sentence in the discussion seems to require revision: "Whereas the replication-dependent H2A resides in most nucleosomes in the genome, H2A.V, the only H2A variant histone in Drosophila, is incorporated by exchange of H2A, independent of replication."

      Response: We revised the sentence as follows to improve clarity. "While the replication-dependent H2A is present in most nucleosomes across the genome, H2A.V, the only H2A variant in Drosophila, is incorporated through replication-independent exchange of H2A."

      6. In this sentence: "A comparison with the TIP60 core complex is instructive since both enzymes are MYST acetyltransferases and bear significant similarity in their catalytic center." do the authors mean "informative" rather than "instructive"?

      Response: We replaced 'instructive' by 'informative'.

      Significance

      The findings are novel and expand our knowledge of Tip60 histone tail acetylation dynamics and specificity. The manuscript does not address the biological relevance of distinct acetylation marks, which is clearly beyond the scope of the study, but discusses their relevance where possible. The analysis of NU9056 is informative and relevant in a broad context. Optionally, the authors could expand their analysis of NU9056 to its effects on non-histone Tip60 targets to increase the impact further. Their analysis of TH1834, however, is currently insufficient, as they focused on H4 acetylation alone, which has already been reported not to be affected by TH1834. The authors should include an analysis of TH1834 effects on H2A and H2A.V acetylation. The manuscript is well written, easy to follow and of appropriate length. The methods are elegant and the findings of the study are novel. The manuscript targets researchers specifically interested in chromatin remodeling as well as a broader audience using the Tip60 inhibitor NU9056.

      Response: We thank the reviewer for their profound assessment and the general appreciation of our work. We agree that the analysis of the TH1834 is not satisfactory at this point and have removed the corresponding data and description from figure 4. The deletion of these very minor data does not diminish the overall conclusion and significance of the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      In their manuscript, Krause et al. investigate Tip60 selectivity in histone tail acetylation. They use elegant mass spectrometry analysis to analyze lysine acetylation marks, and combinations of such marks, on the histone tails of the Tip60 targets H2A, H2A.V and H4. They further consider distinct dynamics by performing a time course experiment and compare Tip60 to MOF. Using these methods, the authors describe interesting and previously undescribed selectivity, dynamics and di-acetylation patterns of Tip60 that will be the starting point of follow-up studies diving into the biological relevance of these findings. Lastly, they investigate the effects of two Tip60 inhibitors and characterize the effects of NU9056 on Tip60 histone tail acetylation in detail. These studies showed that NU9056 has selective effects, impacting some lysine acetylations with greater efficiency than others. As the antibodies available to investigate histone acetylations affected by NU9056 are not selective enough, these findings are relevant for anyone applying NU9056.

      However, the authors should revisit some additional points:

      Major comments:

      1. The Tip60 core complex is usually described as containing three subunits: Tip60, Ing3 and E(Pc). The authors also included Eaf6 in their analysis; however, their motivation to include Eaf6 specifically remains unclear. They should explain in the manuscript why Eaf6 was included and how this could affect the observed acetylation pattern.
      2. The authors investigated the effectiveness of two Tip60 inhibitors by testing their effects on H4K12ac using an antibody. They state that "TH1834 had no detectable effect on either complex [Tip60 or Msl], even at very high concentrations." However, the initial publication describing TH1834 also stated that this inhibitor particularly affected H2AX, with no direct effect on H4 acetylation. The authors should revisit TH1834 and specifically investigate its effect on H2A and, in particular, on H2Av, as H2Av is the corresponding ortholog of H2AX.
      3. The authors performed a detailed analysis of NU9056 effects. However, they did not include effects on H2A. H2A is distinct from H4 and H2Av as it is the only one containing K10, and this lysine also showed high levels of acetylation by Tip60. Therefore, a comprehensive analysis of NU9056 effects should include analyzing its effects on H2A acetylation.
      4. The authors have previously reported non-histone substrates of Tip60. It would be interesting to test whether the two investigated Tip60 inhibitors affect acetylation of non-histone substrates of Tip60. This analysis would greatly increase the understanding of how selective these inhibitors are. (OPTIONAL)

      Minor comments:

      1. Fig. 1 a): instead of "blue residues", would be more accurate to refer to "blue arrows"?
      2. Fig.1 b-c): it would be helpful to include which staining (silver/Ponceau?) was performed here
      3. Fig. 2a): I did not understand the shading for the K5/K8-K10ac panel from the figure legend. The explanation is present in the main text but would be helpful in the figure legend to allow easy access for readers.
      4. Fig. 4 c) bar graphs on the top: the X-values are missing.
      5. This sentence in the discussion seems to require revision: "Whereas the replication-dependent H2A resides in most nucleosomes in the genome, H2A.V, the only H2A variant histone in Drosophila, is incorporated by exchange of H2A, independent of replication."
      6. In this sentence: "A comparison with the TIP60 core complex is instructive since both enzymes are MYST acetyltransferases and bear significant similarity in their catalytic center." do the authors mean "informative" rather than "instructive"?

      Significance

      The findings are novel and expand our knowledge of Tip60 histone tail acetylation dynamics and specificity. The manuscript does not address the biological relevance of distinct acetylation marks, which is clearly beyond the scope of the study, but discusses their relevance where possible. The analysis of NU9056 is informative and relevant in a broad context. Optionally, the authors could expand their analysis of NU9056 to its effects on non-histone Tip60 targets to increase impact further. Their analysis of TH1834, however, is currently insufficient as they focused on H4 acetylation alone, which has already been reported not to be affected by TH1834. The authors should include an analysis of TH1834 effects on H2A and H2A.V acetylation.

      The manuscript is well written, easy to follow and of appropriate length. The methods are elegant and the findings of the study are novel. The manuscript targets researchers specifically interested in chromatin remodeling as well as a broader audience using the Tip60 inhibitor NU9056.

      My expertise: I am a researcher working with Drosophila melanogaster and have published on the functions of the Tip60-p400 complex. I do not have extensive expertise in nucleosome arrays, the major method applied in this manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, Krause and colleagues investigate the intrinsic substrate selectivity of the four-subunit TIP60 core module from Drosophila melanogaster using synthetic nucleosome arrays. To quantitatively assess acetylation at individual lysines on histones H2A, the variant H2A.V, and H4, the authors employ targeted mass spectrometry, thereby overcoming the limitations of antibody-based approaches. Contrary to earlier reports, their results reveal that the TIP60 core complex displays a selective lysine acetylation pattern, with distinct kinetics toward specific residues on each histone tail. For example, H2A lysines K5, K8, and K10 were acetylated, with K10 exhibiting the highest modification levels. On H2A.V, K4 and K7 were modified, with K7 showing greater initial efficiency. For H4, K12 was identified as the primary target, and its acetylation was further enhanced in the presence of H2A.V. The study also examined the activity of the KAT5 inhibitor NU9056, uncovering variable inhibition across different acetylation sites. Overall, the authors conclude that intrinsic substrate selectivity is central to understanding the mechanism of Tip60 activity and that the presence of H2A variants can modulate both the efficiency and specificity of acetylation.

      Major comments:

      The study by Krause et al. examines the in vitro substrate selectivity of the Drosophila TIP60 core complex and the lysine-specific effects of the inhibitor NU9056. The authors use a defined in vitro system with recombinant proteins and nucleosome arrays, together with targeted mass spectrometry, to assess intrinsic enzyme activity while avoiding potential issues of antibody specificity and avidity. Heatmaps and bar plots derived from the MS data show site-specific acetylation patterns and the effects of the inhibitor. A comparative analysis with the MSL core complex, which has a well-characterized selectivity, is used as a reference point for interpreting the specificity of TIP60. The observation that NU9056 exhibits different levels of effectiveness on individual lysines, including residues within the same histone tail, is supported by the quantitative MS measurements. A central limitation of the study, noted by the authors, is the uncertainty regarding the biological relevance of the findings. While the in vitro system provides a controlled framework for analyzing residue specificity and kinetics, it does not address the functional significance of these results in a cellular or organismal context. This limitation is outside the scope of the current work but indicates potential directions for follow-up studies. Within its defined objectives, the study presents a methodological framework and dataset that contribute to understanding TIP60 activity in a biochemical setting.

      Minor comments:

      While the manuscript is clearly presented overall, there are two minor issues that could be addressed:

      • In Figure 1, the panels are not ordered according to their appearance in the Results section. In addition, the legends for Figures 1B and 1C appear to be swapped.
      • For the quantitative MS data (N = 2 biological replicates), the phrasing "Error bars represent the two replicate values" could be refined. With N = 2, showing individual data points or the range may convey the information more transparently than conventional error bars, which are typically associated with statistical measures (e.g., SEM) from larger sample sizes. Alternatively, a brief note explaining the choice to use two replicates and represent them with error bars could be added.

      Significance

      Krause and colleagues, using a clean in vitro system, define the substrate specificity of the Drosophila TIP60 core complex. They identify the main acetylation sites and their kinetic dynamics on H2A, H2A.V, and H4 tails, and further characterize the inhibitory activity of NU9056. This work addresses a longstanding question in the field and provides compelling evidence to support its conclusions. Future studies will be needed to establish the biological relevance of these findings.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This study uses defined, reconstituted nucleosome arrays (H2A- or H2A.V-containing) and the four-subunit Drosophila TIP60 core complex to map intrinsic substrate selectivity across time courses and in the presence of reported TIP60 inhibitors (NU9056, TH1834). Key findings are: (i) selective H2A-tail acetylation (K10 > K8 > K5) with negligible K12/K14; (ii) preferential H2A.V K4 and K7 acetylation with distinct kinetics and low co-occurrence on a single tail; (iii) H4K12 is strongly favoured over other H4 sites; (iv) acetylation patterns are consistent with a more distributive (non-processive) mechanism relative to MOF/MSL; (v) NU9056 inhibits TIP60 activity with site-specific differences suggestive of a non-competitive/allosteric component, whereas TH1834 shows no effect in this Drosophila system.

      Major comments

      The study describes meticulously conducted and controlled experiments, showing the impressive biochemistry work consistently produced by this group. The statistical analysis and data presentation are appropriate, with the following major comments noted:

      1. Please clarify why K8ac/K12ac, K5ac/K16ac, K5ac/K12ac are not quantified (Figure 3). If undetected, state explicitly and annotate figures with "n.d." rather than leaving gaps. If detected but excluded, justify the exclusion.
      2. The statement "Nevertheless, combinations of di- and triacetylation were much more frequent if K12ac was included, suggesting that K12 is the primary target." is under-supported because only two non-K12ac combinations are shown, and only one is lower than K12ac-containing combinations. Either soften the claim ("trend toward ... in our dataset") or expand the analysis to all observed di/tri combinations with effect sizes, n, and statistical tests.
      3. Please provide a more detailed discussion about the known nature of NU9056 inhibition and how it fits or doesn't fit with your data. Are there any structural studies on this?
      4. Why was the MS analysis of the inhibitor experiment only performed for H2A.V and not H2A? Given the clear H2A vs H2A.V differences reported in Figure 2, it would be useful to have the matched data for H2A.
      5. The inhibitor observations are very interesting as they can highlight systems to study the loss of specific acetyl residues: can the authors perform WB/IF validation in treated cells? I understand it will not be possible with the H2A antibodies, but the difference in H4K5ac vs H4K12ac should be possible to validate in cells.
      6. You highlight that H2A K10 (a major TIP60 site here) is not conserved in human canonical H2A. Please expand the discussion of the potential function and physiological relevance. Maybe in relation to H2A.V being a fusion of different human variants?
      7. To enable direct comparisons between variants and residues, please match y-axis scales where the biology invites comparison (e.g., H2A vs H2A.V; Figs. 2-3).

      Minor comments

      1. Add 1-2 sentences in the abstract on the gap in the field being addressed by the study.
      2. Either in the introduction or discussion, comment on your prior Tip60 three-subunit data (Kiss et al.). The three-subunit complex was significantly less active on H4, as indicated in that publication, which is likely due to the absence of Eaf6.
      3. Figure order/legends:

      a. Text references Fig.1E before Fig.1C, please reorder

      b. Fig.1B/C legend labels appear swapped.

      c. Fig.1E, 4A, 4B: add quantification

      d. Fig.2A: Note explicitly that K5-K10 and K8-K10 are unresolvable pairs to explain the shading scheme used

      4. Ensure consistent KAT5/TIP60 naming.
      5. Consider moving the first two Discussion paragraphs (field context and challenges in antibody-based detection) into the Introduction to better frame the significance.

      Significance

      This is a valuable and timely study for the histone acetylation field. The substrate specificity of many individual HATs remains incompletely understood owing to (i) cross-reactivity and limited selectivity of many anti-acetyl-lysine antibodies, (ii) functional redundancy among KATs, (iii) variability across in-vitro assays (HAT domain vs full-length/complex; free histones vs oligonucleosomes), and (iv) incomplete translation of in-vitro specificity to in-vivo settings. These factors have produced conflicting reports in the literature. By combining quantitative mass spectrometry with carefully engineered oligonucleosomal arrays, the authors take a substantial step toward deconvoluting TIP60 biology in a controlled yet close-to-physiologically relevant system. Conceptually, the work delineates intrinsic, site-specific preferences of the TIP60 core on variant versus canonical nucleosomes, consistent with largely distributive behaviour and site-dependent inhibitor sensitivity. The inhibitor-dependent shifts in acetylation patterns are particularly intriguing and could enable dissection of residue-specific functions, with potential translational implications for preclinical cancer research and biomarker development. Overall, this manuscript will be of interest to the chromatin community, and I am supportive of publication pending satisfactory resolution of the points raised above.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream of TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution to the field of scientists studying the molecular basis of muscle diseases.

      We thank the Reviewer for appreciating our work and for their valuable suggestions to improve our manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental efforts, will be addressed as detailed in the Revision Plan.

      In my opinion, a few aspects would improve the manuscript. Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      If possible, studies in which different mutants with specific altered functional domains (the NHL domain, known to bind RNAs, and the RING domain, reportedly involved in protein ubiquitination) are overexpressed would test whether they are capable or incapable of rescuing the reported alterations of Trim32 KO cell lines in c-Myc expression and muscle maturation.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 might be replicated with primary myoblasts or satellite cells devoid of Trim32.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      I also have a few minor points to highlight:

      - It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.

      Authors’ response. We thank the Reviewer for raising this point. We have now indicated the statistical analyses performed on the data presented in the mentioned figures (in line also with a point raised by Reviewer #3). Consistent with the conclusion that Trim32 is necessary for proper regulation of c-Myc transcript stability, 2-way ANOVA of the data now reported as Figure 5G shows a statistically significant effect of the genotype at 6h (right-hand graph) but not at D0 (left-hand graph). In the graphs of Fig. EV5 D and E, no significant changes are observed at D0, whereas at 6h the data show a significant difference at the 40 min time point. We included this information in the graphs and in the corresponding legends.

      - On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Authors’ response. As suggested, we included the graph showing the differentiation index upon c-Myc silencing in the Trim32 KO clones and in the WT clones, as a novel panel in Figure 6 (Fig. 6D). As already reported in the text, a partial recovery of differentiation index is observed but the increase is not statistically significant. In contrast, no changes are observed applying the same silencing in the WT cells. Legend and text were modified accordingly.

      Reviewer #1 (Significance (Required)):

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8.

      At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular partners (such as RNA-binding activity, microRNA involvement, or ligase function) remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking. Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8.

      This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase often associated with its dysregulation in Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the form of transcriptomic profiling that displays an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in-vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32 regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with the critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      We thank the Reviewer for valuing our work and for their appreciated suggestions to improve our manuscript. We have carefully addressed some of the concerns raised as detailed here, while others, which require more laborious experimental efforts, will be addressed as reported in the Revision Plan.

      Major Comments:

      The work is a bit incremental based on this:

      https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445

      And this:

      https://www.nature.com/articles/s41418-018-0129-0

      To their credit, the authors do cite the above papers.

      Authors’ response. We thank the Reviewer for this careful evaluation of our work against the current literature and for recognising the contribution of our findings to understanding the complex picture of myogenesis, in which the involvement of Trim32 and c-Myc, and of the Trim32-c-Myc axis, can occur at several stages and likely in narrow time windows along the process, thus possibly explaining some inconsistencies among reports.

      The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and that sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or the differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent overgeneralization and better align the conclusions with the authors' data.

      Authors’ response. We agree with the Reviewer, and we modified our phrasing that implied the Trim32-c-Myc axis as the exclusive mechanism by explicitly indicating that other pathways contribute to guarantee proper myogenesis, in the Abstract and in the Discussion.

      The Abstract now reads: … suggesting that the Trim32–c-Myc axis may represent an essential hub, although likely not the exclusive molecular mechanism, in muscle regeneration within LGMDR8 pathogenesis.”

      The Discussion now reads: “Functionally, we demonstrated that c-Myc contributes to the impaired myogenesis observed in Trim32 KO clones, although this is clearly not the only factor involved in the Trim32-mediated myogenic network; realistically other molecular mechanisms can participate in this process as also suggested by our transcriptomic results.”

      The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3 analysis, flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail to withdraw normally from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

      Authors’ response. We thank the Reviewer for appreciating our thorough analyses on cell cycle dynamics in proliferation conditions and at the onset of the differentiation process.

      The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across their multiple analyses like MDS, clustering, heatmaps, and bubble plots without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section, so the primary biological insights are highlighted and supported by each of their analyses. This would allow the biological implications to be more accessible to a broader readership.

      Authors’ response. We thank the Reviewer for raising this point and apologise for being too brief in describing the data, leaving indeed some points excessively implicit. As suggested, we have now reorganised this section and added the lists of enriched canonical pathways relative to the WT vs KO comparisons at D0 and D3 (Fig. EV3B), as well as those relative to the comparison between D0 and D3 for both WT and Trim32 KO samples (Fig. EV3C), with their relative scores. We changed the Results section “Transcriptomic analysis of Trim32 WT and Trim32 KO clones along early differentiation” as reported here below and modified the legends accordingly.

      The paragraph now reads: Based on our initial observations, the absence of Trim32 already exerts a significant impact by day 3 (D3) of C2C12 myogenic differentiation. To investigate how Trim32 influences early global transcriptional changes during the proliferative phase (D0) and early differentiation (D3), we performed an unbiased transcriptomic profiling of WT and Trim32 KO clones (Fig. 2A). Multidimensional Scaling (MDS) analysis revealed clear segregation of gene expression profiles based on both time of differentiation (Dim1, 44% variance) and Trim32 genotype (Dim2, 16% variance) (Fig. 2A). Likewise, hierarchical clustering grouped WT and Trim32 KO clones into distinct clusters at both timepoints, indicating consistent genotype-specific transcriptional differences (Fig. EV3A). Differentially Expressed Genes (DEGs) were detected in the Trim32 KO transcriptome relative to WT, at both D0 and D3. In proliferating conditions, 72 genes were upregulated and 189 were downregulated, whereas at D3 of differentiation, 72 genes were upregulated and 212 were downregulated. Ingenuity Pathway Analysis of the DEGs revealed the top 10 Canonical Pathways enriched at either D0 or D3 (Fig. EV3B). Several of these pathways can underscore relevant Trim32-mediated functions, though most of them represent generic functions not immediately attributable to the observed myogenesis defects.

      Notably, the transcriptional divergence between WT and Trim32 KO cells is more pronounced at D3, as evidenced by a greater separation along the MDS Dim2 axis, suggesting that Trim32-dependent transcriptional regulation intensifies during early differentiation (Fig. 2A). Given our interest in the differentiation process, we therefore focused our analyses on comparing the changes occurring from D0 to D3 in WT (WT D3 vs. D0) and in Trim32 KO (KO D3 vs. D0) RNAseq data.

      Pathway enrichment analysis of D3 vs. D0 DEGs allowed the selection of the top-scored pathways for both WT and Trim32 KO data. We obtained 18 top-scored pathways enriched in each genotype (-log(p-value) ≥ 9 cut-off): 14 are shared, while 4 are top-ranked only in WT and 4 only in Trim32 KO (Fig. EV3C). For the following analyses, we thus employed a total of 22 distinct pathways; to better mine those relevant in the passage from the proliferation stage to the early differentiation one, and affected by the lack of Trim32, we built a bubble plot comparing side-by-side the scores and enrichment of the 22 selected top-scored pathways in WT and Trim32 KO (Fig. 2B). A heatmap of DEGs included within these selected pathways confirms the clustering of the samples considering both the genotypes and the timepoints, highlighting gene expression differences (Fig. 2C). These pathways are mainly related to muscle development, cell cycle regulation, genome stability maintenance and a few other metabolic cascades.

      As expected given the results related to Figure 1, moving from D0 to D3 WT clones showed robust upregulation of key transcripts associated with the Inactive Sarcomere Protein Complex, a category encompassing most genes in the “Striated Muscle Contraction” pathway, while in Trim32 KO clones this pathway was not among those enriched in the transition from D0 to D3 (Fig. EV3C). Detailed analyses of transcripts enclosed within this pathway revealed that on the transition from proliferation to differentiation, WT clones show upregulation of several Myosin Heavy Chain isoforms (e.g., MYH3, MYH6, MYH8), α-Actin 1 (ACTA1), α-Actinin 2 (ACTN2), Desmin (DES), Tropomodulin 1 (TMOD1), and Titin (TTN), a pattern consistent with previous reports, while these same transcripts were either non-detected or only modestly upregulated in Trim32 KO clones at D3 (Fig. 2D). This genotype-specific disparity was further confirmed by gene set enrichment barcode plots, which demonstrated significant enrichment of these muscle-related transcripts in WT cells (FDR_UP = 0.0062), but not in Trim32 KO cells (FDR_UP = 0.24) (Fig. EV3D). These findings support an early transcriptional basis for the impaired myogenesis previously observed in Trim32 KO cells.

      In addition to differences in muscle-specific gene expression, we observed that several pathways related to cell proliferation and cell cycle regulation were also more enriched in Trim32 KO cells compared to WT. This suggests that altered cell proliferation may contribute to the distinct differentiation behavior observed in Trim32 KO versus WT (Fig. 2B). Given that cell cycle exit is a critical prerequisite for the onset of myogenic differentiation, and considering that previous studies on Trim32's role in cell cycle regulation have reported inconsistent findings, we further examined cell cycle dynamics under our experimental conditions to clarify Trim32's contribution to this process.

      The work would be greatly strengthened by the conclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

Authors’ response. We thank the Reviewer for bringing these two publications to our attention; they indeed add important data that help recapitulate the in vivo complexity of c-Myc's role in myogenesis. We have included this point in our Discussion.

The Discussion now reads: “On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025). Other reports, instead, demonstrated that periodic pulses of c-Myc, mimicking resistance exercise, contribute to muscle growth, a role that cannot, however, be observed in our experimental model (Edman et al., 2024; Jones et al., 2025).”

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Authors’ response. As suggested, we modified the z-score-representing colors using a more distinct gradient especially in the positive to negative transition in Figure 2B.
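To make the colormap point concrete, here is a minimal sketch of a diverging blue-white-red mapping that passes through pure white exactly at z = 0, which is what gives a crisp positive-to-negative transition. The function name `zscore_to_rgb` and the saturation point `zmax` are illustrative choices, not the actual palette used in Figure 2B:

```python
def zscore_to_rgb(z, zmax=3.0):
    """Map a z-score to an RGB tuple on a blue-white-red diverging scale.

    Negative z-scores shade toward blue, positive toward red, and z = 0
    maps to pure white, so the sign change is visually unambiguous.
    `zmax` is the saturation point; values beyond it are clipped.
    """
    t = max(-1.0, min(1.0, z / zmax))  # clip to [-1, 1]
    if t < 0:
        f = 1.0 + t  # 0 at -zmax (full blue), 1 at z = 0 (white)
        return (int(255 * f), int(255 * f), 255)
    f = 1.0 - t      # 1 at z = 0 (white), 0 at +zmax (full red)
    return (255, int(255 * f), int(255 * f))
```

Any diverging palette with this anchored-white property (e.g., the common RdBu family) would serve the same purpose.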

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

      Authors’ response. As now better explained (see comment regarding Major point: Transcriptomics), we used a cut-off of -log(p-value) above or equal to 9 for pathways enriched in DEGs of the D0 vs D3 comparison for both WT and Trim32 KO. The threshold is now included in the Results section and the pathways (shared between WT and Trim32 KO and unique) are listed as Fig. EV3C.
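To illustrate how such a significance cut-off operates, a brief sketch follows. The pathway names and p-values are hypothetical and for illustration only, and base-10 logarithms are assumed, as is conventional for enrichment scores:

```python
import math

def filter_pathways(pathway_pvalues, threshold=9.0):
    """Keep pathways whose -log10(p-value) is at least `threshold`.

    `pathway_pvalues` maps pathway name -> enrichment p-value.
    Returns the surviving names, most significant first.
    """
    scores = {name: -math.log10(p) for name, p in pathway_pvalues.items()}
    kept = {name: s for name, s in scores.items() if s >= threshold}
    return sorted(kept, key=kept.get, reverse=True)

# Hypothetical values: only the first two clear the -log10(p) >= 9 cut-off
example = {
    "Striated Muscle Contraction": 1e-12,   # -log10(p) = 12
    "Cell Cycle": 1e-10,                    # -log10(p) = 10
    "Generic Transcription Pathway": 1e-4,  # -log10(p) = 4
}
```

The same filter applied to both genotypes' D0 vs D3 enrichment tables yields the shared and genotype-unique pathway lists.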

The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

Authors’ response. We thank the Reviewer for this remark and apologise for having overlooked it. We amended this throughout the manuscript by consistently using “Trim32 KO clones/cells” for clarity.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

Authors’ response. We agree with the Reviewer that extended passaging can reduce the differentiation potential of the C2C12 myoblast cell line; this is indeed the main reason why we employed WT clones, which underwent the same editing process as those that acquired mutations in the Trim32 gene, as reference controls throughout our study. We apologise for not indicating the passage numbers in the first version of the manuscript; this is now amended in the Methods section as follows:

      The C2C12 parental cells used in this study were maintained within passages 3–8. All clonal cell lines (see below) were utilized within 10 passages following gene editing. In all experiments, WT and Trim32 KO clones of comparable passage numbers were used to ensure consistency and minimize passage-related variability.

      Reviewer #2 (Significance (Required)):

      General Assessment:

This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach, as the authors incorporated transcriptomics, cell cycle profiling, and stability assays which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones which helps address potential clonal variation issues that could occur. The largest limitation to this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

Advance:

To my knowledge, this is the first study to demonstrate the mRNA stability level of c-Myc regulation by Trim32, rather than through the ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete understanding of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability, which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

Audience:

This research will be of interest to an audience that focuses on broad skeletal muscle biology but primarily to readers with more focused research such as myogenesis and neuromuscular disease (LGMDR8 in particular), where the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside of skeletal muscle that study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

Expertise:

My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

• In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

We thank the Reviewer for the in-depth assessment of our work and valuable suggestions to improve the manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require additional experimental effort, will be addressed as detailed in the Revision Plan.

      - TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.

Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.

Authors’ response. This point will be addressed as detailed in the Revision Plan.

- Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR.

      Authors’ response. We thank the reviewer for this suggestion. This point will be addressed as detailed in the Revision Plan. We have selected several transcripts that will be evaluated in independent samples in order to validate the RNAseq results.

      - The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.

Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.

      Authors’ response. We thank the Reviewer for raising this point.

Regarding the replicates, we clarified in the Methods and Legends that the Trim32 KO experiments were performed on 3 biological replicates (independent clones), and the same applies to the reference controls (3 independent WT clones), except for the Fig. 6 experiments, which were performed on 2 Trim32 KO and 2 WT clones. All Western blot, immunofluorescence, and qPCR data are representative of at least 3 independent experiments unless otherwise stated. We now report the number and type of replicates as well as the microscope fields analyzed.

We repeated the statistical analyses of the data in Figure 5G, EV5D, and EV5E, employing, as suggested, the more appropriate two-way ANOVA test, and we now report this information in the graphs and legends.

We thank the Reviewer for raising this point; we agree and have substituted the graphs in Fig. EV5B and 6B, now showing the control values normalised as suggested. The statistical analyses reflect this change.
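The statistical point can be made concrete with a small sketch (the function name and the numbers are illustrative only, not data from the manuscript): dividing every replicate, controls included, by the mean of the controls preserves the controls' between-replicate variability, whereas setting each control to exactly 1 forces their variance to zero and inflates the apparent significance of any difference.

```python
from statistics import mean, pstdev

def normalize_to_control_mean(control, treated):
    """Rescale both groups by the mean of the control replicates.

    The controls keep their spread around 1 instead of collapsing to
    exactly 1, so a downstream two-sample test sees the true control
    variability rather than an artificial zero-variance reference.
    """
    m = mean(control)
    return [c / m for c in control], [t / m for t in treated]

# Illustrative fold-change values only
ctrl_raw = [0.8, 1.0, 1.2]
ko_raw = [1.6, 2.0, 2.4]
ctrl_norm, ko_norm = normalize_to_control_mean(ctrl_raw, ko_raw)
```

With this scheme the normalised controls average 1 but retain a non-zero standard deviation, which is what a two-sample comparison needs to estimate significance honestly.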

- Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."

      Authors’ response. We re-edited this revised version of the manuscript as suggested.

- Results in Figure 5A should be quantified.

Authors’ response. We amended this point by quantifying the results shown in Fig. 5A and adding a graph of the quantification of 3 experimental replicates to the figure. The quantification confirms that no statistically significant difference is observed. The figure and its legend are modified accordingly.

- Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D.

Authors’ response. We agree with the Reviewer that the presence of p84 in the cytoplasmic fraction is not ideal. Regrettably, we observed this faint p84 band in all the experiments performed. We note, however, that this does not affect the result, which clearly shows that c-Myc and Trim32 are never detected in the same compartment.

- In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.

Authors’ response. We agree with the Reviewer, and we now show the results of the 3 technical replicates for each of the 2 biological replicates without indicating any statistics (Fig. 6B). The graph was also modified in accordance with a previous point raised.

- The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?

Authors’ response. We agree with the Reviewer that Trim32 might also be necessary for myoblast fusion. This point is, however, beyond the scope of the present study and will be addressed in future work.

      - The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

Authors’ response. This point will be addressed as detailed in the Revision Plan.

      Reviewer #3 (Significance (Required)):

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.




      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase often associated with its dysregulation in Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the way of transcriptomic profiling that displays an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in-vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32-regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      Major Comments:

      The work is a bit incremental based on this: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445 And this: https://www.nature.com/articles/s41418-018-0129-0 To their credit, the authors do cite the above papers.

The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and that sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent an overgeneralization and better align the conclusions with the authors' data. The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3 staining, flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail in their normal withdrawal from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across their multiple analyses like MDS, clustering, heatmaps, and bubble plots without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section, so the primary biological insights are highlighted and supported by each of their analyses. This would allow the biological implications to be more accessible to a broader readership.

      The work would be greatly strengthened by the conclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis. Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

      Significance

      General Assessment:

This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach as the authors incorporated transcriptomics, cell cycle profiling, and stability assays which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones which helps address potential clonal variation issues that could occur. The largest limitation to this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

      Advance:

      To my knowledge, this is the first study to demonstrate the mRNA stability level of c-Myc regulation by Trim32, rather than through the ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete understanding of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

      Audience:

      This research will be of interest to a broad skeletal muscle biology audience, but primarily to readers working on myogenesis and neuromuscular disease (LGMDR8 in particular), for whom the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside skeletal muscle who study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

      Expertise:

      My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream to TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution to the field of scientists studying the molecular basis of muscle diseases. In my opinion, a few aspects would improve the manuscript.

      Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.
      2. If possible, studies in which the overexpression of different mutants presenting specific altered functional domains (NHL domain known to bind RNAs and Ring domain reportedly involved in protein ubiquitination) would be used to test if they are capable or incapable of rescuing the reported alteration of Trim32 KO cell lines in c-Myc expression and muscle maturation. An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 might be replicated with primary myoblasts or satellite cells devoid of Trim32.

      I also have a few minor points to highlight:

      • It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.
      • On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Significance

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8. At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular mechanisms, such as RNA-binding activity, microRNA involvement, or ligase function, remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking.

      Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8. This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

    1. Reviewer #2 (Public review):

      Summary:

      This article presents Morphonet 2.0, a software designed to visualise and curate segmentations of 3D and 3D+t data. The authors demonstrate its capabilities on five published datasets, showcasing how even small segmentation errors can be automatically detected, easily assessed and corrected by the user. This allows for more reliable ground truths which will in turn be very much valuable for analysis and training deep learning models. Morphonet 2.0 offers intuitive 3D inspection and functionalities accessible to a non-coding audience, thereby broadening its impact.

      Strengths:

      The work proposed in this article is expected to be of great interest for the community, by enabling easy visualisation and correction of complex 3D(+t) datasets. Moreover, the article is clear and well written making MorphoNet more likely to be used. The goals are clearly defined, addressing an undeniable need in the bioimage analysis community. The authors use a diverse range of datasets, successfully demonstrating the versatility of the software.

      We would also like to highlight the great effort that was made to clearly explain which type of computer configurations are necessary to run the different dataset and how to find the appropriate documentation according to your needs. The authors clearly carefully thought about these two important problems and came up with very satisfactory solutions.

      Weaknesses:

      Sometimes, it can be a bit difficult to assess the strength of the improvements made by the proposed methods, but this is not something the authors could easily address, given the great complexity of the samples.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      The authors present a substantial improvement to their existing tool, MorphoNet, intended to facilitate assessment of 3D+t cell segmentation and tracking results, and curation of high-quality analysis for scientific discovery and data sharing. These tools are provided through a user-friendly GUI, making them accessible to biologists who are not experienced coders. Further, the authors have re-developed this tool to be a locally installed piece of software instead of a web interface, making the analysis and rendering of large 3D+t datasets more computationally efficient. The authors evidence the value of this tool with a series of use cases, in which they apply different features of the software to existing datasets and show the improvement to the segmentation and tracking achieved. 

      While the computational tools packaged in this software are familiar to readers (e.g., cellpose), the novel contribution of this work is the focus on error correction. The MorphoNet 2.0 software helps users identify where their candidate segmentation and/or tracking may be incorrect. The authors then provide existing tools in a single user-friendly package, lowering the threshold of skill required for users to get maximal value from these existing tools. To help users apply these tools effectively, the authors introduce a number of unsupervised quality metrics that can be applied to a segmentation candidate to identify masks and regions where the segmentation results are noticeably different from the majority of the image. 

      This work is valuable to researchers who are working with cell microscopy data that requires high-quality segmentation and tracking, particularly if their data are 3D time-lapse and thus challenging to segment and assess. The MorphoNet 2.0 tool that the authors present is intended to make the iterative process of segmentation, quality assessment, and re-processing easier and more streamlined, combining commonly used tools into a single user interface.   

      We sincerely thank the reviewer for their thorough and encouraging evaluation of our work. We are grateful that they highlighted both the technical improvements of MorphoNet 2.0 and its potential impact for the broader community working with complex 3D+t microscopy datasets. We particularly appreciate the recognition of our efforts to make advanced segmentation and tracking tools accessible to non-expert users through a user-friendly and locally installable interface, and for pointing out the importance of error detection and correction in the iterative analysis workflow. The reviewer’s appreciation of the value of integrating unsupervised quality metrics to support this process is especially meaningful to us, as this was a central motivation behind the development of MorphoNet 2.0. We hope the tool will indeed facilitate more rigorous and reproducible analyses, and we are encouraged by the reviewer’s positive assessment of its utility for the community.

      One of the key contributions of the work is the unsupervised metrics that MorphoNet 2.0 offers for segmentation quality assessment. These metrics are used in the use cases to identify low-quality instances of segmentation in the provided datasets, so that they can be improved with plugins directly in MorphoNet 2.0. However, not enough consideration is given to demonstrating that optimizing these metrics leads to an improvement in segmentation quality. For example, in Use Case 1, the authors report their metrics of interest (Intensity offset, Intensity border variation, and Nuclei volume) for the uncurated silver truth, the partially curated and fully curated datasets, but this does not evidence an improvement in the results. Additional plotting of the distribution of these metrics on the Gold Truth data could help confirm that the distribution of these metrics now better matches the expected distribution. 

      Similarly, in Use Case 2, visual inspection leads us to believe that the segmentation generated by the Cellpose + Deli pipeline (shown in Figure 4d) is an improvement, but a direct comparison of agreement between segmented masks and masks in the published data (where the segmentations overlap) would further evidence this. 

      We agree that demonstrating the correlation between metric optimization and real segmentation improvement is essential. We have added new analysis comparing the distributions of the unsupervised metrics with the gold truth data before and after curation. Additionally, we provided overlap scores where ground truth annotations are available, confirming the improvement. We also explicitly discussed the limitation of relying solely on unsupervised metrics without complementary validation.
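      For concreteness, one common form of such an overlap score is a per-object intersection-over-union (IoU) between a candidate label image and a reference. The sketch below is only illustrative: the function name and toy arrays are hypothetical, and this is not necessarily the exact metric computed in the manuscript.

```python
import numpy as np

def iou_per_object(pred, truth):
    """Per-object IoU between each reference object and its
    best-matching predicted label (0 is treated as background)."""
    scores = {}
    for label in np.unique(truth):
        if label == 0:
            continue
        t = truth == label
        # predicted labels overlapping this reference object
        overlap = pred[t]
        overlap = overlap[overlap > 0]
        if overlap.size == 0:
            scores[label] = 0.0
            continue
        best = np.bincount(overlap).argmax()  # most frequent match
        p = pred == best
        scores[label] = (t & p).sum() / (t | p).sum()
    return scores

# Toy 1D "image": two reference objects, predictions slightly shifted.
truth = np.array([0, 1, 1, 1, 0, 2, 2, 0])
pred  = np.array([0, 1, 1, 0, 0, 2, 2, 2])
print(iou_per_object(pred, truth))  # object 1: 2/3, object 2: 2/3
```

Scores near 1 indicate close agreement with the reference; comparing the score distribution before and after curation gives an external check that is independent of the unsupervised metrics used to guide the correction.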

      We would appreciate the authors addressing the risk of decreasing the quality of the segmentations by applying circular logic with their tool; MorphoNet 2.0 uses unsupervised metrics to identify masks that do not fit the typical distribution. A model such as StarDist can be trained on the "good" masks to generate more masks that match the most common type. This leads to a more homogeneous segmentation quality, without consideration for whether these metrics actually optimize the segmentation.

      We thank the reviewer for this important and insightful comment. It raises a crucial point regarding the risk of circular logic in our segmentation pipeline. Indeed, relying on unsupervised metrics to select “good” masks and using them to train a model like StarDist could lead to reinforcing a particular distribution of shapes or sizes, potentially filtering out biologically relevant variability. This homogenization may improve consistency with the chosen metrics, but not necessarily with the true underlying structures.

      We fully agree that this is a key limitation to be aware of. We have revised the manuscript to explicitly discuss this risk, emphasizing that while our approach may help improve segmentation quality according to specific criteria, it should be complemented with biological validation and, when possible, expert input to ensure that important but rare phenotypes are not excluded.

      In Use case 5, the authors include details that the errors were corrected by "264 MorphoNet plugin actions ... in 8 hours actions [sic]". The work would benefit from explaining whether this is 8 hours of human work, trying plugins and iteratively improving, or 8 hours of compute time to apply the selected plugins. 

      We clarified that the “8 hours” refer to human interaction time, including exploration, testing, and iterative correction using plugins. 

      Reviewer #2 (Public review):

      Summary: 

      This article presents Morphonet 2.0, a software designed to visualise and curate segmentations of 3D and 3D+t data. The authors demonstrate their capabilities on five published datasets, showcasing how even small segmentation errors can be automatically detected, easily assessed, and corrected by the user. This allows for more reliable ground truths, which will in turn be very much valuable for analysis and training deep learning models. Morphonet 2.0 offers intuitive 3D inspection and functionalities accessible to a non-coding audience, thereby broadening its impact. 

      Strengths: 

      The work proposed in this article is expected to be of great interest to the community by enabling easy visualisation and correction of complex 3D(+t) datasets. Moreover, the article is clear and well written, making MorphoNet more likely to be used. The goals are clearly defined, addressing an undeniable need in the bioimage analysis community. The authors use a diverse range of datasets, successfully demonstrating the versatility of the software. 

      We would also like to highlight the great effort that was made to clearly explain which type of computer configurations are necessary to run the different datasets and how to find the appropriate documentation according to your needs. The authors clearly carefully thought about these two important problems and came up with very satisfactory solutions. 

      We would like to sincerely thank the reviewer for their positive and thoughtful feedback. We are especially grateful that they acknowledged the clarity of the manuscript and the potential value of MorphoNet 2.0 for the community, particularly in facilitating the visualization and correction of complex 3D(+t) datasets. We also appreciate the reviewer’s recognition of our efforts to provide detailed guidance on hardware requirements and access to documentation—two aspects we consider crucial to ensuring the tool is both usable and widely adopted. Their comments are very encouraging and reinforce our commitment to making MorphoNet 2.0 as accessible and practical as possible for a broad range of users in the bioimage analysis community.

      Weaknesses: 

      There is still one concern: the quantification of the improvement of the segmentations in the use cases and, therefore, the quantification of the potential impact of the software. While it appears hard to quantify the quality of the correction, the proposed work would be significantly improved if such metrics could be provided. 

      The authors show some distributions of metrics before and after segmentations to highlight the changes. This is a great start, but there seem to be two shortcomings: first, the comparison and interpretation of the different distributions does not appear to be trivial. It is therefore difficult to judge the quality of the improvement from these. Maybe an explanation in the text of how to interpret the differences between the distributions could help. A second shortcoming is that the before/after metrics displayed are the metrics used to guide the correction, so, by design, the scores will improve, but does that accurately represent the improvement of the segmentation? It seems to be the case, but it would be nice to maybe have a better assessment of the improvement of the quality. 

      We thank the reviewer for this constructive and important comment. We fully agreed that assessing the true quality improvement of segmentation after correction is a central and challenging issue. While we initially focused on changes in the unsupervised quality metrics to illustrate the effect of the correction, we acknowledged that interpreting these distributions was not always straightforward, and that relying solely on the metrics used to guide the correction introduced an inherent bias in the evaluation.

      To address the first point, we revised the manuscript to provide clearer guidance on how to interpret the changes in metric distributions before and after correction, with additional examples to make this interpretation more intuitive.

      Regarding the second point, we agreed that using independent, external validation was necessary to confirm that the segmentation had genuinely improved. To this end, we included additional assessments using complementary evaluation strategies on selected datasets where ground truth was accessible, to compare pre- and post-correction segmentations with an independent reference. These results reinforced the idea that the corrections guided by unsupervised metrics generally led to more accurate segmentations, but we also emphasized their limitations and the need for biological validation in real-world cases.

      Reviewer #3 (Public review): 

      Summary: 

      A very thorough technical report of a new standalone, open-source software for microscopy image processing and analysis (MorphoNet 2.0), with a particular emphasis on automated segmentation and its curation to obtain accurate results even with very complex 3D stacks, including timelapse experiments. 

      Strengths: 

      The authors did a good job of explaining the advantages of MorphoNet 2.0, as compared to its previous web-based version and to other software with similar capabilities. What I particularly found more useful to actually envisage these claimed advantages is the five examples used to illustrate the power of the software (based on a combination of Python scripting and the 3D game engine Unity). These examples, from published research, are very varied in both types of information and image quality, and all have their complexities, making them inherently difficult to segment. I strongly recommend the readers to carefully watch the accompanying videos, which show (although not thoroughly) how the software is actually used in these examples.

      We sincerely thank the reviewer for their thoughtful and encouraging feedback. We are particularly pleased that the reviewer appreciated the comparative analysis of MorphoNet 2.0 with both its earlier version and existing tools, as well as the relevance of the five diverse and complex use cases we selected. Demonstrating the software's versatility and robustness across a variety of challenging datasets was a key goal of this work, and we are glad that this aspect came through clearly. We also appreciate the reviewer's recommendation to watch the accompanying videos, which we designed to provide a practical sense of how the tool is used in real-world scenarios. Their positive assessment is highly motivating and reinforces the value of combining scripting flexibility with an interactive 3D interface.

      Weaknesses: 

      Being a technical article, the only possible comments are on how methods are presented, which is generally adequate, as mentioned above. In this regard, and in spite of the presented examples (chosen by the authors, who clearly gave them a deep thought before showing them), the only way in which the presented software will prove valuable is through its use by as many researchers as possible. This is not a weakness per se, of course, but just what is usual in this sort of report. Hence, I encourage readers to download the software and give it time to test it on their own data (which I will also do myself).   

      We fully agree that the true value of MorphoNet 2.0 will be demonstrated through its practical use by a wide range of researchers working with complex 3D and 3D+t datasets. In this regard, we improved the user documentation and provided a set of example datasets to help new users quickly familiarize themselves with the platform. We are also committed to maintaining and updating MorphoNet 2.0 based on user feedback to further support its usability and impact.

      In conclusion, I believe that this report is fundamental because it will be the major way of initially promoting the use of MorphoNet 2.0 by the objective public. The software itself holds the promise of being very impactful for the microscopists' community. 

      Reviewer #1 (Recommendations for the authors): 

      (1) In Use Case 1, when referring to Figure 3a, they describe features of 3b? 

      We corrected the mismatch between Figure 3a and 3b descriptions.

      (2) In Figure 3g-I, columns for Curated Nuclei and All Nuclei appear to be incorrectly labelled, and should be the other way around. 

      We corrected the swapped labels between "Curated Nuclei" and "All Nuclei".

      (3) Some mention of how this will be supported in the future would be of interest. 

      We added a note on long-term support plans.

      (4) Could Morphonet be rolled into something like napari and integrated into its environment with access to its plugins and tools? 

      We thank the reviewer for this pertinent suggestion. We fully recognize the growing importance of interoperability within the bioimage analysis community, and we have been working on establishing a bridge between MorphoNet and napari to enable data exchange and complementary use of the two tools. As a platform, all new developments are first evaluated by our beta testers before being officially released to the user community and subsequently documented. The interoperability component is still under active development and will be announced shortly in a beta-testing phase. For this reason, we were not able to include it in the present manuscript, but we plan to document it in a future release.

      (5) Can meshes be extracted/saved in another format? 

      We agreed that the ability to extract and save meshes in standard formats was highly useful for interoperability with other tools. We implemented this feature in the new version of MorphoNet, allowing users to export meshes in commonly used formats such as OBJ or STL.

      Reviewer #2 (Recommendations for the authors): 

      As a comment, since the authors mentioned the recent progress in 3D segmentation of various biological components, including organelles, it could be interesting to have examples of Morphonet applied to investigate subcellular structures. These present different challenges in visualization and quantification due to their smaller scale.

      We thank the reviewer for this insightful suggestion. We fully agree that applying MorphoNet 2.0 to the analysis of sub-cellular structures is a promising direction, particularly given the specific challenges these datasets present in terms of resolution, visualization, and quantification. While our current use cases focus on cellular and tissue-level segmentation, we are actively interested in extending the applicability of the tool to finer scales. We are currently exploring plugins for spot detection and curation in single-molecule FISH data. However, this requires more time to properly validate relevant use cases, and we plan to include this functionality in the next release.

      Another comment is that the authors briefly mention two other state-of-the-art software tools (namely FIJI and napari) but do not really position MorphoNet against them. The text would likely benefit from such a comparison so the users can better decide which one to use or not. 

      We agreed that providing a clearer comparison between MorphoNet 2.0 and other widely used tools such as FIJI and Napari would greatly benefit readers and potential users. In response, we included a new paragraph in the supplementary materials of the revised manuscript, highlighting the main features, strengths, and limitations of each tool in the context of 3D+t segmentation, visualization, and correction workflows. This addition helped users better understand the positioning of MorphoNet 2.0 and make informed choices based on their specific needs.

      Minor comments: 

      L 439: The Deli plugin is mentioned but not introduced in the main text; it could be helpful to have an idea of what it is without having to dive into the supplementary material. 

      We included a brief description in the main text and thoroughly revised the help pages to improve clarity.

      Figure 4: It is not clear how the potential holes created by the removal of objects are handled. Are the empty areas filled by neighboring cells, for example, are they left empty? 

      We clarified this point in the legend of Figure 4.

      Please remove from the supplementary the use cases that are already in the main text. 

      We cleaned up redundant use case descriptions.

      Typos: 

      L 22: the end of the sentence is missing. 

      L 51: There are two "."   

      L 370: replace 'et' with 'and'.   

      L 407-408, Figure 3: panels g-i, the columns 'curated nuclei' and 'all nuclei' seem to be inverted. 

      L 549: "four 4". 

      Reviewer #3 (Recommendations for the authors): 

      Dear Authors, what follows are "minor comments" (the only sort of comment I have for this nice report): 

      Minor issues: 

      (1) Not being a user of MorphoNet, I found that reading the manuscript was a bit hard due to the several names of plugins or tools that are mentioned, many times without a clear explanation of what they do. One way of improving this could be to add a table, a sort of glossary, with those names, a brief explanation of what they are, and a link to their "help" page on the web. 

      We understood that the manuscript might be difficult to follow for readers unfamiliar with MorphoNet, especially due to the numerous plugin and tool names referenced. To address this, we carried out a complete overhaul of the help pages to make them clearer, more structured, and easier to navigate.

      (2) Figure 4d, orthogonal view: It is claimed that this segmentation is correct according to the original intensity image, but it is not clear why some cells in the border actually appear a lot bigger than other cells in the embryo. It does look like an incomplete segmentation due to the poor image quality at the border. Whether this is the case or if the authors consider the contrary, it should be somehow explained/discussed in the figure legend or the main text. 

      We revised the figure legend and main text to acknowledge the challenge of segmenting peripheral regions with low signal-to-noise ratios and discussed how this affects segmentation.

      Small writing issues I could spot:   

      Line 247: there is a double point after "Sup. Mat..". 

      Line 329: probably a diagrammation error of the pdf I use to review, there is a loose sentence apparently related to a figure: "Vegetal view ofwith smoothness". 

      Line 393 (and many other places): avoid using numbers when it is not a parameter you are talking about, and the number is smaller than 10. In this case, it should be: "The five steps...". 

      Line 459: Is "opposite" referring to "Vegetal", like in g? In addition, it starts with lowercase. 

      Lines 540-541: Check if redaction is correct in "...projected the values onto the meshed dual of the object..." (it sounds obscure to me). 

      Lines 548-549: Same thing for "...included two groups of four 4 nuclei and one group of 3 fused nuclei.". 

      Line 637: Should it be "Same view as b"? 

      Line 646: "The property highlights..."? 

      Line 651: In the text, I have seen a "propagation plugin" named as "Prope", "Propa", and now "Propi". Are they all different? Is it a mistake? Please, see my first "Minor issue", which might help readers navigate through this sort of confusing nomenclature. 

      Line 702: I personally find the use of the term "eco-system" inappropriate in this context. We scientists know what an ecosystem is, and the fact that it has now become a fashionable word for politicians does not make it correct in any context. 

      We thank the reviewer for their careful reading of the manuscript and for pointing out these writing and typographic issues. We corrected all the mentioned points in the revised version, including punctuation, sentence clarity, consistent naming of tools (e.g., the propagation plugin), and appropriate use of terms such as “ecosystem.” We also appreciated the suggestion to avoid numerals for numbers under ten when not referring to parameters, and we ensured consistency throughout the text. These corrections improved the clarity and readability of the manuscript, and we were grateful for the reviewer’s attention to detail.

    1. rely on model assumptions for validity

      I do not agree. In a design-based setting, the working model does not need to be correct. In a model-based setting the situation is obviously different. I remember (from way back when I was teaching from it) that there is mention of some kind of bias in Cochran's book regarding the regression estimator, but generally I think we nowadays consider model-assisted estimation as design-unbiased, at least approximately design-unbiased. I need to check my Yellow book, but in any case I think it is not correct to state that the validity of regression estimation depends on the validity of the model.

    2. SYS can produce biased estimates.

      Göran Ståhl once gave me a lesson about this. I know I wrote this sentence in my 2006 chapter, but Göran later proved me wrong. It is not biased, it just has increased variance. If you go through all the possible starting points and calculate all the possible results, the mean of them is exactly the population value, so the estimator is exactly unbiased.
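      This claim can be checked numerically. A minimal sketch with made-up plot values (any population works, as long as the population size is divisible by the interval so all samples have equal size), enumerating every possible systematic sample:

```python
import numpy as np

# Hypothetical population of 12 plot values (any numbers work).
population = np.array([3, 7, 1, 9, 4, 6, 2, 8, 5, 10, 0, 11], dtype=float)
k = 4  # sampling interval: one systematic sample takes every k-th unit

# Enumerate the k possible systematic samples, one per random start.
sample_means = [population[start::k].mean() for start in range(k)]

# Each start has probability 1/k, so the expectation of the estimator
# is the average of the k sample means -- exactly the population mean.
expected_value = float(np.mean(sample_means))
print(expected_value, population.mean())  # both 5.5: design-unbiased
```

Individual systematic samples can land far from the truth (that is the variance part), but averaged over all random starts the estimator hits the population mean exactly.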

    3. If you really must have a specific sample size, then the best approach is to specify a denser grid than needed and randomly or systematically thin points until the target sample size is reached (see, e.g., K. Iles (2003)).

      I do not know if it is of any importance here, but it could also be a pseudo-systematic sample: make a grid of the desired size and draw a simple random sample of one unit from each cell. That would also be more like a real random sample.
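      The pseudo-systematic idea can be sketched like this (1-D for brevity, a 2-D grid works the same way cell by cell; all names and numbers are illustrative):

```python
import random

random.seed(1)  # reproducible illustration

# Hypothetical population of N = 20 units split into 5 equal grid cells.
N, n_cells = 20, 5
cell_size = N // n_cells
units = list(range(N))

# Pseudo-systematic sample: a simple random draw of one unit per cell.
# Unlike a strict grid, positions vary independently between cells,
# so the sample behaves more like a real random sample.
sample = [
    random.choice(units[c * cell_size:(c + 1) * cell_size])
    for c in range(n_cells)
]
```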

    1. The film is divided into reels. The reels are usually equal in length, on an average from 900 to 1,200 feet long. The combination of the reels forms the picture. The usual length of a picture should not be more than from 6,500 to 7,500 feet. This length, as yet, involves no unnecessary exhaustion of the spectator. The film is usually divided into from six to eight reels. It should be noted here, as a practical hint, that the average length of a piece (remember the editing of scenes) is from 6 to 10 feet, and consequently from 100 to 150 pieces go to a reel. By orientating himself on these figures, the scenarist can visualise how much material can be fitted into the scenario.

      Astonishing, in a way, to see guidance so concretely dependent on the particular technology, given the abstract nature of most of this. Of course he really means duration, time, but it's like saying "a film should be 65 to 75 gigabytes, so as not to exhaust the viewer".

    1. is positioned using a random mechanism

      But the systematic sample is still really one big cluster of plots, because when you select one plot, you select all of them. I noticed that you explained this later on, but I would prefer mentioning the problem also here. To avoid misunderstandings.

    1. Open

      Open - Source - Construct - Sauce

      leading to emergent Open Standards =

      not a specification that can be implemented across platforms

      but treating the Browser as a Universal Platform

      allowing "implementations" across a whole variety

      of non-functional requirements

      constellations designed to expand/scale according to the number of

      • participants
      • synchronization
        • real time
        • asynchronous
      • consistency
        • eventual
        • instantaneous
      • reach
      • discoverability
      • privacy
      • security
      • availability

      make assumed impossibilities inevitable

      Zooko's triangle

    1. While user studies can tell you a lot about the usability problems in your interface and help you identify incremental improvements to your design, they can’t identify fundamental flaws and they can’t tell you whether your design is useful. This is because you define the tasks. If no one wants to complete those tasks in real life, or there are conditions that change the nature of those tasks in real life, your user study results will not reveal those things. The only way to find out if something would actually be used is to implement your design and give it to people to see if it offers real value

      I think usability tests are very useful for identifying interface breakdowns. However, they can’t show whether a design truly provides value in real life. Designers often focus on whether users can complete tasks in a controlled setting. They assume that success there means success in the real world. As Ko points out, task completion alone doesn’t measure whether someone would actually want to use the product or integrate it into their routine. I’ve seen prototypes work perfectly in tests but fail when deployed because the tasks felt artificial or didn’t meet real user needs. That’s why real-world testing is so important. It’s absolutely harder and takes a lot more time, but it truly shows how people actually use a design in their daily lives.

    2. While user studies can tell you a lot about the usability problems in your interface and help you identify incremental improvements to your design, they can’t identify fundamental flaws and they can’t tell you whether your design is useful. This is because you define the tasks. If no one wants to complete those tasks in real life, or there are conditions that change the nature of those tasks in real life, your user study results will not reveal those things. The only way to find out if something would actually be used is to implement your design and give it to people to see if it offers real value (you’d know, because they wouldn’t want you to take it away).

      When thinking about designers' perspectives and roles, this does not surprise me; however, from the user's perspective it does, as I always assumed user studies were the ultimate way to test a design. Considering what the author stated about them (at times) missing fundamental flaws, that assumption changes. It's interesting how the author says that, because designers define the tasks, the results can't show whether people would actually want to do those tasks in real life. It makes me realize how important it is to test a design's real-world value, not just its usability, by seeing if people would actually miss it if it were taken away.

    3. We’re here to test this system, not you, so anything that goes wrong is our fault, not yours. I’m going to give you some tasks to perform. I won’t be able to answer your questions during the test, because the goal is to see where people have difficulty, so we can make it easier. Do you have any questions before we begin?

      I liked this part because I think a lot of people, including me, tend to feel judged when we try something new in front of others. This framing pushes the responsibility back onto us as designers, which I think is fairer and also encourages better feedback.

      It also reminded me how easy it is to assume users are doing things wrong, when in reality, if they are confused, it is the design and interface that failed. This is also what I talked about in my essay for informatics.

    4. but they generally can’t help you learn about whether the design achieves its larger goals (whether it’s useful, valuable, meaningful, etc.). This is because a usability test doesn’t occur in the context of someone’s actual life, where those larger goals are relevant.

      When testing a design, I think it's natural to want to test all of the capabilities and limitations in one go. So, having this framework when approaching user tests is helpful because it prevents both the observer and user from becoming overwhelmed with all the goals the testing wants to achieve. Additionally, it'll help me be realistic when I conduct user testing, as I won't be able to get all the results I'm looking for in one session; rather, results would be collected over time across other sessions.

    5. Usability tests can help you learn about lower level problems in a user interface (layout, labeling, flow, etc.), but they generally can’t help you learn about whether the design achieves its larger goals (whether it’s useful, valuable, meaningful, etc.). This is because a usability test doesn’t occur in the context of someone’s actual life, where those larger goals are relevant.

      Yeah, I totally agree with this quote as usability tests are awesome for catching stuff like confusing buttons, weird layouts, or a clunky flow. But they don’t really show if the design actually fits into someone’s real life or if it’s genuinely useful. It made me realize that even if something tests well in a lab, it might still fail to be meaningful in the real world. I think it’s a good reminder that good design is about more than just making things easy to use and it’s about making them worth using.

    6. The goal of most usability tests is to discover aspects of a design that cause someone to fail at some task. We call these failures breakdowns, the idea being that someone can be following the correct sequence of steps to complete a task, but then fail to get past a crucial step. Once you’ve found the breakdowns that occur in your design, you can go back and redesign your interface to prevent breakdowns, running more usability tests after redesign to see if those breakdowns still occur. Usability tests allow the designer to observe these breakdowns in person, helping them to make highly informed interpretations of what caused them, informing redesign.

      I like this section because it highlights how valuable failure can be in the process of designing something. I agree that usability testing is less about proving a design and more about finding out where it breaks down. It reminded me that good design comes from observing and understanding those breakdowns in order to fix them, until the interface works for people.

    7. A major limitation of A/B tests is that because it’s difficult to come up with holistic measures of success, the results tend to be pretty narrow. Perhaps that’s okay if your definition of success is increased profit. Making more money is easy to measure. But if your definition of success is harder to measure (e.g., there’s less hate speech on your platform), A/B tests might be much harder to conduct. The ease with which A/B tests can run, and the difficulty of measuring meaningful things, can lead designers to overlook the importance of meaningful things. A good designer will resist this path of least resistance, focusing on the outcomes that matter to a design, independent of what tools make easy.

      I like how the content from this chapter relates to the content from INFO 300 and all the ideas from randomized tests. I am also taking INFO 370, which approaches many concepts in parallel, such as how different design choices for participants have different pros and cons depending on which validities the study/research is aiming for.

    8. For example, if you are designing a course planner for students, you would want to recruit students (but what kind of students)?If your representative users are challenging to recruit, you might have to get creative. I’ve often had to walk into coffee shops and ask random strangers, or spam mailing lists to ask people to participate. You have to be a bit bold and courageous to find participants, and find ways of compensating them for their time and attention. If you’re working for a company that invests in a whole team to find people to participate in user studies, you might be able to delegate this recruiting work to them.

      I like this point about how recruiting participants often requires creativity and courage, like approaching strangers or using mailing lists. It shows that good research isn’t just about having a solid plan; it’s also about being proactive and resourceful. I agree that finding representative users is one of the hardest parts of user research, since not everyone will fit the target audience or be easy to reach. This made me appreciate how much behind-the-scenes effort goes into designing a good study and how researchers often have to step outside their comfort zones to get meaningful results.

    1. Consistency and standards is the idea that designs should minimize how many new concepts users have to learn to successfully use the interface. A good example of this is Apple’s Mac OS operating system, which almost mandates that every application support a small set of universal keyboard shortcuts, including for closing a window, closing an application, saving, printing, copying, pasting, undoing, etc. Other operating systems often leave these keyboard shortcut mappings to individual application designers, leaving users to have to relearn a new shortcut for every application.

      When thinking about this specific section within the text, this paragraph about consistency and standards stood out to me because it shows how something as simple as consistency can completely shape the user experience. The example makes the point really clear: when shortcuts and commands stay the same across applications, it saves users from constantly having to relearn basic actions. It also shows how thoughtful design is not always about adding new features, but about creating familiarity and predictability. That kind of consistency builds trust. When a user is familiar with something, the expectation of knowing where they are and being able to direct where they want to go is super important for their comfort.

    2. Walkthroughs77 Polson, P. G., Lewis, C., Rieman, J., & Wharton, C. (1992). Cognitive walkthroughs: a method for theory-based evaluation of user interfaces. International Journal of Man-Machine Studies.  are methods where an expert (that would be you, novice designer), defines tasks, but rather than testing those tasks with real people, you walk through each step of the task and verify that a user would know to do the step, know how to do the step, would successfully do the step, and would understand the feedback the design provided. If you go through every step and check these four things, you’ll find all kinds of problems with a design.

      This step is key to designing a good workflow. In my prior work as a product designer, it helped me realize one of the biggest flaws in my design process, which is that I jump into prototyping or working on visual elements before I thoroughly think through the workflow itself. This often results in big gaps in the workflow, because the design is based on my own experience and understanding.

      For example, one of the workflows I designed was a feature to help users with cognitive disabilities identify a position of interest (e.g. cashier), but one of the input sources I used to determine the result was industry of interest, which doesn't make much sense now that I look back, since you can't expect most job seekers with cognitive disabilities to know whether they would like to work in retail or hospitality, etc.

    3. If you ignore variation along these five dimensions, your design will only work for some people. By using multiple personas, and testing a task against each, you can ensure that your design is more inclusive. In fact, the authors behind GenderMag have deployed it into many software companies, finding that teams always find inclusiveness issues22 Burnett, M.M., Peters, A., Hill, C., and Elarief, N. (2016). Finding gender-inclusiveness software issues with GenderMag: A field investigation. ACM SIGCHI Conference on Human Factors in Computing (CHI). .

      I think this is a great point to consider. In the previous video, the record button had an icon and it would be interpretable by tech savvy people, but maybe not young children or older people who are not as familiar with newer technology. They could've done a walkthrough where they test the design against multiple personas, ensuring that their design is inclusive. Of course, there is the fact that designers want their product to work for their target audience. However, it is important to make designs more inclusive for every user base.

    4. Some researchers have addressed these flaws in persona choice by contributing more theoretically-informed persona. For example, GenderMag is similar to the cognitive walkthrough like the one above, but with four customizable persona that cover a broad spectrum of facets of software use11 Burnett, M., Stumpf, S., Macbeth, J., Makri, S., Beckwith, L., Kwan, I., Peters, A., Jernigan, W. (2016). GenderMag: A method for evaluating software's gender inclusiveness. Interacting with Computers.: A user's motivations for using the software. A user's information processing style (top-down, which is more comprehensive before acting, and bottom-up, which is more selective). A user's computer self-efficacy (their belief that they can succeed at computer tasks). A user's stance toward risk-taking in software use. A user's strategy for learning new technology.

      I found this section interesting because it shows how persona design can move beyond surface traits and actually reflect the way people think and behave. I agree that this approach makes evaluations more inclusive by considering these specific traits. It made me realize that realistic personas aren't just creative writing but they're grounded in real psychology and can reveal deeper issues pertaining to usability.

    5. Some researchers have addressed these flaws in persona choice by contributing more theoretically-informed persona. For example, GenderMag is similar to the cognitive walkthrough like the one above, but with four customizable persona that cover a broad spectrum of facets of software use

      I find this section on GenderMag and theoretically informed personas really fascinating. I agree that traditional personas often oversimplify users, and I like how GenderMag introduces specific cognitive and motivational dimensions to make design more inclusive. The five factors, such as information processing style and risk-taking, reveal that people approach technology in very different ways, and designers must account for this. This made me realize that inclusivity isn’t just about who uses the product, but also how they think, learn, and make decisions when using it.

    1. But as the number of features increases, combining and comparing features becomes less intuitive and more complex.

      more features = more complex to compare similarity
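      One way to see this: with many features, "similar" collapses into a single score whose meaning depends on the metric and scaling you choose. A minimal sketch using cosine similarity over made-up feature vectors:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Two features are easy to eyeball; with five (or five hundred) the
# comparison collapses into one number whose meaning depends on how
# each feature was scaled and weighted.
item_a = [0.9, 0.1, 0.5, 0.3, 0.7]
item_b = [0.8, 0.2, 0.4, 0.6, 0.5]
score = cosine_similarity(item_a, item_b)
```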

    1. The equator cuts the continent in half, and the rainforest is along the equator. Tropical climates occur around the equator because this area receives direct or relatively direct insolation (incoming solar radiation) for the entire year (Figure 4.4b&4.4c).

      Geography is way more complicated than I thought! Africa has such a variety of climates—not just deserts but rainforests, savannas, and even snow-capped mountains like Kilimanjaro. Climate, wind patterns, and ocean currents all affect how people live and farm. It’s really cool how people adapt to these conditions with so much traditional knowledge.

    2. San had more time for leisure, slept more, ate a more balanced diet, and worked less than their “more developed” farming neighbors. The San and other hunter-gatherers around the world know where they can find different resources, including food, shelter, and water, during the course of the year, and they migrate seasonally and purposely to find resources necessary for survival

      It’s honestly so eye-opening to see that the San work way fewer hours but still have everything they need, while people in “developed” countries are constantly stressed and overworked. It really makes me think that modern life isn’t necessarily better — it’s just different.

    3. The San people of the Kalahari Desert in southern Africa are one remaining such group. What thoughts come to mind when you see a picture of hunter-gatherers? Most Westerners see such groups as primitive, backward, or underdeveloped. We may think of hunter-gatherers as “less developed” than city dwellers in New York or London. Whether we are conscious of it or not, we likely place people on a continuum of development, a scale typically linked to indicators of material well-being. What criteria do we use to measure development in our mind, and why do we use these criteria? Development implies progress, but progress in what? Does development mean amassing wealth? Does development mean access to clean water and a steady food supply? Can people be poor and developed at the same time? While we may perceive hunter-gatherers as primitive or underdeveloped, hunter-gatherers necessarily worse off than we are? Studies suggest that one group of San spent 12 to 19 hours per week working to obtain food as compared to the 40-some-hour workweek of most people in the so-called developed world.

      The discussion about the San people really challenges the idea of them being “underdeveloped.” Honestly, I feel like they actually use their time really well and live in a sustainable way that works for them. In some ways, that makes them more economically balanced than people might assume when they call them “underdeveloped.”

    1. For some people, it is not difficult to use Standard English, because it happens to be their local dialect. But for others in different parts of the country, they may have to remind themselves to follow the rules, including the sentence order and grammar of Standard English, when they are speaking or writing in a formal context. However, Standard English can be spoken in any accent, and must not be confused with talking ‘posh’.

      When people speak Standard English, they need to use the rules, like grammar and sentence order, because it is used in formal contexts.

    2. Standard English The history of English is quite a story in itself, with dramatic changes and great variety. Up to about 450, British (Celtic) tribes spoke languages related to modern Welsh, Scots Gaelic, and Irish (Erse). However, the years between 450 and 1066 brought about great change. The Angles, Saxons, and Jutes invaded from North Germany in around 450, and settled on the eastern side of what is now called England. Their language, Anglo-Saxon, spread across to the west of England and developed into what we now call Old English. Many of the words we use today still relate back to Old English – but this was soon to change too. Other invasions, this time in the form of the Vikings from Scandinavia, influenced the language with new words from the Viking's language Old Norse that entered Old English between 800 and 900.

      A history of how English changed over time through successive invasions and new languages.

    3. Do you speak more than one language? Perhaps you are taught French or German at school, or English is your second language, and you speak a different tongue at home. But have you ever thought that you also speak different forms of language? For example, you probably speak to your friends in a way that you would never speak to, say, an interviewer in an interview. Hopefully, you would write differently in an exam than you would in a text message or e-mail! When we communicate with different people and in different situations, we naturally follow different sets of rules and patterns, often without having to think about the switches and transitions we are making. The most used form of English is Standard English.

      The different manners and forms of English that you speak in public versus at home.

    1. This would have confused me if I were in the situation, because the sentence C said doesn't make any sense to me. But the teacher doesn't bat an eye; they just repeat what C said to M, because it isn't necessary for the reasoning to align with the teacher's logic, nor does the teacher need to know the whole contextual backstory. If the teacher did, that could put them in a situation in which they have to ask themselves what THEY think the right solution is, and that's not the idea at all!!! Only M needs to understand C.

    2. ooooooo so the teacher is just fully adhering to what the child says first. if K says "he can write his letters" then that's that... UNLESS R can express a take of their own. The children are fully in the arena here, and using words to communicate is not only encouraged by the teacher explicitly, but implicitly in the very environment that they have created.

      difficulty speaking is helped, of course. R seems quieter than K by a long shot, but the teacher is mediating the whole time, and that means assisting R in his communication.

    3. so many things here

      • presents the alternate option of talking to K
      • expresses that the "talk" option comes with help and support from you, but "fight" option does not
      • positions you, the teacher, as an ally to the child rather than an enemy (i won't let anyone hit you either)
      • in response to the potential question "why talk instead of push?" the answer is, "because one comes with explicit support and protection while the other one does not"

    Annotators

    1. Such cognitive artefacts may operate in different ways and using different functions such that they complement human cognition – in effect they extend what the human mind can do, rather than replicate it.

      Scripts can quickly surface patterns, in my case "sleep-vision + wine/ointment," but historical judgement still decides whether a line really describes a medicinal step or just a metaphor.

    1. Similarly, the pages of the annual Computer Applications in Archaeology conference proceedings are filled with accounts of applications and case studies of their use, but examples of failure are rare, not least because the incentives for authors and publishers to report successes are naturally greater.

      Publishing only wins hides where tagging or counts break. This quote signals to me that I also have to show cases that failed in order to build trust in the conclusions I make. I will have to add a tiny "failed case" box/paragraph for, e.g., inscriptions where incubation is present but no drug is.

    1. As survey instrumentation becomes digital and increasingly automated, so the level of human engagement changes: the cognitive load is transferred to the digital device while the survey strategy and (for now) the physical assembly and setup of the instrumentation remains on the human side of the relationship.

      Point-and-click spreadsheets can feel effortless, but they sometimes hide certain steps. This means I need to keep the dataset for my project relatively short and keep readable notes, so that it's easy to follow how the linkages work between incubation, substances, and cures at the Sanctuary of Asklepius at Epidaurus.

    1. The Igbo are neighbors of the highly politically centralized Yoruba, but their political system is much dif-ferent. Instead of centralized kingdoms headed by powerful “kings” and their advisers, the Igbo had no centralized system of governance. Rather they lived in politically autonomous villages. That is, each village was politically separate and was politically not directly connected to neighbor-ing villages. Within the villages, there was not a system of hereditary chiefs. Village decisions were made by a headman and a council of elders that selected the headman. The absence of a centralized system of govern-ment did not mean that there were no systems or institutions of governance among the Igb

      I find it interesting that this society was able to thrive despite not having a centralized government, since I've been raised with such strong encouragement of the federal government.

    1. Many experts now include knowledge as a fifth factor, acknowledging its key role in business success.

      This section demonstrates how much the world has changed, which is why I find it so fascinating. Success used to mostly depend on material possessions like land or machinery, but these days it's more about people's knowledge, ideas, and technological prowess.

    1. We can involve students in the process of curating content for courses, either by offering them limited choices between different texts or by offering them solid time to curate a future unit more or less on their own (or in a group) as a research project

      Absolutely true: students learn by doing, so as they are involved in creating course content, they develop course development skills, which will enable them to develop a course from scratch in the future.

      I believe most teachers are skeptical about letting undergraduate students be involved in course content creation or modification, but tend to let graduate students undertake these assignments.

    1. Digital archaeology should exist to assist us in the performance of archaeology as a whole. It should not be a secret knowledge, nor a distinct school of thought, but rather simply seen as archaeology done well, using all of the tools available to and in better recovering, understanding and presenting the past.

      This expresses the idea that digital tools should act as extensions of thought rather than replacements. It connects to my final project because GIS and digital mapping help interpret archaeological data more effectively without detaching from human analysis.

    1. carty.github.io/FRIplaybook/composite.html scoreItems(keys = SH_T2_keys, items = combined) Call: scoreItems(keys = SH_T2_keys, items = combined) (Unstandardized) Alpha: SLEEPHYGIENE_T2 alpha 0.7 Standard errors of unstandardized Alpha: SLEEPHYGIENE_T2

      Is this section the same composite scoring, or something else? Add an explanation or maybe a subheading for clarity. The alpha and standard error values make me think this is some type of modeling, but the source labels it as more composite scoring.
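      For what it's worth, the alpha that `scoreItems` reports is Cronbach's alpha for the composite, a reliability coefficient, not a model fit statistic. A minimal sketch of the computation with made-up item responses (the playbook itself uses the R `psych` package; this only illustrates the formula):

```python
import statistics

# Hypothetical item responses: rows are respondents, columns are items.
items = [
    [3, 4, 3, 5],
    [2, 2, 3, 3],
    [4, 5, 4, 5],
    [1, 2, 2, 2],
    [3, 3, 4, 4],
]

k = len(items[0])  # number of items in the composite
item_vars = [statistics.variance(col) for col in zip(*items)]
total_var = statistics.variance([sum(row) for row in items])

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
```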

    1. ation but must adialects, b

      A call to action. This entire essay is proactive in stance, serving to not only educate readers, but inspire them to find a way to incite change. Later on, this call to action is supported by a list of reading material that readers can utilize to educate themselves further on the subject matter.

    1. But we no longer live in an age of information scarcity. The lecture is a solution to a problem we no longer have. The challenge for colleges and universities in the twenty-first century is to deliver artisanal-quality learning at an industrial scale. For decades, this has been an impossible dream. Until now.

      Logic: the lecture format of learning was in place to "allow one expert to broadcast information", which can now happen in a multitude of ways (cue the "flipped classroom"), and so it should be supplanted with at-scale personalized learning. This is the best way to scale the Socratic ideal, short of direct expert-to-learner instruction.

    1. hese digital transformations are not only diversifying the sources and types ofnews available but are also prompting a re-evaluation of journalistic norms and practices.

      This does raise the question, though, of whether serious news media are going along too much with the attention economy that tech giants impose on us (citizens/consumers).

  5. social-media-ethics-automation.github.io
    1. ersion ID: 1

      This Wikipedia entry describes the evolutionary path of cetaceans, starting with ancestors whose fossils were found on land, then moving to shallow waters, and finally becoming entirely marine. Modern cetaceans fall into two main groups, Mysticeti and Odontoceti. Their evolution involves not only genetic and skeletal development but also cultural behaviors, such as the use of different tools for foraging. Environmental factors also influenced the divergence of cetaceans. One detail that stood out is the radiation events; the article describes three large-scale radiations, the last occurring 12-2 million years ago.

    2. Rick Paulas. What It Feels Like to Go Viral. Pacific Standard, June 2017. URL: https://psmag.com/economics/going-viral-is-like-doing-cartwheels-on-the-water-spout-of-a-giant-whale (visited on 2023-12-08).

      This article by Rick Paulas provides a visceral, first-hand feel for what it means to go “viral” in social media contexts—describing it as something like “doing cartwheels on the water-spout of a giant whale.” That kind of metaphor really brings home how thrilling yet unstable virality is: fun, exhilarating, but also out of control and potentially dangerous.

    3. Meme. December 2023. Page Version ID: 1187840093. URL: https://en.wikipedia.org/w/index.php?title=Meme&oldid=1187840093#Etymology (visited on 2023-12-08).

      By reading the Wikipedia article I learned why memes are called memes, and understanding a viral word through a biological explanation is very interesting. Before reading the article, I thought memes were just pictures we share online, but after learning the theory of memes, I realized that these online pictures are also a form of evolution.

    4. Bean Dad. January 2021. URL: https://knowyourmeme.com/memes/events/bean-dad (visited on 2023-12-08).

      "Bean Dad" is about a person named John Roderick, who tweeted a story of his 9-year-old daughter asking him to open a can of beans; he thought she should figure out how to open it herself. His daughter ended up spending hours just to open the can. After he posted the tweet, people started accusing him, saying it was abusive to treat his daughter like that. After this drama, people discovered more about John Roderick's past controversies. Shortly after, people turned this into a meme, editing the story to fit different scenes.

    5. Rick Paulas. What It Feels Like to Go Viral. Pacific Standard, June 2017. URL: https://psmag.com/economics/going-viral-is-like-doing-cartwheels-on-the-water-spout-of-a-giant-whale (visited on 2023-12-08).

      This article really resonates with people. The author describes "going viral" as "doing a cartwheel in the blowhole of a giant whale" - it sounds funny, but just imagine how out of control it is. You post something on your social media, and suddenly it becomes the topic of discussion all over the world overnight. The pressure and absurdity hit you all at once. It reminds us that although going viral on the internet seems glamorous, it often comes with huge psychological burdens, exposure of privacy, and completely unpredictable consequences. In other words, "the light of going viral" has a lot of "burning heat" in it.

    6. Rowland Manthorpe. It's the attention economy, stupid: why Trump represents the future whether we like it or not. Wired UK, 2016. URL: https://www.wired.co.uk/article/us-president-donald-trump-attention-economy (visited on 2023-12-08).

      This article explains how social media rewards whatever grabs attention, even if it's shocking or negative. It connects really well with this chapter's idea of selection: the posts that get the most reactions are the ones that survive and spread. I thought it was interesting but also a bit depressing that success online often means being louder, not smarter. It made me realize how easily the attention economy can shape what we see and believe.

    7. Monica Lewinsky. December 2023. Page Version ID: 1187944516. URL: https://en.wikipedia.org/w/index.php?title=Monica_Lewinsky&oldid=1187944516 (visited on 2023-12-08).

      I think Monica Lewinsky is a great example of virality. Before her scandal of being rumored to have had an affair with U.S. president Bill Clinton, no one really knew her. She was just a white house intern. But after news broke out and theories started to spread, her life changed drastically to constantly being ridiculed by the press and public. Although many years have passed, and she's now an activist for women's rights, people still reference the meme of her "being under the desk" to this day.

    1. A meme is a piece of

      In my experience, memes nowadays are kind of like trends. People will put the hottest influencers in memes, or some funny images or words. A particular type of meme may become trendy for a period of time, but meme trends change very quickly. A new meme trend might emerge roughly every month. People frequently showcase currently popular memes on social media. For example, the Madagascar penguin is a very popular meme on Chinese social media right now; it's both funny and well-known.

    2. A meme is a piece of culture that might reproduce in an evolutionary fashion, like a hummable tune that someone hears and starts humming to themselves, perhaps changing it, and then others overhearing next. In this view, any piece of human culture can be considered a meme that is spreading (or failing to spread) according to evolutionary forces. So we can use an evolutionary perspective to consider the spread of:

      This reminds me of something quite silly, but I think it's worth mentioning. While this term was later adapted to refer to what we today call a meme, it was still in use under its earlier definition before, and it did circulate through media, which made that media retroactively very comedic through the redefining of the word meme. My favorite example of this is the 2013 game Metal Gear Rising: Revengeance, which has a plot point revolving around how the only things that truly matter to a person's self and decisions are memes and the ideas that their culture passes on to them. But with our modern definition, all the thoughtful speeches throughout the game become unintentionally very funny.

    3. Since genes contained information about how organisms would grow and live, then biological evolution could be considered to be evolving information. Dawkins then took this idea of the evolution of information and applied it to culture, coining the term “meme” (intended to sound like “gene”

      Before reading this chapter I had never tied memes to biological evolution, and I didn't know that the term "meme" comes from "gene." It is very interesting to me how memes spread just like an evolutionary process. Memes spread super easily, and people edit them and share them in so many different ways while still keeping the main theme. It is very interesting how people find different ways to express themselves on the internet with different memes.

    4. the 1976 book The Selfish Gene [l3], evolutionary biologist Richard Dawkins[1] said rather than looking at the evolution of organisms, it made even more sense to look at the evolution of the genes of those organisms (sections of DNA that perform some functions and are inherited). For example, if a bee protects its nest by stinging an attacking animal and dying, then it can’t reproduce and it might look like a failure of evolution. But if the gene that told the bee to die protecting the nest was shared by the other bees in the nest, then that one bee dying allows the gene to keep being replicated, so the gene is successful evolutionarily. Since genes contained information about how organisms would grow and live, then biological evolution could be considered to be evolving information. Dawkins then took this idea of the evolution of information and applied it to culture, coining the term “meme” (intended to sound like “gene” [l4]). A meme is a piece of culture that might reproduce in an evolutionary fashion, like a hummable tune that someone hears and starts humming to themselves, perhaps changing it, and then others overhearing next. In this view, any piece of human culture can be considered a meme that is spreading (or failing to spread) according to evolutionary forces. So we can use an evolutionary perspective to consider the spread of: technology (languages, weapons, medicine, writing, math, computers, etc.), religions, philosophies, political ideas (democracy, authoritarianism, etc.), art, organizations, etc. We can even consider the evolutionary forces that play in the spread of true and false information (like an old saying: “A lie is halfway around the world before the truth has got its boots on.” [l5]) [1] While we value Dawkin’s contribution to evolutionary theory, we don’t want to make this an endorsement of any of his later statements or views. 

      Reading this section about The Selfish Gene really amazed me. I never realized how deeply connected biology and culture could be. The idea that memes evolve in the same way as genes made me think differently about how fast ideas spread on social media today. Personally, I find it both exciting and a little scary—exciting because creativity can spread so quickly, but scary because misinformation can too. It made me realize how powerful our sharing behavior is in shaping modern “evolution.”

    1. Much of the internet has developed a culture of copying without necessarily giving attribution to where it came from. Often, unlike with Elon Musk, this copying also involves modifying the content, recontextualizing the content to give it new meaning, or combining it with other content

      Reading this section made me think about how normalized copying has become online. Platforms like TikTok, Twitter, and even meme pages thrive on remixing and reposting, but most people never think about who originally made something. Personally, I’ve shared memes and gifs without even realizing they came from artists who might want credit. I think Confucius’s idea of “li”—doing what’s proper and respectful—applies here: giving credit isn’t just a rule, it’s a way of showing respect for the creator and the community.

      At the same time, I agree with Michael Wesch’s point that remixing can be a form of cultural expression and creativity, not just theft. It’s tricky, though, when remixing turns into cultural appropriation—like when certain slang or imagery from Black culture is taken and used for jokes by people outside the culture. I think the line between cultural exchange and appropriation comes down to intent and respect. If you’re sharing something to appreciate and understand, that’s exchange. But if it’s just for clout or laughs, it’s exploitation.

      This section really made me rethink how I use memes and social media. I’m going to start paying more attention to where things come from—and maybe even give credit when I can, even if it’s just a tag or mention.

    2. For example, many phrases from Black American culture have been appropriated by white Americans

      Memes are a great way to get insight into the culture of a people, but what is important is that it is those people making memes about themselves. Looking at memes made by Black people that are poking fun or making references to Black culture can be a great way to better understand Black culture. What is problematic is white people making memes about Black culture, because that is not coming from a place of deep understanding and critique but from prejudice. It's kind of like doing accents: if you as a white person do an impression of a marginalized group, it better be really damn good or it's gonna be racist. And if you can do a really good impression, it's likely that you are coming from a place of deep understanding of the culture because you have spent a lot of time with its people or something similar.

    3. How do you think attribution should work when copying and reusing content on social media (like if you post a meme or gif on social media)?

      I think when people copy and reuse content on social media, the original creator should be given credit too. It provides them the recognition and popularity that is so well deserved and earned. Besides giving credit, there should be a watermark feature for their content to ensure they are credited. Oftentimes original creators aren't given credit, so other big creators take their work and profit off of it.

    1. they often or always found their English teacher easy to understand when they spoke English

      Teachers mostly used English, but would switch to Norwegian for clarity.

    2. The status of English in the world is increasingly characterised by those who use it as a second or additional language, rather than by its native speakers (Jenkins, 2015). English is not only spread globally, but also appropriated locally (Mufwene, 2010). Simultaneously, researchers have raised concerns regarding the use of English at the expense of other languages as well as the lack of inclusion of students’ existing language resources in the classroom (TESOL Quarterly, Vol. 54, No. 4, December 2020)

      They are challenging English only teaching as they argue for balance.

    1. Users can also create intentionally bad or offensive content in an attempt to make it go viral (which is a form of trolling). So when criticism of this content goes viral, that is in fact aligned with the original purpose.

      I find this part very interesting. The concept of "cringe" in the 2010s was so prevalent, but now the lines are so blurred, as content many people enjoy can also be widely mocked as cringe. And on top of that, intentionally cringe content (which often goes viral and gets lots of success) makes this distinction even more difficult. But is this a good thing? It facilitates more diverse discourse and leads us away from homogenous thinking and opinions, as now there is disagreement. Which I think might actually be a good thing for internet culture.

    1. Additionally, content can be copied by being screenshotted, or photoshopped. Text and images can be copied and reposted with modifications (like a poem about plums [l17]). And content in one form can be used to make new content in completely new forms, like this “Internet Drama” song whose lyrics are from messages sent back and forth between two people in a Facebook Marketplace:

      As someone who uses social platforms and watches how memes / posts spread, I’ve observed that sometimes the version that goes viral isn’t the original but a mutated one (someone adds a caption, remix, or cross-posts to another network). The chapter’s point that inheritance matters jumped out: once a variation exists and spreads with the change, future copies carry that change. That resonates with seeing e.g. a tweet being quote-retweeted, then everyone repeats the quote-tweet version, not the original tweet.

    2. As we said before, evolution occurs when there is: replication (with inheritance), variations or mutations, and selection[1], so let’s look at each of those.

      I think that this comment is very interesting, and I would like to add my personal experience with the mutation of social media memes and how they evolved into what memes are today. Memes have evolved from one another and continue to do so every day; an example of this is "brainrot," a form of meme that first began around 2020 and has kept evolving to stay relevant today.

    3. There are ways of duplicating that are built into social media platforms: Actions such as: liking, reposting, replying, and paid promotion get the original posting to show up for users more Actions like quote tweeting, or the TikTok Duet feature let people see the original content, but modified with new context. Social media sites also provide ways of embedding posts in other places, like in news articles

      The ease at which things are able to duplicate and spread across the internet is what creates trends online. Being able to show your friends a funny video allows for more eyes to be on a video which leads to the algorithm picking it up and getting more people to see it.

    4. 12.3.1. Replication (With Inheritance)# For social media content, replication means that the content (or a copy or modified version) gets seen by more people. Additionally, when a modified version gets distributed, future replications of that version will include the modification (a.k.a., inheritance). There are ways of duplicating that are built into social media platforms: Actions such as: liking, reposting, replying, and paid promotion get the original posting to show up for users more Actions like quote tweeting, or the TikTok Duet feature let people see the original content, but modified with new context. Social media sites also provide ways of embedding posts in other places, like in news articles There are also ways of replicating social media content that aren’t directly built into the social media platform, such as: copying images or text and reposting them yourself taking screenshots, and cross-posting to different sites

      I like how this section compares social media to evolution, and it actually makes a lot of sense. A post can reproduce when people share it, mutate when someone adds a caption or emoji, and then survive if it goes viral. It's funny but a little scary to realize how fast ideas can change and spread online. Sometimes the meaning of the original post completely disappears after being shared so many times. It made me think that maybe memes evolve faster than anything in nature!

    5. Finally, social media platforms use algorithms and design layouts which determine what posts people see. There are various rules and designs social media sites can use, and they can amplify human selection (including coordinated efforts like astroturfing) in various ways. They can do this through recommendation algorithms as we saw last chapter, as well as choosing what actions are allowed and what amount of friction is given to those actions, as well as what data is collected and displayed.

      I like how the chapter uses evolution to explain virality, but the “selection” part on social media feels more like artificial selection than natural. Platforms kinda breed certain traits on purpose (or at least by design): short, remix-able, high-arousal posts travel farther because the UI + metrics reward them. Remove visible like counts or add one extra click to repost and suddenly the “fitness” of outragey jokes drops—this isn’t nature, it’s a product decision, tbh. That ties back to algorithm ranking from last week: ranking isn’t a mirror, it’s a selector that shapes what even exists to be copied. So my question is: if platforms act as the main selector, how much responsibility do they own for which memes win and which basically go extinct?

    6. Actions such as: liking, reposting, replying, and paid promotion get the original posting to show up for users more Actions like quote tweeting, or the TikTok Duet feature let people see the original content, but modified with new context. Social media sites also provide ways of embedding posts in other places, like in news articles

      I thought it was really interesting how the chapter compared social media to evolution. It made me realize that posts kind of “evolve” too, like when people add comments, quote tweets, or make new versions of memes. I see this a lot on TikTok, where one simple video turns into so many different versions as people keep adding their own twist. It's interesting how creative that can be, but also a little scary, because no one really controls where it goes, and sometimes it ends up spreading negative stuff or misinformation.

    1. The sense of deep time that the Anthropocene evokes and that the novel explicitly weaves into its historical narration of the Sundarbans region adds a new dimension to The Hungry Tide’s representation and reconciliation of the transcultural conflict between Western environmentalism and subaltern refugee agency.5 That is, it suggests that tensions between concerns of biodiversity loss and social injustice in the Sundarbans are part of a planetary crisis of agency unfolding over a much longer time period—both forward and backward—than that of colonization and decolonization. Addressing such tensions thus requires a longer temporal perspective capable not only of understanding the history [End Page 641] of colonialism, environmentalism, and globalization that conditioned events like the Morichjhãpi massacre, but also of anticipating the increasing agential challenges climate and geology will pose in cases of forced migration in South Asia.

      This passage significantly advances the essay's core argument by incorporating the concepts of deep time and the Anthropocene into the analysis of Amitav Ghosh's The Hungry Tide. It argues that the geological timescale evoked by the Anthropocene, which the novel weaves into its historical narrative of the Sundarbans, adds a vital new dimension to the novel's central conflict.

      Specifically, the author claims that framing the transcultural conflict between Western environmentalism and subaltern refugee agency in the Sundarbans within deep time suggests that these tensions are not merely historical (colonialism vs. decolonization), but are part of a broader, planetary crisis of agency unfolding across immense temporal scales, both past and future.

      This perspective implies that concerns over biodiversity loss and social injustice are fundamentally linked at the level of planetary change. Consequently, addressing these complex tensions such as the historical trauma of the Morichjhapi massacre requires a "longer temporal perspective." This expanded view is necessary to fully grasp the history that conditioned past events and, critically, to anticipate the increasing agential challenges that geology and climate change will pose to cases of forced migration in South Asia in the future.

    2. In this essay, I address an additional set of concerns and conciliatory gestures that The Hungry Tide models and that have been little discussed in scholarship on the novel but have burgeoned in postcolonial ecocriticism concerning climate change and the Anthropocene. Namely, I argue that the novel demonstrates the political value of a utopian approach to refugee agency in South Asia under conditions of climate-induced migration.

      This passage states the thesis of an academic essay analyzing Amitav Ghosh’s novel, The Hungry Tide. The author frames their argument within recent scholarship on postcolonial ecocriticism, specifically addressing climate change and the Anthropocene, concerns previously underexplored in scholarship on the novel.

      The central claim is that the novel demonstrates the political value of a utopian approach to refugee agency in South Asia, particularly for populations facing climate-induced migration. This focus shifts critical attention to how the text models imaginative, hopeful solutions for empowerment and survival, moving beyond discussions solely focused on ecological degradation and conflict.

    1. When physical mail was dominant in the 1900s, one type of mail that spread around the US was a chain letter [l7]. Chain letters were letters that instructed the recipient to make their own copies of the letter and send them to people they knew. Some letters gave the reason for people to make copies might be as part of a pyramid scheme [l8] where you were supposed to send money to the people you got the letter from, but then the people you send the letter to would give you money. Other letters gave the reason for people to make copies that if they made copies, good things would happen to them, and if not bad things would, like this:

      I think this is interesting because it reminds me of copypastas that can be found on the internet. Sometimes, there will be a TikTok in my feed that is of the same nature, urging people to repost and use the audio for good luck. I did not know chain letters were a thing and it's really interesting to see how they are carried over in the digital age.

    2. Sourdough bread is made by baking something called a “starter,” which is a mix of flour, water, and a colony of microorganisms (like yeast).

      I like the analogy of sourdough for the Internet. It seemed silly to me at first, but it actually is quite accurate. The sourdough starter grows and develops over time. In the same way, a meme or an online joke starts with one user or one event, and morphs depending on who interacts with it. People can put viral topics in new contexts and give them new light. The sourdough starter can be used to make multiple different loaves.

  6. social-media-ethics-automation.github.io
    1. In what ways have you experienced going viral?

      I had an interesting experience of going viral on TikTok during covid, when we were all locked indoors, and I will never forget it. I was always a bit obsessed with going viral during covid, as any middle schooler at the time was. It was right when the video game Among Us was going viral itself, and I decided to try to benefit off of that. I played the game a lot and really enjoyed it, so I created a fresh TikTok account that would post funny Among Us content. Videos would be 60 seconds of my gameplay along with funny sound effects over it, and my videos went pretty viral. I worked up to 170 thousand followers and a total of around 5 million likes and even more views. It was a very fun but also stressful experience, because once I reached that viral status, I was constantly worried about keeping it and not going down in views.

    1. On Monday or Tuesday, the Ministers of the Interior of the states are coming to a meeting about the SA. I have no doubt that we will master it – one way or the other. I think we have already drawn its poisonous fangs. One can made good tactical use of the endless declarations of legality made by the SA leaders, which they have handed to me in thick volumes. The SA is thereby undermining its credibility. But there are still difficult weeks of political maneuvering until the various Landtag elections are over. Then, one will have to start working towards making the Nazis acceptable as participants in a government because the movement, which will certainly grow, can no longer be suppressed by force. Of course the Nazis must not be allowed to form a government of their own anywhere, let alone in the Reich. But in the states an attempt will have to be made here and there to harness them in a coalition and to cure them of their utopias by constructive government work. I can see no better way, for the idea of trying to destroy the Party through an anti-Nazi law on the lines of the old anti-Socialist law I would regard as a very unfortunate undertaking. With the SA of course it is different. They must be eliminated in any event, and ideally the so-called Iron Front as well. [ . . . ] Source of English translation: Jeremy Noakes and Geoffrey Pridham, eds., Nazism 1919-1945, Vol. 1,The Rise to Power 1919-1934. Exeter: University of Exeter Press, 1998, pp. 98-99

      Point 3 primary source for references

    1. Both genetic and environmental factors are considered as important contributors to the development and progression of this disorder. The environmental factors have been linked to changes in gene expression through epigenetic modulations,

      This sets up the reason epigenetics matters in schizophrenia: not just genetic inheritance but modifiable environmental impacts shape risk and progression. Epigenetic mechanisms such as DNA methylation, ncRNA regulation, and histone modification serve as bridges between what you inherit and what you experience.

    1. Ok. We agree that all speakers have the riglanguage variety or style they prefer. But speakconsequences of their choices. ALL speakerstigmatized language style will be subject toappropriateness of language use in context, anStates are especially subject

      I think that a standard English is quite unreasonable for the average person to use consistently. Even in professional situations, when I have broken out of that mold, even for a joke or passing comment, it not only humanizes me but connects me to others, even when I say something standard English would not deem acceptable.

    2. e agree with Gerald Graff who notes that "Young's argumentsleave a number of questions unanswered" and asks: "What does compe-tent code-meshing look like in student writing and speaking, and how willteachers determine the difference between successful and effective code-meshing and awkwardly cobbled together mixes of formal and vernacularEnglish?" (16

      I think this does complicate some ideas. I think one can successfully code-mesh, but much of the success is dependent on understanding one's audience and whether they will understand the purpose of the code-meshing. I often speak to friends and family in multiple dialects, and I determine how and when to do so based on my relationship with and knowledge of my audience. However, in education, such as grading, how does one differentiate bad writing versus code-meshing? I believe this can be determined from the usage and context, such as whether the code-meshing adds a stronger argument or foundation for one's work.

    3. ecause Wheeler isof vernacular-speakInstead, she talks abof the home to theget the message acrlin

      This issue is not about race, but about encouraging students and nurturing their language skills.

    4. In a diglossie community, a speaker may use osuch as banking but use a different variety forway, the varieties fill what are described as 'hisociety. High functions tend to be associated wlegal and governmental processes, and the highpoetry. Low functions are associated with the hfamily communicatio

      This is what I have been referring to as professional versus personal: many people will change their dialect depending on their situation, sometimes without meaning to. I have often used the so-called 'customer service voice' at jobs and the like, without even thinking about it consciously. I do make this switch to make others around me more comfortable, or in some cases, to lessen aggression from others. By changing my tone and word choices, it is easier to navigate certain people and their perceptions of how a service worker should speak and act.

    5. Instead of 'correcting' student writing, teachers lead studentsand contrast the grammar of the home to the grammar of torder to be able to consciously choose the language patternsetting (see fig. 4). Often (but not always) in school, studenStandard English. Often (but not always) in narrative anwriting, students choose vernacular to create voi

      When taught the differences between dialects, compared to only being taught a standard dialect, students are given the option to choose for themselves, showing their own competence and opinions. This shows that many choose standard for more professional cases, and their own dialect for creative freedom and expression. By embracing their unique dialects, students are able to learn more comprehensively, and have shown improvement in their reading and writing capabilities, versus being forced to use only one dialect. Beyond that, stopping students from using their home dialect can also cause them to lose their sense of self and distance them from their communities.

    6. She and others (Macto teach '"neutral ssystem," but instethat . . . trains [Afrand cult

      The goal is not to stop students from using their preferred dialects, but to use them as a tool to continue to teach students grammar, spelling, and reading comprehension.

    7. ls. Yet code-switching bidialectalists and some code-meshingproponents appear at odds over the role of Standard English in educationand on the national terrain. Specifically, Vershawn Ashanti Young hasrecently slammed code-switching for its "inherent racism" and its advo-cates for "translating] the racist logic of early twentieth century legalsegregation into a linguistic logic that undergirds twenty-first centurylanguage instruction" ("Nah"'

      Code-switching/meshing draws mixed opinions, especially in education. Is labeling this switch as 'code-meshing' racist? It is undeniable that people will speak different dialects depending on many factors, not just race, but also class, location, gender, etc.

    1. If the first woman God ever made was strong enough to turn the world upside down all alone, these women together ought to be able to turn it back , and get it right side up again! And now they is asking to do it, the men better let them.

      This line really stood out to me because it captures both empowerment and unity. Sojourner Truth reminds her audience that women’s strength has always been transformative, Eve “turned the world upside down,” and now women collectively have the power to restore justice and balance. I find this message timeless because it speaks not only to women’s resilience but also to the importance of solidarity in achieving equality. Truth’s words challenge the idea that women are passive or fragile; instead, she reframes them as powerful agents of change who can reshape society when they work together.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Shukla et al. described the "chromatin states" in the bryophyte Marchantia polymorpha and compared them with those in Arabidopsis thaliana. They described the generally common features of chromatin states between these evolutionarily distant plant species, but they also found some differences. The authors also studied the connection between chromatin states and TF binding, mostly in Arabidopsis due to the scarcity of TF binding data in Marchantia. Their analyses lead to the interesting finding that specific transcription factor families tend to associate with specific chromatin states, which in turn tend to associate with specific genomic regions such as the promoter, TSS, gene body, and facultative heterochromatin. Overall, the authors provide a novel piece of information regarding the evolutionary conservation of chromatin states and the relationship between chromatin states and TFs.

      Major comments:

      1. In the end of the abstract they state "The association with the +1 nucleosome defines a list of candidate pioneer factors we know little about in plants", which is one of their major points. This is based on the results in Fig. 4F and 4G, described in P27 L16-17. The question is, are cluster 1 TFs really associated with the +1 nucleosome? From Fig. 1C, the +1 nucleosome is characterized mostly by the E1 state and also by E2, F3, and F4. However, from Fig. 4F, cluster 1 TFs are not associated with E1/E2 and the association is not particularly strong for F3/F4. Indeed, the association with E1/E2 is much more conspicuous for cluster 4 TFs. Therefore, the authors should reconsider this point and consider rephrasing or showing further results of their analyses.

      2. P17 last line to P18, they state "The facultative heterochromatin states were primarily associated with the intergenic states I1 to I3, based on their enrichment in H3K27me3 and H2AK121ub, low accessibility, and low gene expression". I'm not sure about this statement. How can they say "primarily associated" from the data they cite? As far as the PTM and variant patterns go, I1 to I3 and facultative heterochromatin look different. The authors should explain more or rephrase.

      3. P20 L15, the authors state "Contrary to Arabidopsis, the promoters of Marchantia defined by the region just upstream of the TSS showed enrichment of H2AUb and the elongation mark H3K36me3, along with other euchromatic marks. " I have a concern that the TSS annotation could be inaccurate in Marchantia compared to more rigorously tested annotation of Arabidopsis thaliana, so that the relationship between TSS and histone PTMs could be different between species. The authors should make sure this is not the case.

      4. P21 last line to P22, they analyzed only H3K27me3 and H2Aub in the mutants of E(z) (Fig. 2E) and state that "we analyzed chromatin landscape in the Marchantia...". Is analyzing two histone marks enough to say "chromatin landscape"? In addition, they state "These findings suggest a strong independence of the two Polycomb repressive pathways in Marchantia." However, they did not analyze the effect of loss of PRC1 on H3K27me3; the opposite way. Actually, in Arabidopsis loss of PRC1 causes loss of H2Aub AND H3K27me3 (Zhou et al (2017) Genome Biol: DOI 10.1186/s13059-017-1197-z).

      5. Related to the above comments, they state "To further compare the regulation by PRC2 in both species,". However, they did not describe the knowledge about regulation by PRC2 in Arabidopsis. They should consider describing it.

      6. P25 L14: "With this method to estimate TF activity, the scores of TF occupancy and activity converged. To look at different patterns of chromatin preferences among TFs, we kept ChIP-seq and DAP-seq data for ~300 TFs in Arabidopsis (after filtering out TFs with low scores of occupancy and activity)." This part is a little hard to follow. Perhaps better to explain in more detail.

      7. In the discussion section P30 L19-21: "This could be due to open chromatin, which is associated with highly expressed genes and permissive for TF binding, generating highly occupied target regions (HOT) with redundant or passive activity (19)." This part needs further explanation; especially for the latter part, it's not clear what the authors claim.

      Minor comments:

      1. P17 L21: H2bUb should be H2Bub.

      2. Legend of Fig. 4D: later should be latter.

      3. Legend of Fig. 4G and H: "clusters defined in figure-H" should be "defined in Fig. 4F"?

      Referee cross-commenting

      Reviewer #1 raises thorough and important points that should be addressed before the manuscript is published. Particularly regarding the comparison of chromatin states between Arabidopsis and Marchantia: as this paper will lay the foundation for further research and serve as a resource for the community, the authors should thoroughly look into the points raised by reviewer #1, including the annotation of transcriptional units.

      Significance

      Strength and limitation: The strength of this paper is the insight into chromatin-based transcriptional regulation gained by defining chromatin states using a combination of many epigenome datasets and comparing them with TF binding data. A limitation is the lack of experimental support for their interesting claims, for example by perturbing histone PTMs. Also, a limitation is that comparing only two species can only yield a subjective "similar" or "different" between species.

      Advance compared to past literature: One clear advance is studying chromatin states in a plant other than Arabidopsis thaliana. Another is revealing that TFs can be classified into a number of groups according to their relationships with chromatin-based transcription regulation. However, experimental tests for these are awaited.

      Audience: Epigenetics, chromatin, and transcription researchers, plant biologists interested in transcriptional regulation.

      My expertise: Epigenome, genetics, histone PTMs, plants

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors characterize chromatin states in the flowering plant Arabidopsis thaliana and the bryophyte Marchantia polymorpha. Here, they draw from ChIP-seq data that was previously published, and from data generated as part of this study, in particular for Marchantia H2A variants (H2A.X.1, H2A.X.1, H2A.Z, H2A.M.2). The authors compute chromatin states, which enables a comparison over more than 450 million years of land plant evolution. While comparisons of plant chromatin to other species highlighted conservation as well as differences, this study targets a knowledge gap by evaluating chromatin conservation during land plant evolution. The authors investigate a connection between transcription factor binding sites and chromatin states. They propose a list of candidate pioneer factors associating with the +1 nucleosome.

      Major comments:

      • For the association of chromatin states with expression, the authors use the TAIR10 annotation for extracting TSSs and promoter sequences. When investigated, a comparison of TSS-resolving data with this annotation (or Araport11) shows a pretty poor overlap between the TSSs based on TAIR10/Araport11 and experimentally derived TSSs. This information was captured in Arabidopsis genome annotation files where the experimental TSS matches the genome annotation. What is the advantage of using an annotation with the inaccurate TSSs in TAIR10? It seems to confound the study.

      • The TSS annotation in Marchantia polymorpha (Tak1 v7.1) may also match poorly to the experimentally derived TSSs. I suggest that the authors generate data to detect TSSs in their tissue of choice and compare the positions to the genome annotation they use (e.g., PMID: 38831668).

      • I am not convinced that it is a wise choice to utilize fewer ChIP-seq data in Marchantia than Arabidopsis. Can the missing Marchantia ChIP-seq experiments not be performed and included to complete the comparison?

      • P. 26 onwards, the authors investigate different TF clusters and their association with chromatin states. They state "cluster 1 TFs primarily associated with the first nucleosome downstream of the TSS". However, if the gene is not really expressed in these "leaf" tissues, then how can the authors be sure that the same TSS position would be used in "flower" tissue? It could be an artifact of a genome annotation file that misses flower-tissue TSS data. It is not an obvious conclusion to name these factors "pioneer TFs". Experiments testing this are missing as far as I can gather.

      Minor comments:

      • Can the authors add files (e.g., .bed) with their segmented chromatin states as part of their GEO submission? That could improve the impact and make the findings more accessible.

      • Can the authors rule out issues with the Marchantia annotation, for example missing read-through transcription or alternative isoforms, that would essentially have the effect that the genomic segmentation they use contains elongating upstream transcripts in front of promoter TSSs? This could be an alternative explanation for the enrichment of H2AUb/H3K36me3 just upstream of the TSSs as they describe on p.21. If it can't be ruled out, the limitations arising from genome annotations, and examples offering improvements, could be highlighted in the discussion. This may also be supported by the long persistence of E4 after the TTS (p.23).

      • P.23 - "This further suggests that in Marchantia, the orientation of genes defines distinct chromatin environment in their vicinity, through mechanisms yet to be uncovered." Does this correlate with the distance of the closest (annotated) transcript pairs?

      • The E1 state highlighted on p.24 and discussed for Fig. 3A/D is not annotated in Fig. 3A/D. It is also not clear in the legends which number it is.

      • P.30 - "The marks H3K4me1 and H3K36me3 reflecting transcriptional elongation and confined to the gene bodies in Arabidopsis, extend beyond the TTS in Marchantia, suggesting that signals for transcriptional termination differ between flowering plants and bryophytes." There are multiple alternative explanations, likely a combination of missing transcripts in their genome annotation (e.g. lncRNAs), annotation errors (e.g. wrong ends), and the segmentation of these regions (e.g. the transcripts are closer than in Arabidopsis). The discussion could be extended significantly to address these issues and include the efforts to improve the genome annotations.

      Referee cross-commenting

      Reviewer #2 raises fair and valuable questions.

      Significance

      Significance: The authors corroborate prior chromatin state analyses in Arabidopsis and provide a chromatin state analysis for Marchantia. These data represent a resource that will be used and appreciated by the plant and ChromEvoDevo communities. The quality of the analyses is high and the description is transparent. I am not aware of a similar study comparing a bryophyte and a flowering plant, so this study addresses a gap in knowledge.

      General assessment: The quality of the manuscript is high. The analyses are described well, and in sufficient detail to be understood. The effort going into documentation is high; I rate the study as reproducible. The linked GitHub deposition looks good. The data generated as part of this study are available in the linked GEO deposition. An experimental design of 2 biological repeats is used, which is OK, but at the lower limit. The GEO-deposited .bw files should be of interest to the ChromEvoDevo community, and researchers interested in Marchantia epigenetics and gene expression. The manuscript is written clearly and to the point. The figures condense a lot of data and match the text. The figures are rather complex and not easily accessible to someone browsing through a journal issue. However, that is fine for these types of papers. The manuscript is strong on data analysis. Other approaches, for example mutants to validate their hypotheses, are not utilized. The calculation of chromatin states offers a way to condense complex information into simpler terms. Nevertheless, it re-organizes information that largely existed before. To me, the biggest value of this study is as a resource that calculated the chromatin states in a comparable fashion between organisms.

      Advance: The manuscript provides several advances. It provides new ChIP-seq data for Marchantia, it generates a chromatin state map for Marchantia, it compares chromatin state maps across a large evolutionary distance, and it generates a new hypothesis regarding pioneer TFs in plants. Some of the points described in the article hold true for even larger evolutionary distances, for example comparing plants to yeast and metazoans. The manuscript fills a knowledge gap and offers a comparison via the computation of comparable chromatin states.

      Audience: The audience will be colleagues interested in chromatin and epigenetics, the Marchantia and plant communities as well as researchers interested in EvoDevo of chromatin organization. Even though the study uses plant models, it is highly relevant for non-plant models.

    1. Reviewer #1 (Public review):

      This thoughtful and thorough mechanistic and functional study reports ARHGAP36 as a direct transcriptional target of FOXC1, which regulates Hedgehog signaling (SUFU, SMO, and GLI family transcription factors) through modulation of PKAC. Clinical outcome data from patients with neuroblastoma, one of the most common extracranial solid malignancies in children, demonstrate that ARHGAP36 expression is associated with improved survival. Although this study largely represents a robust and near-comprehensive set of focused investigations on a novel target of FOXC1 activity, several significant omissions undercut the generalizability of the findings reported.

      (1) It is notable that the volcano plot in Figure 1a does not show evidence of canonical Hedgehog gene regulation, even though the subsequent studies in this paper clearly demonstrate that ARHGAP36 regulates Hedgehog signal transduction. Is this because canonical Hedgehog target genes (GLI1, PTCH1, SUFU) simply weren't labeled? Or is there a technical limitation that needs to be clarified? A note about Hedgehog target genes is made in conjunction with Table S1, but the justification or basis of defining these genes as Hedgehog targets is unclear. More broadly, it would be useful to see ontology analyses from these gene expression data to understand FOXC1 target genes more broadly. Ontology analyses are included in a supplementary table, but network visualizations would be much preferred.

      (2) Likewise, the ChIP-seq data in Figure 2 are under-analyzed, focusing only on the ARHGAP36 locus and not more broadly on the FOXC1 gene expression program. This is a missed opportunity that should be remedied with unbiased analyses intersecting FOXC1 peaks with differentially expressed genes from the RNA-sequencing data displayed in Figure 1.

      (3) RNA-seq and ChIP-seq data strongly suggest that FOXC1 regulates ARHGAP36 expression, and the authors convincingly identify genomic segments at the ARHGAP36 locus where FOXC1 binds, but they do not test if FOXC1 specifically activates this locus through the creation of a luciferase or similar promoter reporter. Such a reagent and associated experiments would not only strengthen the primary argument of this investigation but could serve as a valuable resource for the community of scientists investigating FOXC1, ARHGAP36, the Hedgehog pathway, and related biological processes. CRISPRi targeting of the identified regions of the ARHGAP locus is a useful step in the right direction, but these experiments are not done in a way to demonstrate FOXC1 dependency.

      (4) It would be useful to see individual fluorescence channels in association with images in Figure 3b.

      (5) Perhaps the most significant limitation of this study is the omission of in vivo data, a shortcoming the authors partly mitigate through the incorporation of clinical outcome data from pediatric neuroblastoma patients in the context of ARHGAP36 expression. The authors also mention that high levels of ARHGAP36 expression were also detected in "specific CNS, breast, lung, and neuroendocrine tumors," but do not provide clinical outcome data for these cohorts. Such analyses would be useful to understand the generalizability of their findings across different cancer types. More broadly, how were high, medium, and low levels of ARHGAP36 expression identified? "Terciles" are mentioned, but such an approach is not experimentally rigorous, and RPA or related approaches (nested rank statistics, etc.) are recommended to find optimal cutpoints for ARHGAP36 expression in the context of neuroblastoma and "specific CNS, breast, lung, and neuroendocrine" tumor outcomes.

    1. Not much holding them together. So far this essay of philosophizing mixed with examples might make you think that I let my students write anything they want and that I’m encouraging you, as well, to write anything you want; in other words, trading rules for freedom. I don’t think writers have to choose one over the other. I don’t think you can. If I try to convince you to write whatever you want, I’m using a traditional strategy for engaging students: your choice, your interests, your whatever. But any writing choice is a choice. At the end of a semester, Adbe Guerrero, a former student, taught me about the positions that expertise and choice occupied in relation to his experiences, my teaching, and one of our later readings

      Charlton argues for experimentation instead of over-focusing on rigid form. I like his honesty that "focus" can limit invention; it reminds me to explore ideas before narrowing. In my personal projects, I also "drift" before finding structure; it's the same creative process. He claims too much focus harms learning, which is true, but I believe some structure helps.

    2. Brittany questioned the form and function of a test, so it made sense for her to try and create one that met her goals. In the end, she created what we might now call an example of high school and college alignment—an exam in high school that might have prepared her for our college writing class. It is wishful thinking, but classmates were prompted to talk about how to approach tests that they needed to take but didn’t agree with, and my colleagues and I learned that alignment discussions can be had among all stakeholders, rather than among teachers and administrators alone.

      Relates to public communication, like adapting a game demo for investors. Could invention projects like this replace traditional essays? This reaffirms that audience awareness develops through experimentation, not memorization.

      (SAYS-DOES) Charlton says Brittany’s creative testing aligns audiences, and this does illustrate authentic transfer of learning.

    3. Q: If you can comprehend difficult material (i.e. Downs & Wardle Article), does that affect your writing capability? Merely a Misconception. From the elementary level to secondary schooling, educators are consistent upon the insistence that their students read more because it will help improve their vocabulary, writing, etc. School districts have even gone as far as instituting incentive programs in order to encourage reading (i.e. Accelerated Reader or A.R.) or otherwise force it on students. However, the question here is, does reading more really help; and if so, does reading more difficult material play a role in one’s writing level? . . . I believe that one’s writing can be improved through reading and that in some part, your reading level does affect your writing capability, but it is not always the case. Different people learn differently. Writing requires practice all on its own in order to better oneself at it and requires the reading of not just more difficult pieces but a multitude of pieces. In order to improve one’s writing one needs to be exposed to different varieties of writing in order to hone the ability of comprehension. Everyone has their own method and style of writing, however no one style of writing is original. It is just like art, an artist can no longer claim their work to be original because everything has been done before. What can be done is to take what others have given us and use it to our advantage; learn from it.

      Brittany tests the link between reading and writing improvement, and later critiques standardized testing. Her evolution from essay to exam design shows creative transfer. Her mock ACT logically aligns testing with actual writing tasks. What grading criteria did she use for her mock exam? Was it ever tested?

    1. But in 999, King Olaf Tryggvason of Norway threatened to cut Iceland off from the Viking trade routes, so the Icelanders threw their idols over the Godafoss (“Waterfall of the Gods”) and converted.

      I’m shocked at how quick they were to cut ties with King Olaf after he threatened them.

    1. Stereoisomers are isomers that differ in spatial arrangement of atoms, rather than order of atomic connectivity

      Stereoisomers have the exact same "blueprint" or "wiring" (connectivity), but they are just different 3D shapes (cis/trans)

    1. wordCloudLine

      self-curated wordCloudLines make everything in line and marked in line

      When documents cease to be enclosures of discrete disconnected islands, separated, decontextualized, and self-standing externalizations of human intellect,

      but inclosures acting as portals

      to the complete transitive closure of bidirectionally, meaningfully, high-resolution linked associative complexes that connect people, ideas, and things into emerging, born-multiplayer, co-laborative, permanent, evergreen, co-evolving spaces of associative complexes

    1. Reviewer #1 (Public review):

      Summary:

      This study set out to investigate potential pharmacological drug-drug interactions between the two most common antimalarial classes, the artemisinins and quinolines. There is a strong rationale for this aim, because drugs from these classes are already widely used in Artemisinin Combination Therapies (ACTs) in the clinic, and drug combinations are an important consideration in the development of new medicines. Furthermore, whilst there is ample literature proposing many diverse mechanisms of action and resistance for the artemisinins and quinolines, it is generally accepted that the mechanisms for both classes involve heme metabolism in the parasite, and that artemisinin activity is dependent on activation by reduced heme. The study was designed to measure drug-drug interactions associated with a short pulse exposure (4 h) that is reminiscent of the short duration of artemisinin exposure obtained after in vivo dosing. Clear antagonism was observed between dihydroartemisinin (DHA) and chloroquine, which became even more extensive in chloroquine-resistant parasites. Antagonism was also observed in this assay for the more clinically-relevant ACT partner drugs piperaquine and amodiaquine, but not for other ACT partners mefloquine and lumefantrine, which don't share the 4-aminoquinoline structure or mode of action. Interestingly, chloroquine induced an artemisinin resistance phenotype in the standard in vitro Ring-stage Survival Assay, whereas this effect was not apparent for piperaquine.

      The authors also utilised a heme-reactive probe to demonstrate that the 4-aminoquinolines can inhibit heme-mediated activation of the probe within parasites, which suggests that the mechanism of antagonism involves the inactivation of heme, rendering it unable to activate the artemisinins. Measurement of protein ubiquitination showed reduced DHA-induced protein damage in the presence of chloroquine, which is also consistent with decreased heme-mediated activation, and/or with decreased DHA activity more generally.

      Overall, the study clearly demonstrates a mechanistic antagonism between DHA and 4-aminoquinoline antimalarials in vitro. It is interesting that this combination is successfully used to treat millions of malaria cases every year, which may raise questions about the clinical relevance of this finding. However, the conclusions in this paper are supported by multiple lines of evidence, and the data are clearly and transparently presented, leaving no doubt that DHA activity is compromised by the presence of chloroquine in vitro. It is perhaps fortunate that the clinical dosing regimens of 4-aminoquinoline-based ACTs have been sufficient to maintain clinical efficacy despite the non-optimal combination. Nevertheless, optimisation of antimalarial combinations and dosing regimens is becoming more important in the current era of increasing resistance to artemisinins and 4-aminoquinolines. Therefore, these findings should be considered when proposing new treatment regimens (including Triple-ACTs) and the assays described in this study should be performed on new drug combinations that are proposed for new or existing antimalarial medicines.

      Strengths:

      This manuscript is clearly written, and the data presented are clear and complete. The key conclusions are supported by multiple lines of evidence, and most findings are replicated with multiple drugs within a class, and across multiple parasite strains, thus providing more confidence in the generalisability of these findings across the 4-aminoquinoline and peroxide drug classes.

      A key strength of this study was the focus on short pulse exposures to DHA (4 h in trophs and 3 h in rings), which is relevant to the in vivo exposure of artemisinins. Artemisinin resistance has had a significant impact on treatment outcomes in South-East Asia, and is now emerging in Africa, but is not detected using a 'standard' 48 or 72 h in vitro growth inhibition assay. It is only in the RSA (a short pulse of 3-6 h treatment of early ring stage parasites) that the resistance phenotype can be detected in vitro. Therefore, assays based on this short pulse exposure provide the most relevant approach to determine whether drug-drug interactions are likely to have a clinically relevant impact on DHA activity. These assays clearly showed antagonism between DHA and 4-aminoquinolines (chloroquine, piperaquine, amodiaquine, and ferroquine) in trophozoite stages. Interestingly, whilst chloroquine clearly induced an artemisinin-resistant phenotype in the RSA, piperaquine did not appear to impact the early ring stage activity of DHA, which may be fortunate considering that piperaquine is a currently recommended DHA partner drug in ACTs, whereas chloroquine is not!

      The evaluation of additional drug combinations at the end of this paper is a valuable addition, which increases the potential impact of this work. The finding of antagonism between piperaquine and OZ439 in trophozoites is consistent with the general interactions observed between peroxides and 4-aminoquinolines, and it would be interesting to see whether piperaquine impacts the ring-stage activity of OZ439.

      The evaluation of reactive heme in parasites using a fluorescent sensor, combined with the measurement of K48-linked ubiquitin, further supports the findings of this study, providing independent read-outs for the chloroquine-induced antagonism.

      The in-depth discussion of the interpretation and implications of the results is an additional strength of this manuscript. Whilst the discussion section is rather lengthy, there are important caveats to the interpretation of some of these results, and clear relevance to the future management of malaria that require these detailed explanations.

      Overall, this is a high-quality manuscript describing an important study that has implications for the selection of antimalarial combinations for new and existing malaria medicines.

      Weaknesses:

      This study is an in vitro study of parasite cultures, and therefore, caution should be taken when applying these findings to decisions about clinical combinations. The drug concentrations and exposure durations in these assays are intended to represent clinically relevant exposures, although it is recognised that the in vitro system is somewhat simplified and there may be additional factors that influence in vivo activity. I think this is reasonably well acknowledged in the manuscript.

      It is also important to recognise that the majority of the key findings regarding antagonism are based on trophozoite-stage parasites, and one must show caution when generalising these findings to other stages or scenarios. For example, piperaquine showed clear antagonism in trophozoite stages, but not in ring stages under these assay conditions.

      The key weakness in this manuscript is the over-interpretation of the mechanistic studies that implicate heme-mediated artemisinin activation as the mechanism underpinning antagonism by chloroquine. In particular, the manuscript title focuses on heme-mediated activation of artemisinins, but this study did not directly measure the activation of artemisinins. The data obtained from the activation of the fluorescent probe are generally supportive of chloroquine suppressing the heme-mediated activation of artemisinins, and I think this is the most likely explanation, but there are significant caveats that undermine this conclusion. Primarily, the inconsistency between the fluorescence profile in the chemical reactions and the cell-based assay raises questions about the accuracy of this readout. In the chemical reaction, mefloquine and chloroquine showed identical inhibition of fluorescence, whereas piperaquine had minimal impact. On the contrary, in the cell, chloroquine and piperaquine had similar impacts on fluorescence, but mefloquine had minimal impact. This inconsistency indicates that the cellular fluorescence based on this sensor does not give a simple direct readout of the reactivity of ferrous heme, and therefore, these results should be interpreted with caution. Indeed, the correlation between fluorescence and antagonism for the tested drugs is a correlation, not causation. There could be several reasons for the disconnect between the chemical and biological results, either via additional mechanisms that quench fluorescence, or the presence of biomolecules that alter the oxidation state or coordination chemistry of heme or other potential catalysts of this sensor. It is possible that another factor that influences the H-FluNox fluorescence in cells also influences the DHA activity in cells, leading to the correlation with activity. It should be noted that H-FluNox is not a chemical analogue of artemisinins. Its activation relies on Fenton-like chemistry, but with an N-O rather than O-O bond, and it possesses very different steric and electronic substituents around the reactive centre, which are known to alter reactivity to different iron sources. Despite these limitations, the authors have provided reasonable justification for the use of this probe to directly visualise heme reactivity in cells, and the results are still informative, but additional caution should be provided in the interpretation, and the results are not conclusive enough to justify the current title of the paper.

      Another interesting finding that was not elaborated by the authors is the impact of chloroquine on the DHA dose-response curves from the ring stage assays. Detection of artemisinin resistance in the RSA generally focuses on the % survival at high DHA concentrations (700 nM) as there is minimal shift in the IC50 (see Figure 2), however, chloroquine clearly induces a shift in the IC50 (~5-fold), where the whole curve is shifted to the right, whereas the increase in % survival is relatively small. This different profile suggests that the mechanism of chloroquine-induced antagonism is different from the mechanism of artemisinin resistance. Current evidence regarding the mechanism of artemisinin resistance generally points towards decreased heme-mediated drug activation due to a decrease in hemoglobin uptake, which should be analogous to the decrease in heme-mediated drug activation caused by chloroquine. However, these different dose-response curves suggest different mechanisms are primarily responsible. Additional mechanisms have been proposed for artemisinin resistance, involving redox or heat stress responses, proteostatic responses, mitochondrial function, dormancy, and PI3K signaling, among others. Whilst the H-FluNox probe generally supports the idea that chloroquine suppresses heme-mediated DHA activation, it remains plausible that chloroquine could induce these, or other, cellular responses that suppress DHA activity.

      The other potential weakness in the current manuscript is the interpretation of the OZ439 clinical data. Whilst the observed interaction with piperaquine and ferroquine may have been a contributing factor, it should also be recognised that the low pharmacokinetic exposure in these studies was the primary reason for treatment failure (Macintyre 2017).

      Impact:

      This study has important implications for the selection of drugs to form combinations for the treatment of malaria. The overall findings of antagonism between peroxide antimalarials and 4-aminoquinolines in the trophozoite stage are robust, and this carries across to the ring stage for chloroquine (but not piperaquine).

      The manuscript also provides a plausible mechanism to explain the antagonism, although future work will be required to further explore the details of this mechanism and to rule out alternative factors that may contribute.

      Overall, this is an important contribution to the field and provides a clear justification for the evaluation of potential drug combinations in relevant in vitro assays before clinical testing.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript by Rosenthal and Goldberg investigates interactions between artemisinins and their quinoline partner drugs currently used for treating uncomplicated Plasmodium falciparum malaria. The authors show that chloroquine (CQ), piperaquine, and amodiaquine antagonize dihydroartemisinin (DHA) activity, and in CQ-resistant parasites, the interaction is described as "superantagonism," linked to the pfcrt genotype. Mechanistically, application of the heme-reactive probe H-FluNox indicates that quinolines render cytosolic heme chemically inert, thereby reducing peroxide activation. The work is further extended to triple ACTs and ozonide-quinoline combinations, with implications for artemisinin-based combination therapy (ACT) design, including triple ACTs.

      Strengths:

      The manuscript is clearly written, methodologically careful, and addresses a clinically relevant question. The pulsing assay format more accurately models in vivo artemisinin exposure than conventional 72-hour assays, and the use of H-FluNox and Ac-H-FluNox probes provides mechanistic depth by distinguishing chemically active versus inert heme. These elements represent important refinements beyond prior studies, adding nuance to our understanding of artemisinin-quinoline interactions.

      Weaknesses:

      Several points warrant consideration. The novelty of the work is somewhat incremental, as antagonism between artemisinins and quinolines is well established. Multiple prior studies using standard fixed-ratio isobologram assays have shown that DHA exhibits indifferent or antagonistic interactions with chloroquine, piperaquine, and amodiaquine (e.g., Davis et al., 2006; Fivelman et al., 2007; Muangnoicharoen et al., 2009), with recent work highlighting the role of parasite genetic background, including pfcrt and pfmdr1, in modulating these interactions (Eastman et al., 2016). High-throughput drug screens likewise identify quinoline-artemisinin combinations as mostly antagonistic. The present manuscript adds refinement by applying pulsed-exposure assays and heme probes rather than establishing antagonism de novo.

      The dataset focuses on several parasite lines assayed in vitro, so claims about broad clinical implications should be tempered, and the discussion could more clearly address how in vitro antagonism may or may not translate to clinical outcomes. The conclusion that artemisinins are predominantly activated in the cytoplasm is intriguing but relies heavily on Ac-H-FluNox data, which may have limitations in accessing the digestive vacuole and should be acknowledged explicitly. The term "superantagonism" is striking but may appear rhetorical; clarifying its reproducibility across replicates and providing a mechanistic definition would strengthen the framing. Finally, some discussion points, such as questioning the clinical utility of DHA-PPQ, should be moderated to better align conclusions with the presented data while acknowledging the complexity of in vivo pharmacology and clinical outcomes.

      Despite these mild reservations, the data are interesting and of high quality and provide important new information for the field.

    3. Reviewer #3 (Public review):

      Summary:

      The authors present an in vitro evaluation of drug-drug interactions between artemisinins and quinoline antimalarials, as an important aspect for screening the current artemisinin-based combination therapies for Plasmodium falciparum. Using a revised pulsing assay, they report antagonism between dihydroartemisinin (DHA) and several quinolines, including chloroquine, piperaquine (PPQ), and amodiaquine. This antagonism is increased in CQ-resistant strains in isobologram analyses. Moreover, CQ co-treatment was found to induce artemisinin resistance even in parasites lacking K13 mutations during the ring-stage survival assay. This implies that drug-drug interactions, not just genetic mutations, can influence resistance phenotypes. By using a chemical probe for reactive heme, the authors demonstrate that quinolines inhibit artemisinin activation by rendering cytosolic heme chemically inert, thereby impairing the cytotoxic effects of DHA. The study also observed negative interactions in triple-drug regimens (e.g., DHA-PPQ-Mefloquine) and in combinations involving OZ439, a next-generation peroxide antimalarial. Taken together, these findings raise significant concerns regarding the compatibility of artemisinin and quinoline combinations, which may promote resistance or reduce efficacy.

      Throughout the manuscript, no combinations were synergistic, which necessitates comparing the claims to a synergistic combination as a control. The lack of this positive control makes it difficult to contextualize the observed antagonism. Including a known synergistic pair (e.g., artemisinin + lumefantrine) throughout the study would have provided a useful benchmark to assess the relative impact of the drug interactions described.

      Strengths:

      This study demonstrates the following strengths:

      (1) The use of a pulsed in vitro assay that is more physiologically relevant than the traditional 48h or 72h assays.

      (2) Small-molecule probes H-FluNox and Ac-H-FluNox to detect reactive cytosolic heme, demonstrating that quinolines render heme inert and thereby block DHA activation.

      (3) Evaluates not only traditional combinations but also triple-drug combinations and next-generation artemisinins like OZ439. This broad scope increases the study's relevance to current treatment strategies and future drug development.

      (4) By using the K13 wild-type parasites, the study suggests that resistance phenotypes can emerge from drug-drug interactions alone, without requiring genetic resistance markers.

      Weaknesses:

      (1) No combinations are shown as synergistic: it could be valuable to have a combination that shows synergy as a positive control (e.g., artemisinin + lumefantrine) throughout the manuscript. The absence of a synergistic control combination in the experimental design makes it more challenging to evaluate the relative impact of the described drug interactions.

      (2) Evaluation of the choice of drug-drug interactions: How generalizable are the findings across a broad range of combinations, especially those with varied modes of action?

      (3) The study would also benefit from a characterization of the molecular basis for the observed heme inactivation by quinolines to support this hypothesis - while the probe experiments are valuable, they do not fully elucidate how quinolines specifically alter heme chemistry at the molecular level.

      (4) Suggestion of alternative combinations that show synergy could have improved the significance of the work.

      (5) All data are derived from in vitro experiments, without accompanying an in vivo validation. While the pulsing assay improves physiological relevance, it still cannot fully capture the complexity of drug pharmacokinetics, host-parasite interactions, or immune responses present in living organisms.

      (6) The absence of pharmacokinetic/pharmacodynamic modeling leaves questions about how the observed antagonism would manifest under real-world dosing conditions.

    4. Author response:

      Reviewer #1:

      We thank the reviewer for their thoughtful summary of this manuscript. It is important to note that DHA-PPQ did show antagonism in RSAs. In this modified RSA, 200 nM PPQ alone inhibited growth of PPQ-sensitive parasites approximately 20%. If DHA and PPQ were additive, then we would expect that addition of 200 nM PPQ would shift the DHA dose response curve to the left and result in a lower DHA IC50. Please refer to Figure 4a and b as examples of additive relationships in dose-response assays. We observed no significant shift in IC50 values between DHA alone and DHA + PPQ. This suggests antagonism, albeit not to the extent seen with CQ. We will modify the manuscript to emphasize this point. As the reviewer pointed out, it is fortunate that despite being antagonistic, clinically used artemisinin-4-aminoquinoline combinations are effective, provided that parasites are sensitive to the 4-aminoquinoline. It is possible that superantagonism is required to observe a noticeable effect on treatment efficacy (Sutherland et al. 2003 and Kofoed et al. 2003), but that classical antagonism may still have silent consequences. For example, if PPQ blocks some DHA activation, this might result in DHA-PPQ acting more like a pseudo-monotherapy. However, as the reviewer pointed out, while our data suggest that DHA-PPQ and AS-ADQ are “non-optimal” combinations, the clinical consequences of these interactions are unclear. We will modify the manuscript to emphasize the latter point.

      While the Ac-H-FluNox and ubiquitin data point to a likely mechanism for DHA-quinoline antagonism, we agree that there are other possible mechanisms to explain this interaction. We will temper the title and manuscript to reflect these limitations. Though we tried to measure DHA activation in parasites directly, these attempts were unsuccessful. We acknowledge that the chemistry of DHA and Ac-H-FluNox activation is not identical and that caution should be taken when interpreting these data. Nevertheless, we believe that Ac-H-FluNox is the best currently available tool to measure “active heme” in live parasites and is the best available proxy to assess DHA activation in live parasites. Both in vitro and in-parasite studies point to a role for CQ in modulating heme, though an exact mechanism will require further examination. Similar to the reviewer, we were perplexed by the differences observed between in vitro and in-parasite assays with PPQ and MFQ. We proposed possible hypotheses to explain these discrepancies in the discussion section. Interestingly, our data correlate well with hemozoin inhibition assays in which all three antimalarials inhibit hemozoin formation in solution, but only CQ and PPQ inhibit hemozoin formation in parasites. In both assays, in-parasite experiments are likely to be more informative for mechanistic assessment.

      It remains unclear why K13 genotype influences RSA values, but not early ring DHA IC50 values. In K13<sup>WT</sup> parasites, both RSA values and DHA IC50 values were increased 3-5 fold upon addition of CQ. This suggests that CQ-mediated resistance is more robust than that conferred by K13 genotype. However, this does not necessarily suggest a different resistance mechanism. We acknowledge that in addition to modulating heme, it is possible that CQ may enhance DHA survival by promoting parasite stress responses. Future studies will be needed to test this alternative hypothesis. This limitation will be acknowledged in the manuscript. We will also address the reviewer’s point that other factors, including poor pharmacokinetic exposure, contributed to OZ439-PPQ treatment failure.

      Reviewer #2:

      We appreciate the positive feedback. We agree that there have been previous studies, many of which we cited, assessing interactions of these antimalarials. We also acknowledge that previous work, including our own, has shown that parasite genetics can alter drug-drug interactions. We will add the reviewer’s recommended citations to the list of references that we cited. Importantly, our work was unique not only for utilizing a pulsing format, but also for revealing a superantagonistic phenotype, assessing interactions in an RSA format, and investigating a mechanism to explain these interactions. We agree with the reviewer that implications from this in vitro work should be cautious, but hope that this work contributes another dimension to critical thinking about drug-drug interactions for future combination therapies. We will modify the manuscript to temper any unintended recommendations or implications.

      The reviewer notes that we conclude “artemisinins are predominantly activated in the cytoplasm”. We recognize that the site of artemisinin activation is contentious. We were very clear to state that our data combined with others suggest that artemisinins can be activated in the parasite cytoplasm. We did not state that this is the primary site of activation. We were clear to point out that technical limitations may prevent Ac-H-FluNox signal in the digestive vacuole, but determined that low pH alone could not explain the absence of a digestive vacuole signal.

      With regard to the “reproducibility” and “mechanistic definition” of superantagonism, we observed what we defined as a one-sided superantagonistic relationship for three different parasites (Dd2, Dd2 PfCRT<sup>Dd2</sup>, and Dd2 K13<sup>R539T</sup>) for a total of nine independent replicates. In the text, we define these isoboles as unique in that they had mean ΣFIC50 values > 2.4 and peak ΣFIC50 values > 4, with points extending upward instead of curving back to the axis. As further evidence of the reproducibility of this relationship, we show that CQ has a significant rescuing effect on parasite survival to DHA as assessed by RSAs and IC50 values in early rings.

      Reviewer #3:

      We thank the reviewer for their positive feedback. We acknowledge that no combinations tested in this manuscript were synergistic. However, two combinations, DHA-MFQ and DHA-LM, were additive, which provides a benchmark for contextualizing the antagonistic relationships. We have previously reported synergistic and additive isobolograms for peroxide-proteasome inhibitor combinations using this same pulsing format (Rosenthal and Ng 2021). These published results will be cited in the manuscript.

      We believe that these findings are specific to 4-aminoquinoline-peroxide combinations, and that these findings cannot be generalized to antimalarials with different mechanisms of action. Note that the aryl amino alcohols, MFQ and LM, were additive with DHA. Since the mechanisms of action of MFQ and LM are poorly understood, it is difficult to speculate on a mechanism underlying these interactions.

      We agree with the reviewer that while the heme probe may provide some mechanistic insight to explain DHA-quinoline interactions, there is much more to learn about CQ-heme chemistry, particularly within parasites.

      The focus of this manuscript was to add a new dimension to considerations about pairings for combination therapies. It is outside the scope of this manuscript to suggest alternative combinations. However, we agree that synergistic combinations would likely be more strategic clinically.

      An in vitro setup allows us to eliminate many confounding variables in order to directly assess the impact of partner drugs on DHA activity. However, we agree that in vivo conditions are considerably more complex, and we explicitly state this.

      We agree that in the future, modeling studies could provide insight into how antagonism may contribute to real-world efficacy. This is outside the scope of our studies.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Nührenberg et al. describe vassi, a Python package for mutually exclusive behavioral classification of social behaviors. This package imports and organizes trajectory data and manual behavior labels, and then computes feature representations for use with available Python machine learning-based classification tools. These representations include all possible dyadic interactions within an animal group, enabling classification of social behaviors between pairs of animals at a distance. The authors validate this package by reproducing the behavior classification performance on a previously published dyadic mouse dataset, and demonstrate its use on a novel cichlid group dataset. The authors have created a package that is agnostic to the mechanism of tracking and will reduce the barrier of data preparation for machine learning, which can be a stumbling block for non-experts. The package also evaluates the classification performance with helpful visualizations and provides a tool for inspection of behavior classification results.

      Strengths:

      (1) A major contribution of this paper was creating a framework to extend social behavior classification to groups of animals such that the actor and receiver can be any member of the group, regardless of distance. To implement this framework, the authors created a Python package and an extensive documentation site, which is greatly appreciated. This package should be useful to researchers with a knowledge of Python, virtual environments, and machine learning, as it relies on scripts rather than a GUI interface and may facilitate the development of new machine learning algorithms for behavior classification.

      (2) The authors include modules for correctly creating train and test sets, and evaluation of classifier performance. This is extremely useful. Beyond evaluation, they have created a tool for manual review and correction of annotations. And they demonstrate the utility of this validation tool in the case of rare behaviors where correct classification is difficult, but the number of examples to review is reasonable.

      (3) The authors provide well-commented step-by-step instructions for the use of the package in the documentation.

      Weaknesses:

      (1) While the classification algorithm was not the subject of the paper, as the authors used off-the-shelf methods and were only able to reproduce the performance of the CALMS21 dyadic dataset, they did not improve upon previously published results. Furthermore, the results from the novel cichlid fish dataset, including a macro F1 score of 0.45, did not compellingly show that the workflow described in the paper produces useful behavioral classifications for groups of interacting animals performing rare social behaviors. I commend the authors for transparently reporting the results both with the macro F1 scores and the confusion matrices for the classifiers. The mutually exclusive, all-vs-all data annotation scheme of rare behaviors results in extremely unbalanced datasets such that categorical classification becomes a difficult problem. To try to address the performance limitation, the authors built a validation tool that allows the user to manually review the behavior predictions.

      (2) The pipeline makes a few strong assumptions that should be made more explicit in the paper.

      First, the behavioral classifiers are mutually exclusive and one-to-one. An individual animal can only be performing one behavior at any given time, and that behavior has only one recipient. These assumptions are implicit in how the package creates the data structure, and should be made clearer to the reader. Additionally, the authors emphasize that they have extended behavior classification to animal groups, but more accurately, they have extended behavioral classification to all possible pairs within a group.

      Second, the package expects comprehensive behavior labeling of the tracking data as input. Any frames not manually labeled are assumed to be the background category. Additionally, the package will interpolate through any missing segments of tracking data and assign the background behavioral category to those trajectory segments as well. The effects of these assumptions are not explored in the paper, which may limit the utility of this workflow for naturalistic environments.

      (3) Finally, the authors described the package as a tool for biologists and ethologists, but the level of Python and machine learning expertise required to use the package to develop a novel behavior classification workflow may be beyond the ability of many biologists. More accessible example notebooks would help address this problem.

    2. Reviewer #2 (Public review):

      Summary:

      The authors present a novel supervised behavioral analysis pipeline (vassi), which extends beyond previously available packages with its innate support of groups of any number of organisms. Importantly, this program also allows for iterative improvement upon models through revised behavioral annotation.

      Strengths:

      vassi's support of groups of any number of animals is a major advancement for those studying collective social behavior. Additionally, the built-in ability to choose different base models and iteratively train them is an important advancement beyond current pipelines. vassi is also producing behavioral classifiers with similar precision/recall metrics for dyadic behavior as currently published packages using similar algorithms.

      Weaknesses:

      vassi's performance on group behaviors is potentially too low to proceed with (F1 roughly 0.2 to 0.6). Different sources have slightly different definitions, but an F1 score of 0.7 or 0.8 is often considered good, while anything lower than 0.5 can typically be considered bad. There has been no published consensus within behavioral neuroscience (that I know of) on a minimum F1 score for use. Collective behavioral research is extremely challenging to perform due to hand annotation times, and there needs to be a discussion in the field as to the trade-off between throughput and accuracy before these scores can be either used or thrown out the door. It would also be useful to see the authors perform a few rounds of iterative corrections on these classifiers to see if performance is improved.

      While the interaction networks in Figure 2b-c look visually similar based on interaction pairs, the weights of the interactions appear to be quite different between hand and automated annotations. This could lead to incorrect social network metrics, which are increasingly popular in collective social behavior analysis. It would be very helpful to see calculated SNA metrics for hand versus machine scoring to see whether or not vassi is reliable for these datasets.

    3. Author response:

      We thank the reviewers and editors for their assessment and for identifying the main issues of our framework for automated classification of social interactions in animal groups. Based on the reviewers’ feedback, we would like to briefly summarize three areas in which we aim to improve both our manuscript and the software package.

      Firstly, we will revise our manuscript to better define the scope of our classification pipeline. As reviewer #1 correctly points out, our framework is built around the scoring and analysis of dyadic interactions within groups, rather than emergent group-level or collective behavior. This structure more faithfully reflects the way that researchers score social behaviors within groups, following focal individuals while logging all directed interactions of interest (e.g., grooming, aggression or courtship), and with whom these interactions are performed. Indeed, animal groups are often described as social networks of interconnected nodes (individuals), in which the connections between these nodes are derived from pairwise metrics, for example proximity or interaction frequency. For this reason, vassi does not aim to classify higher-level group behavior (i.e., the emergent, collective state of all group members) but rather the pair-wise interactions typically measured. Our classification pipeline replicates this structure, and therefore produces raw data that is familiar to researchers that study social animal groups with a focus on pairwise interactions. Since this may be seen as a limitation when studying group-level behavior (with more than two individuals involved, usually undirected), we will make this distinction between different forms of social interaction more clear in the introduction.

      Secondly, we acknowledge the low performance of our classification pipeline on the cichlid group dataset. We included analyses in the first version of our manuscript that, in our opinion, can justify the use of our pipeline in such cases (comparison to proximity networks), but we understand the reviewers' concerns. Based on their comments, we will perform additional analyses to further assess whether the use of vassi on this dataset results in valid behavioral metrics. This may, for example, include a comparison of per-individual SNA metrics between pipeline results and ground truth, or equivalent comparisons on the level of group structure (e.g., hierarchy derived from aggression counts). We thank reviewer #2 for these suggestions. As the reviewers further point out, there is no consensus yet on when the performance of behavioral classifiers is sufficient for reliable downstream analyses, and although this manuscript does not have the scope to discuss this for the field, it may help to substantiate discussion in future research.

      Finally, we appreciate the reviewers’ feedback on vassi as a methodological framework and will address the remaining software-related issues by improving the documentation and accessibility of our example scripts. This will reduce the technical hurdle to use vassi in further research. Additionally, we aim to incorporate a third dataset to demonstrate how our framework can be used for iterative training on a sparsely annotated dataset of groups, while broadening the taxonomic scope of our manuscript.

    1. Allen Guttmann explains how modern sports differ from the games found in ancient society. He ascribes seven distinct features (two values and five processes) to modern sports that distinguish it from ancient games: secularism, equality, specialization, rationalization, bureaucratization, quantification, and records.

      Modern sports are usually jobs. Such a career pipeline is inherently speculative, not because you shouldn't make a living with art or fitness, but because in doing so, you are the product, and people will rate you.

    1. The person (or people) whose content or actions are going viral, who might want attention, or get financial gain, or might be embarrassed or might get criticism or harassment, etc. Different people involved might have different interests. Some may not have awareness of it happening at all (like a video of an infant).

      I think going viral can honestly be a double-edged sword. There are many cases where people have gained huge financial success and reputation from going viral, whether intentionally or unintentionally. For example, many YouTubers and TikTokers have created huge brands for themselves simply from going viral on their respective app. But going viral can also bring harassment, and even ruin people's lives. For example, I remember watching an interview where a woman talked about how an ad she modeled for became a massive meme online, causing people to mock, harass, and even send death threats over her looks. This just goes to show that although something may sound good on the surface, there are many downsides that may also come with it.

    1. Hypertext

      the future of HyperText is HyperPlex

      Hypertext Mark Up Language

      HyperPlex Mark-in Notation HPMI

      reimagining HTML

      local-first person-first self-hosted autopoietic

      Open commons based

      Peer produced

      Extendable

      Integral Omnioptional omnipresent

      emergent Open Standards

      not just open source

      but Open Constructs

      embodied Open de facto co-evolutionary Standards

      living lively organically co-evolving

      field of emergent practices self-describing self-explicating

    1. 57 percent are female, reflecting males’ shorter life spans

      Yeah and we wonder why they have shorter lifespans. It’s almost comical that they haven’t seemed to figure it out for themselves, but I digress.

    1. Life expectancy has been increasing in the United States along with the rest of the world

      I wonder truly if the life expectancy will continue to increase as humanity ages or will it end up declining. Maybe with better medicines the population will continue to grow and prosper, but if not we could see a potential decline in society. There’s no way to tell until we get there and see what the future holds.

    1. Artificiality is used in the film to emphasize its visual design and how cinematic tools were used to create a world for the story. The film did not intend to be realistic to the time period or setting, but to create emotion and suspense in the storyline of Macbeth. Its high-contrast black-and-white shadows draw attention to the amount of detail and deliberate contrast that were implemented in the film.

    2. The heightened self-awareness is something I haven't seen in many films, but it adds to the suspenseful and cold/uncomfortable feeling of the story leading to some big event--which ended up being when he was killed.

    1. , practitioners and theorists began to see potential in them for small-scale but important educational reform

      Here, school language policies are framed as micro-level reforms capable of improving daily teaching and learning practices, showing that systemic change can start locally.

    2. School language policies are viewed by many in education as an integral and necessary part of the administration and the curriculum practice of modern schools.

      The opening statement establishes that language policy is not just a linguistic concern but a core component of school governance and curriculum development. It connects language directly to educational structure and success.

    1. I don’t think any of us disagree that it’s nice to have international students but then what are the implications for teaching and learning because this course in its current format does not work for that class. So does it mean that we change the learners or do we change the course? And then what does that mean for more local learners? Because we can’t do everything for everybody.

      This is basically saying that Kota’s experience shows how important instructors are in helping international students get fully involved and feel included in class. But it also points out that not all teachers or native-speaking students automatically have the skills or resources to help second-language learners and it’s not only international students who need to adjust to the academic community. Since classes are becoming more diverse, teaching should be flexible. Instructors might need to change the way they teach, adjust course content, or adapt requirements to fit both international and local students’ needs. Dr. Evans says that while it’s great to have international students, the way the course is set up right now doesn’t work well for a mixed group. She wonders whether the answer is to change the students or change the course and reminds us that we can’t meet everyone’s needs all the time.

    2. Because I want them to understand what foreign students like myself are going through... . I think that local students can show more understanding to foreign students if foreign students talk to them privately outside the classroom.

      Kota explains that language was his biggest challenge, so he tried various strategies: taking ESL/EAP courses, auditing classes, hiring private tutors, and speaking English at home. These helped somewhat, but they took time away from his main doctoral work, and ESL classes didn’t always match graduate-level needs. Private tutors were especially effective: they offered a safe space to practice speaking, refine research ideas, and learn about Canadian academic culture. Kota also built connections by talking one-on-one with classmates outside of class. This allowed him to clarify course material, get feedback, learn about their backgrounds, and better understand the local academic community. He even reached out to those least interested in him, believing private conversations could help local students empathize more with the challenges faced by international students like himself.

    3. There’s a gender issue. . . . It’s happened before and it’s with international students, male, and their respect for female instructors. ... The rules are different than they are in other cultures and it’s a problem for the student and for the instructor.... And it’s hard to confront. Not respect as... it’s not a different sort of respect but just general respect for the teaching and learning experience and sometimes that isn’t present.

      I think this means that male international students sometimes struggle with respecting female instructors in the same way that’s expected here. It’s not always intentional disrespect, but cultural differences in views on teaching and learning. This can cause problems and is hard to address.

    4. A third perspective, gender, provides another interesting way to look at Kota’s academic socialization. There were two contexts where gender issues surfaced. One was the doctoral seminar where Kota felt particularly powerless mainly because of his self-perceived limited language skills and his minority status as the only international student. As discussed earlier, another significant reason for his marginality seemed to be related to the gap between his research interest and the kinds of research approaches that had currency in the department. Interestingly, gender seemed to be also relevant to this gap in a subtle but potentially significant way. Whereas feminism, critical theories, and issues of minority education were popular in the department, Kota was interested in exploring university-industry collaboration from a perspective of economics, a viewpoint that he felt might be considered as ‘a male perspective’

      This part talks about gender as another way to understand Kota’s experience in grad school. Gender came up in two situations. First, in the doctoral seminar, Kota felt powerless because he thought his language skills weren’t good enough and he was the only international student. Second, his research focus was really different from what most people in the department were doing. While others focused on topics like feminism, critical theory, and minority education, Kota studied university-industry collaboration through an economics lens. He felt this might be seen as more of a “male” perspective, which made him stand out even more. So his sense of being on the outside wasn’t only about language or culture; the type of research he did, and how it might be gendered, also played a part.

    5. In the doctoral seminar I was the only one who was interested in looking at education from a perspective of economics.

      In the seminar, Kota focused on studying education using economics theories, but nobody else was interested in doing the same.

    6. They are important not only for them because they have a sense of participation and ownership, but it’s also very important for Canadian students.

      This matters for international students because it helps them feel involved and valued, and it is equally valuable for Canadian students, who can broaden their understanding.

    7. But there’s no such responsibility when we meet in a pub at night ....

      Outside of class, in informal settings like a pub, people aren't expected to stop and listen to him in the same way.

    8. But I’ve been trained to speak Japanese very precisely at a high academic level. ...

      In Japan, his academic background taught him to speak only after careful thought, making sure everything is well formed.

    9. He was able to participate more actively in his other graduate courses, but also faced different kinds of challenges in different courses,

      His engagement varied by course type, suggesting participation was strongly influenced by class structure, format, and expectations.

    1. Even when they are not, 'by abundant testimony of the medical fraternity continuance for a long time on her feet at work,' repeating this from day to day, tends to injurious effects upon the body

      The same is true for men, but it is likely not noticed or focused on by doctors because there is no focus on men having to carry children.


    1. generation is as deep and widespread as today’s uncertainty over what constitutes British values. But this is only one of the problems for English

      How this differs across generations.

    1. The room for cooking (the kitchen) used to be separated from the room where people socialized (the living room or great room), as it was assumed that one person (the wife) would cook in the kitchen while another person (the husband) relaxed alone or with company in the living room.

      These have been general societal norms, but we are evolving and things are changing over time.

    2. A “good” mother is a mother who puts her children at the center of her life at all times,

      This idea of a ‘good’ mom being someone who gives up everything for her kids is so common. And it’s kind of true in how moms are expected to act, but it really shouldn’t all fall on the mom. That’s a lot of pressure, and it feels unfair.

    1. In this way DJs might be understood as performing a literature review, paying tribute to the Marvin Gayes, the Sly and the Family Stones, and other artists who came before the newer sound. Such acts of mixing demonstrated a DJ’s bona fides.

      Again showing the compositional innovation of hip-hop, but also that DJs and writers aren't that different.

    2. Hip-hop’s willingness to sample and appropriate words, lyrics, and beats has allowed hip-hop artists to do much more with music than was previously done. Far from the way white artists would steal black artists’ songs in the early days of rock and roll, hip-hop artists often willingly acknowledge what they take as a demonstration of their musical knowledge. For example, DJs used to comb through record stores looking for the most obscure beat to sample

      I think this sentence really highlights the difference hip-hop made to the musical field. It shows the genre's diversity, but it also relates to the compositional innovation of hip-hop: DJs use samples and make a point of acknowledging the work they're sampling, just like writers try their best to find the best sources possible, make sure not to steal work, and give credit to the original whenever they reference someone else's work.

    3. “Real Gs move in silence like lasagna” is a prime example of how seriously hip-hop artists take language and wordplay. Upon hearing the line, it doesn’t make much sense to most people. What do lasagna and silence have in common? Yet, when one has the line printed, the message becomes clear. The silent “g” in “lasagna” makes the point about a “G” moving in silence, “G” being the slang for a gangsta

      I love this sentence. It shows the complexity of wordplay in lyrics and how meanings can be hidden anywhere in hip-hop yet largely go unrecognized, again highlighting why studying hip-hop to gain a different perspective on composition and English is important.


    1. In order to take part in trade and politics, the demand of skills in reading and writing arose

      This oversimplifies: literacy served as an asset not just in trade and politics but also in private affairs and culture.

    1. Next, focus in on several of these impossible wishes and use them as creative stimuli to generate ideas that are novel but more realistic.

      I don't necessarily disagree with this as a tactic for generating good ideas, but "saying impossible wishes" and "more realistic" seem contradictory. Dreaming allows you to come up with ways to approach a challenge, but it also means someone might struggle to come up with ideas if they already see certain wishes as impossible.