  1. Jan 2026
    1. Reviewer #1 (Public review):

      Summary:

      This study investigates whether prediction error extends beyond lower-order sensory-motor processes to include higher-order cognitive functions. Evidence is drawn from both task-based and resting-state fMRI, with addition of resting-state EEG-fMRI to examine power spectral correlates. The results partially support the existence of dissociable connectivity patterns: stronger ventral-dorsal connectivity is associated with high prediction error, while posterior-anterior connectivity is linked to low prediction error. Furthermore, spontaneous switching between these connectivity patterns was observed at rest and correlated with subtle intersubject behavioral variability.

      Strengths:

      Studying prediction error through the lens of network connectivity provides new insights into predictive coding frameworks. The combination of several independent datasets to tackle the question adds strength, including two well-powered fMRI task datasets, resting-state fMRI interpreted in relation to behavioral measures, and EEG-fMRI.

      Minor Weakness:

      The lack of spatial specificity of sensor-level EEG somewhat limits the inferences that can be obtained in terms of how the fMRI network processes and the EEG power fluctuations relate to each other.

      While the language no longer suggests a strong overlap of the source of the two signals, several scenarios remain open (e.g., the higher-order fMRI networks being the source of the EEG oscillations, or the networks controlling the EEG oscillations expressed in lower-order cortices, or a third process driving both the observations in fMRI networks and EEG oscillations...) and somewhat weaken interpretability of this section.

      Comments on revisions:

      My prior recommendations have been mostly addressed.

      Questions remaining about the NBS results:

      The authors write about the NBS cluster: "Visual examination of the cluster roughly points to the same four posterior-anterior and ventral-dorsal modules identified formally in main-text ". I think it might be good to add quantification, not just visual inspection. The size of the significant NBS cluster should be reported. What proportion of the edges that passed uncorrected threshold and entered NBS were part of the NBS cluster? Put simply, I don't think any edges beyond those passing NBS-based correction should be interpreted or used downstream in the manuscript.

      Also, NBS is not typically used by collapsing over effects in two effect directions, but the authors use NBS on the absolute value of Z. I understand the logic of the general manuscript focusing on strength rather than direction, but here I am wondering about the methodological validity. I believe that the editor who is an expert on the methodology may be able to comment on the validity of this approach (as opposed to running two separate NBS analyses for the two directions of effect).
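
      The quantification requested above could be sketched along the following lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the input matrix `z`, and the primary threshold are hypothetical, and only the connected-component step is shown (the permutation-based family-wise correction that NBS applies to component size is omitted). It reports how many edges passed the uncorrected threshold, how many of them fall in the largest component, and how that component splits by effect direction when thresholding on |Z|.

```python
import numpy as np

def suprathreshold_cluster_summary(z, threshold):
    """Summarise the largest connected component of suprathreshold edges.

    z         : symmetric (n_rois, n_rois) array of edge-wise Z statistics
                (hypothetical input, not the authors' data).
    threshold : primary (uncorrected) threshold applied to |Z|.
    """
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    rows, cols = np.triu_indices(n, k=1)           # each edge counted once
    keep = np.abs(z[rows, cols]) > threshold       # collapses over direction
    rows, cols = rows[keep], cols[keep]
    n_entered = len(rows)
    if n_entered == 0:
        return {"n_entered": 0, "n_cluster": 0, "proportion": 0.0,
                "n_positive": 0, "n_negative": 0}

    # Union-find over ROIs to label connected components.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]          # path compression
            i = parent[i]
        return i

    for a, b in zip(rows, cols):
        parent[find(int(a))] = find(int(b))

    # Component size is measured in edges (cluster "extent").
    roots = np.array([find(int(a)) for a in rows], dtype=int)
    counts = np.bincount(roots, minlength=n)
    in_cluster = roots == counts.argmax()
    signs = np.sign(z[rows[in_cluster], cols[in_cluster]])
    n_cluster = int(in_cluster.sum())
    return {"n_entered": n_entered,
            "n_cluster": n_cluster,
            "proportion": n_cluster / n_entered,
            "n_positive": int((signs > 0).sum()),
            "n_negative": int((signs < 0).sum())}
```

      Reporting `n_positive` and `n_negative` separately would also show whether the |Z| cluster is held together by edges of both directions, which is the situation in which the result would differ from two directional NBS runs.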

    2. Reviewer #2 (Public review):

      Summary:

      This paper investigates putative networks associated with prediction errors in task-based and resting-state fMRI. It attempts to test the idea that prediction error minimisation includes abstract cognitive functions, referred to as the global prediction error hypothesis, by establishing a parallel between networks found in task-based fMRI, where prediction errors are elicited in a controlled manner, and those networks that emerge during "resting state".

      Strengths:

      Clearly a lot of work and data went into this paper, including 2 task-based fMRI experiments and the resting-state data for the same participants, as well as a third EEG-fMRI dataset. The paper is well written overall, with a couple of exceptions on clarity noted below, and the methodology appears sound overall, again with a couple of exceptions listed below that require further justification. It does a good job of acknowledging its own weakness.

      Weaknesses:

      The paper does a good job of acknowledging its greatest weakness, the fact that it relies heavily on reverse inference, but cannot quite resolve it. As the authors put it, "finding the same networks during a prediction error task and during rest does not mean that the networks engagement during rest reflect prediction error processing". Again, the authors acknowledge the speculative nature of their claims in the discussion, but given that this is the key claim and essence of the paper, it is hard to see how the evidence is compelling enough to support that claim.

      Given how uncontrolled cognition is during "resting-state" experiments, the parallel made with prediction errors elicited during a task designed to that effect is a little difficult to make. How often are people really surprised when their brains are "at rest", likely replaying a previously experienced event or planning future actions under their control? It seems to be more likely a very low prediction error scenario, if at all surprising.

      The quantitative comparison between networks under task and rest was done on a small subset of the ROIs rather than on the full network - why? Noting how small the correlation between task and rest is (r=0.021) and that's only for part of the networks, the evidence is a little tenuous. Running the analysis for the full networks could strengthen the argument.

      Looking at the results in Figure 2C, the four-quadrant description of the networks labelled for low and high PE appears a little simplistic. The authors state that this four-quadrant description omits some ROIs as motivated by prior knowledge. This would benefit from a more comprehensive justification. Which ROIs are excluded and what is the evidence for exclusion?

      The EEG-fMRI analysis claiming 3-6Hz fluctuations for PE is hard to reconcile with the fact that fMRI captures activity that is a lot slower while some PEs are as fast as 150 ms. The discussion acknowledges this but doesn't seem to resolve it - would benefit from a more comprehensive argument.

      Comments on revisions:

      The authors have done a good job of addressing the issues raised during the review process. One remaining issue still requires attention. In R2.4, when referring to "existing knowledge of prominent structural pathways among these quadrants", please cite the relevant literature.

    3. Reviewer #3 (Public review):

      Summary:

      Bogdan et al. present an intriguing investigation into the spontaneous dynamics of prediction error (PE)-related brain states. Using two independent fMRI tasks designed to elicit prediction and prediction error in separate participant samples, alongside both fMRI and EEG data, the authors identify convergent brain network patterns associated with high versus low PE. Notably, they further show that similar patterns can be detected during resting-state fMRI, suggesting that PE-related neural states may recur outside of explicit task demands.

      Strengths:

      The authors use a well-integrated analytic framework that combines multiple prediction tasks and brain imaging modalities. The inclusion of several datasets probing PE under different contexts strengthens the claim of generalizability across tasks and samples. The open sharing of code and data is commendable and will be valuable for future work seeking to build on this framework.

      Weaknesses:

      A central challenge of the manuscript lies in interpreting the functional significance of PE-related brain network states during rest. Demonstrating that a task-defined cognitive state recurs spontaneously is intriguing, but without clear links to behavior, individual traits, or experiential content during rest, it remains difficult to interpret what such spontaneous brain states tell us about the mind and brain. For example, it is unclear whether these states support future inference or learning, reflect offline predictive processing, or instead suggest state reinstatement due to a more general form of neural plasticity and circuit dynamics in the brain. Demonstrating any one of these downstream relationships would be valuable since it has the potential to inform our understanding of cognitive function or more general principles of neural organization.

      I appreciate the authors' position that establishing the existence of such states is a necessary first step, and that future work may clarify their behavioral relevance. However, the current form makes it challenging to assess the conceptual advance of the present work in isolation.

      Relatedly, in my previous review I raised questions about both across- and within-individual variability: for example, whether individuals who exhibit stronger or more distinct PE-related fluctuations at rest also show superior performance on prediction-related tasks (across-individual), or whether momentary increases in PE-network expression during tasks relate to faster or more accurate prediction (within-individual). The authors thoughtfully addressed this suggestion by conducting an individual-differences analysis correlating each participant's fluctuation amplitude with approximately 200 behavioral and trait measures from the HCP dataset.

      The reported findings (a negative association with age and card-sorting performance, alongside a positive association with age-adjusted picture sequence memory) are interesting but difficult to interpret within a coherent functional framework. As presented, these results do not clearly support the idea that spontaneous PE-state fluctuations are related to enhancement in prediction, inference, or broader cognitive function. Instead, they raise the possibility that fluctuation amplitude may reflect more general factors (e.g., age) rather than a functionally meaningful PE-related process.

      Overall, while the methodological contribution is strong, the manuscript would benefit from a clearer articulation of what functional conclusions can or cannot be drawn from the presence of spontaneous PE-related states, as well as a more cautious framing of their potential cognitive significance.

      Further comments:

      I appreciate that the authors took my earlier suggestions seriously and incorporated additional analyses examining behavioral relevance and permutation tests in the revision.

    1. Reviewer #2 (Public review):

      This study identifies Visham, an asymmetric structure in developing mouse cysts resembling the Drosophila fusome, an organelle crucial for oocyte determination. Using immunofluorescence, electron microscopy, 3D reconstruction, and lineage labeling, the authors show that primordial germ cells (PGCs) and cysts, but not somatic cells, contain an EMA-rich, branching structure that they named Visham, which remains unbranched in male cysts. Visham accumulates in regions enriched in intercellular bridges, forming clusters reminiscent of fusome "rosettes." It is enriched in Golgi and endosomal vesicles and partially overlaps with the ER. During cell division, Visham localizes near centrosomes in interphase and early metaphase, disperses during metaphase, and reassembles at spindle poles during telophase before becoming asymmetric. Microtubule depolymerization disrupts its formation.

      Cyst fragmentation is shown to be non-random, correlating with microtubule gaps. The authors propose that 8-cell (or larger) cysts fragment into 6-cell and 2-cell cysts. Analysis of Pard3 (the mouse ortholog of Par3/Baz) reveals its colocalization with Visham during cyst asymmetry, suggesting that mammalian oocyte polarization depends on a conserved system involving Par genes, cyst formation, and a fusome-like structure.

      Transcriptomic profiling identifies genes linked to pluripotency and the unfolded protein response (UPR) during cyst formation and meiosis, supported by protein-level reporters monitoring Xbp1 splicing and 20S proteasome activity. Visham persists in meiotic germ cells at stage E17.5 and is later transferred to the oocyte at E18.5 along with mitochondria and Golgi vesicles, implicating it in organelle rejuvenation. In Dazl mutants, cysts form, but Visham dynamics, polarity, rejuvenation, and oocyte production are disrupted, highlighting its potential role in germ cell development.

      Overall, this is an interesting and comprehensive study of a conserved structure in the germline cells of both invertebrate and vertebrate species. Investigating these early stages of germ cell development in mice is particularly challenging. Although primarily descriptive, the study represents a remarkable technical achievement. The images are generally convincing, with only a few exceptions.

      Major comments:

      (1) Some titles contain strong terms that do not fully match the conclusions of the corresponding sections.

      (1a) Article title "Mouse germline cysts contain a fusome-like structure that mediates oocyte development":

      The term "mediates" could be misleading, as the functional data on Visham (based on comparing its absence to wild-type) actually reflects either a microtubule defect or a Dazl mutant context. There is no specific loss-of-function of visham only.

      (1b) Result title, "Visham overlaps centrosomes and moves on microtubules":

      The term "moves" implies dynamic behavior, which would require live imaging data that are not described in the article.

      (1c) Result title, "Visham associates with Golgi genes involved in UPR beginning at the onset of cyst formation":

      The presented data show that the presence of Visham in the cyst coincides temporally with the expression and activity of the UPR response; the term "associates" is unclear in this context.

      (1d) Result title, "Visham participates in organelle rejuvenation during meiosis":

      The term "participates" suggests that Visham is required for this process, whereas the conclusion is actually drawn from the Dazl mutant context, not a specific loss-of-function of visham only.

      (2) The authors aim to demonstrate that Visham is a fusome-like structure. I would suggest simply referring to it as a "fusome-like structure" rather than introducing a new term, which may confuse readers and does not necessarily help the authors' goal of showing the conservation of this structure in Drosophila and Xenopus germ cells. Interestingly, in a preprint from the same laboratory describing a similar structure in Xenopus germ cells, the authors refer to it as a "fusome-like structure (FLS)" (Davidian and Spradling, BioRxiv, 2025).

      Comments on revisions:

      The revised manuscript has been clearly improved, and the authors have addressed all of our comments. I would like to point out two minor issues:

      (1) As suggested by the reviewers, the authors now use the term fusome instead of visham. However, they also acknowledge that this structure lacks many components of the Drosophila fusome. It may therefore be more appropriate to refer to it as a "mouse fusome" or as a "fusome-like structure (FLS)," as used in Xenopus.

      (2) I agree with Reviewer 3 that co-localization between EMA and acTubulin on still images does not convincingly demonstrate that fusome vesicles move along microtubules (Figure S2E).

    2. Reviewer #3 (Public review):

      The manuscript provides evidence that mice have a fusome, a conserved structure most well studied in Drosophila that is important for oocyte specification. Overall, a myriad of evidence is presented demonstrating the existence of a mouse fusome. This work is important as it addresses a long-standing question in the field of whether mice have fusomes and sheds light on how oocytes are specified in mammals.

      Comments on revisions:

      Overall, the authors did a good job of responding to reviewer comments, improving the manuscript by including higher-quality microscope images, revising text for clarity, and using the term mouse fusome instead of a new term. However, two of the headings in the results section that did not correspond to the data presented in that section still have not been revised, even though the authors stated in their response to reviewer comments that they had been. The heading of the first section of the results is "PGCs contain a Golgi-rich structure known as the EMA granule", even though no evidence in that section shows it is Golgi rich. The heading of the fifth section of the results is "The mouse fusome associates with polarity and microtubule genes including pard3"; however, only evidence for pard3 is presented.

    1. Reviewer #1 (Public review):

      Summary:

      The issue of how the brain can maintain serial order of presented items in working memory is a major unsolved question in cognitive neuroscience. It has been proposed that this serial order maintenance could be achieved thanks to periodic reactivations of different presented items at different phases of an oscillation, but the mechanisms by which this could be achieved by brain networks, as well as the mechanisms of read-out, are still unclear. In an influential 2008 paper, the authors have proposed a mechanism by which a recurrent network of neurons could maintain multiple items in working memory, thanks to 'population spikes' of populations of neurons encoding for the different items, occurring at alternating times. These population spikes occur in a specific regime of the network and are a result of synaptic facilitation, an experimentally observed type of synaptic short-term dynamics with time scales of order hundreds of ms.

      In the present manuscript, the authors extend their model to include another type of experimentally observed short-term synaptic plasticity termed synaptic augmentation, that operates on longer time scales on the order of 10s. They show that while a network without augmentation loses information about serial order, augmentation provides a mechanism by which this order can be maintained in memory thanks to a temporal gradient of synaptic efficacies. The order can then be read out using a read-out network whose synapses are also endowed with synaptic augmentation. Interestingly, the read-out speed can be regulated using background inputs.
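
      To make the idea of a temporal gradient concrete, here is a deliberately simplified toy, not the authors' network model. Everything in it is an assumption made for illustration: items are loaded one after another, each loaded item is reactivated at a fixed period, every reactivation adds a small saturating increment to a per-item augmentation variable, and that variable decays with a time constant of 10 s. Under these assumptions, items loaded earlier have accumulated more augmentation by the end of loading, so serial position can be read off the ordering of the values.

```python
import numpy as np

def augmentation_gradient(n_items=4, load_interval=1.0, reactivation_period=0.2,
                          tau_a=10.0, increment=0.05, dt=0.01, t_end=6.0):
    """Toy per-item augmentation levels after sequential loading.

    All parameter values are illustrative assumptions, not fitted or
    taken from the manuscript under review.
    """
    n_steps = int(round(t_end / dt))
    period_steps = int(round(reactivation_period / dt))
    onset_steps = np.array([int(round(i * load_interval / dt))
                            for i in range(n_items)])
    a = np.zeros(n_items)
    for step in range(n_steps):
        a -= dt * a / tau_a                        # slow decay (order 10 s)
        loaded = step >= onset_steps               # item already presented
        firing = loaded & ((step - onset_steps) % period_steps == 0)
        a[firing] += increment * (1.0 - a[firing])  # saturating increment
    return a

levels = augmentation_gradient()
# Position in the presentation order recovered from the gradient alone.
recovered_order = np.argsort(-levels)
```

      In this toy the direction of the gradient (earlier items stronger) follows from the assumed equal reactivation rates; a different reactivation scheme could reverse it.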

      Strengths:

      This is an elegant solution to the problem of serial order maintenance, that only relies on experimentally observed features of synapses. The model is consistent with a number of experimental observations in humans and monkeys. The paper will be of interest to the broad readership of eLife and I believe it will have a strong impact on the field.

      Comments on revisions:

      I am happy with how the authors have addressed my comments, and believe the paper can be published in its present form.

    2. Reviewer #2 (Public review):

      In this manuscript, the authors present a model to explain how working memory (WM) encodes both existence and timing simultaneously using transient synaptic augmentation. A simple yet intriguing idea.

      The model presented here has the potential to explain what previous theories like 'active maintenance via attractors' and 'liquid state machine' do not, and describe how novel sequences are immediately stored in WM. Altogether, the topic is of great interest to those studying higher cognitive processes, and the conclusions the authors draw are certainly thought-provoking from an experimental perspective.

      Comments on revisions:

      The authors have done an excellent job of addressing the questions that I raised, and the manuscript is greatly improved - both in content and clarity. It is an insightful advance and I recommend publication.

    1. Reviewer #1 (Public review):

      The authors relate a language model developed to predict whether a given sentence correctly followed another given sentence to EEG recordings in a novel way, showing receptive fields related to widely used TRFs. In these responses (or "regression results"), differences between representational levels are found, as well as differences between attended and unattended speech stimuli, and whether there is hearing loss. These differences are found per EEG channel.

      In addition to these novel regression results, which are apparently captured from the EEG specifically around the sentence stimulus offsets, the authors also perform a more standard mTRF analysis using a software package (Eelbrain) and TRF regressors that will be more familiar to researchers adjacent to these topics, which was highly appreciated for its comparative value. Comparing these TRFs with the authors' original regression results, several similarities can be seen. Specifically, response contrasts for attended versus unattended speaker during mixed speech, for the phoneme, syllable, and sentence regressors, are greater for normal-hearing participants than hearing-impaired participants for both analyses, and the temporal and spatial extents of the significant differences are roughly comparable (left-front and 0 - 200 ms for phoneme and syllable, and left and 200 - 300 ms for sentence).

      The inclusion of the mTRF analysis is also helpful because some aspects of the authors' original regression results, between the EEG data and the HM-LSTM linguistic model, are less than clear. The authors state specifically that their regression analysis is only calculated in the -100 to 300 ms window around stimulus/sentence offsets. They clarify that this means that most of the EEG data acquired while the participants are listening to the sentences is not analyzed, because their HM-LSTM model implementation represents all acoustic and linguistic features in a condensed way, around the end of the sentence. Thus the regression between data and model only occurs where the model predictions exist, which is the end of the sentences. This is in contrast to the mTRF analysis, which seems to have been done in a typical way, regressing over the entire stimulus time, because those regressors (phoneme onset, word onset, etc.) exist over the entire sentence time. If my reading of their description of the HM-LSTM regression is correct, it is surprising that the regression weights are similar between the HM-LSTM model and the mTRF model.

      However, the code that the authors uploaded to OSF seems to clarify this issue. In the file ridge_lstm.py, the authors construct the main regressor matrices called X1 and X2 which are passed to sklearn to do the ridge regression. This ridge regression step is calculated on the continuous 10-minute bouts of EEG and stimuli, and it is calculated in a loop over lag times, from -100 ms to 300 ms lag. These regressor matrices are initialized as zeros, and are then filled in two steps: the HM_LSTM model unit weights are read from numpy files and written to the matrices at one timepoint per sentence (as the authors describe in the text), and the traditional phoneme, syllable, etc. annotations are ALSO read in (from csv files) and written to the matrices, putting 1s at every timepoint of those corresponding onsets/offsets. Thus the actual model regressor matrix for the authors' main EEG results includes BOTH the HM_LSTM model weights for each sentence AND the feature/annotation times, for whichever of the 5 features is being analyzed (phonemes, syllables, words, phrases, or sentences).

      So for instance, for the syllable HM_LSTM regression results, the regressor matrix contains: 1) the HM_LSTM model weights corresponding to syllables (a static representation, placed once per sentence offset time), AND 2) the syllable onsets themselves, placed as a row of 1s at every syllable onset time. And as another example, for the word HM_LSTM regression results, the regressor matrix contains: 1) the HM_LSTM model weights corresponding to words (a static representation, placed once per sentence offset time), AND 2) the word onsets themselves, placed as a row of 1s at every word onset time.
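
      The structure described above can be sketched as follows. This is a reconstruction from the description in this review, not the contents of ridge_lstm.py: the function names, array shapes, and the closed-form ridge solver (used here in place of the sklearn call, and without an intercept) are all assumptions made for illustration.

```python
import numpy as np

def build_regressors(n_samples, sentence_offsets, unit_weights, feature_onsets):
    """Zero matrix with model weights at sentence offsets plus an onset column.

    unit_weights   : (n_sentences, n_units) static representation per sentence.
    feature_onsets : sample indices of the annotated feature (e.g. syllables).
    """
    n_units = unit_weights.shape[1]
    X = np.zeros((n_samples, n_units + 1))
    X[sentence_offsets, :n_units] = unit_weights   # one timepoint per sentence
    X[feature_onsets, n_units] = 1.0               # 1 at every feature onset
    return X

def shift(X, lag):
    """Delay X by `lag` samples (negative lag advances it), zero-padded."""
    out = np.zeros_like(X)
    if lag > 0:
        out[lag:] = X[:-lag]
    elif lag < 0:
        out[:lag] = X[-lag:]
    else:
        out[:] = X
    return out

def ridge_per_lag(X, eeg, lags, alpha=1.0):
    """Fit one ridge model per lag over the whole continuous recording."""
    eye = np.eye(X.shape[1])
    weights = {}
    for lag in lags:
        Xl = shift(X, lag)
        # Closed-form ridge solution: (X'X + alpha*I)^-1 X'y
        weights[lag] = np.linalg.solve(Xl.T @ Xl + alpha * eye, Xl.T @ eeg)
    return weights  # lag -> (n_features, n_channels)
```

      Written this way, it is apparent that the time axis of the result is a set of lags applied to the full recording, and that each fit mixes the model-weight columns with the onset column.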

      If my reading of the code is correct, there are two main points of clarification for interpreting these methods:

      First, the authors' window of analysis of the EEG is not "limited" to 400 ms as they say; rather the time dimension of both their ridge regression results and their traditional mTRF analysis is simply lags (400 ms-worth), and the responses/receptive fields are calculated over the entire 10-minute trials. This is the normal way of calculating receptive fields in a continuous paradigm. The authors seem to be focusing on the peri-sentence offset time points because that is where the HM_LSTM model weights are placed in the regressor matrix. Also because of this issue, it is not really correct when the authors say that some significant effect occurred at some latency "after sentence offset". The lag times of the regression results should have the traditional interpretation of lag/latency in receptive field analyses.

      Second, as both the traditional linguistic feature annotations and the HM_LSTM model weights are part of the regression for the main ridge regression results here, it is not known what the contribution specifically of the HM_LSTM portion of the regression was. Because the more traditional mTRF analysis showed many similar results to the main ridge regression results here, it seems probable that the simple feature annotations themselves, rather than the HM_LSTM model weights, are responsible for the main EEG results. A further analysis separating these two sets of regressors would shed light on this question.
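
      One way to run the separating analysis suggested above is to compare held-out prediction accuracy for the full regressor set against each subset. The sketch below uses synthetic data and hypothetical names throughout; in the simulation the signal is driven by the onset column only (an assumption chosen to show what the comparison would look like in that case), so it says nothing about the real data.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge weights, no intercept."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def heldout_correlation(X, y, alpha=1.0):
    """Train on the first half, return prediction-signal correlation on the second."""
    half = X.shape[0] // 2
    w = ridge_fit(X[:half], y[:half], alpha)
    return float(np.corrcoef(X[half:] @ w, y[half:])[0, 1])

def simulate(n_samples=4000, n_units=3, n_sentences=40, seed=0):
    """Synthetic single-channel signal that depends on feature onsets only."""
    rng = np.random.default_rng(seed)
    onsets = (rng.random(n_samples) < 0.05).astype(float)
    lstm = np.zeros((n_samples, n_units))
    offsets = rng.choice(n_samples, size=n_sentences, replace=False)
    lstm[offsets] = rng.normal(size=(n_sentences, n_units))
    eeg = 2.0 * onsets + rng.normal(scale=0.5, size=n_samples)
    return lstm, onsets[:, None], eeg

lstm, onsets, eeg = simulate()
scores = {
    "full": heldout_correlation(np.hstack([lstm, onsets]), eeg),
    "onsets_only": heldout_correlation(onsets, eeg),
    "lstm_only": heldout_correlation(lstm, eeg),
}
```

      If, on the real data, the onsets-only model performed about as well as the full model, that would support the concern raised above; a clear gain from adding the model-weight columns would argue against it.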

    2. Reviewer #3 (Public review):

      Summary:

      The authors aimed to investigate how the brain processes different linguistic units (from phonemes to sentences) in challenging listening conditions, such as multi-talker environments, and how this processing differs between individuals with normal hearing and those with hearing impairments. Using a hierarchical language model and EEG data, they sought to understand the neural underpinnings of speech comprehension at various temporal scales and identify specific challenges that hearing-impaired listeners face in noisy settings.

      Strengths:

      Overall, the combination of computational modeling, detailed EEG analysis, and comprehensive experimental design thoroughly investigates the neural mechanisms underlying speech comprehension in complex auditory environments.

      The use of a hierarchical language model (HM-LSTM) offers a data-driven approach to dissect and analyze linguistic information at multiple temporal scales (phoneme, syllable, word, phrase, and sentence). This model allows for a comprehensive neural encoding examination of how different levels of linguistic processing are represented in the brain.

      The study includes both single-talker and multi-talker conditions, as well as participants with normal hearing and those with hearing impairments. This design provides a robust framework for comparing neural processing across different listening scenarios and groups.

      Weaknesses:

      The study tests only a single deep neural network model for extracting linguistic features, which limits the robustness of the conclusions. A lower model fit does not necessarily indicate that a given type of information is absent from the neural signal; it may simply reflect that the model's representation was not optimal for capturing it. That said, this limitation is a common concern for data-driven, correlation-based approaches, and should be viewed as an inherent caveat rather than a critical flaw of the present work.

    1. Reviewer #1 (Public review):

      The authors attempted to compare the calcium-binding properties of wildtype calreticulin with those of the calreticulin deletion mutant (CRTDel52) associated with myeloproliferative neoplasms.

      The researchers conducted their study using advanced techniques. They found almost no difference in calcium binding between the two proteins and observed no impact on calcium signaling, specifically store-operated calcium entry (SOCE). The study also noted an increase in ER luminal calcium-binding chaperone proteins. Surprisingly, the authors selected flow cytometry as a technique for measurements of ER luminal calcium. Considering the limitations of this approach, it would be better to use alternative approaches. This is particularly important as previous reports, using cells from MPN patients, indicate reduced ER luminal calcium and effects on SOCE (Blood, 2020). How do the authors explain the difference between their results and previous findings about lower ER luminal calcium and changed SOCE in MPN patient cells expressing CRTDel52? Other studies have found that unfolded protein responses are activated in MPN cells with CRTDel52 calreticulin (see Blood, 2021), and increased UPR could account for higher levels of some ER-resident calcium-binding proteins observed here. Overall, it remains unclear how this work improves our understanding of MPN or clarifies calreticulin's role in MPN pathophysiology.

    2. Reviewer #2 (Public review):

      Summary:

      Tagoe and colleagues present a thorough analysis of the calcium (Ca2+) binding capacity of calreticulin (CRT), an endoplasmic reticulum (ER) Ca2+-buffer protein, using a mutant version (CRT del52) found in myeloproliferative neoplasms (MPNs). The authors use purified human CRT protein variants, CRT-KO cell lines, and an MPN cell line to elucidate the differing Ca2+ dynamics, both on the level of the protein and on cell-wide Ca2+-governed processes. In sum, the authors provide new insights into CRT that can be applied to both normal and malignant cell biology.

      First, the authors purify CRT protein and perform isothermal titration calorimetry (ITC) to quantify the Ca2+ binding capacity of CRT. They use full-length human CRT, CRT del52, and two truncations of CRT (1-339 and 1-351, the former of which should lead to the entire loss of low-affinity Ca2+ binding). While CRT del52 has previously been shown to lead to a decrease in Ca2+ binding affinity in other models, the ITC data show that Ca2+ binding affinity is retained in CRT del52.

      Next, the authors utilize a CRT-KO cell line with subsequent addition of CRT protein variants to validate these findings with flow cytometric analysis. Cells were transfected with a ratiometric ER Ca2+ probe, and fluorescence indicates that CRT del52 is unable to restore basal ER Ca2+ levels to the same extent as CRT wild-type. To translate these findings to MPNs, the authors perform CRT-KO in a megakaryocytic cell line, where reconstitution with either CRT variant did not cause a difference in cytosolic calcium levels. The authors further test store-operated calcium entry (SOCE), an important process for maintaining ER Ca2+ levels, in these cells, and find that CRT-KO cells have lower SOCE activity, and that this can be slightly recovered with CRT addition.

      Finally, the authors ask whether other effects of CRT-KO/reconstitution can affect the cellular Ca2+ signaling pathway and levels. RNASeq analysis revealed that CRT-KO leads to an increase in various chaperone protein expressions, and that reconstitution with CRT del52 is unable to reduce expression to the same extent as reconstitution with CRT wildtype.

      Strengths:

      The authors provide new insights into CRT that can be applied to both normal and malignant cell biology.

      Weaknesses:

      (1) The authors should consider discussing the high-affinity Ca2+ binding site more in the introduction. Can they show a proof-of-concept experiment that validates that incubation of recombinant CRT reduces the function of that high-affinity Ca2+ binding site?

      (2) For Figure 2B, do you have an explanation for why the purified proteins run higher than predicted (48-52kDa) - are these proteins still tagged with pGB1?

      (3) The MEG-01 cell line has the BCR::ABL1 translocation, while CRT mutations are strictly found in BCR::ABL1 negative MPNs. Could these experiments be repeated in these cells treated with imatinib to decrease these effects, or see if basal MEG-01 Ca2+ levels/activity are changed with or without imatinib?

    1. Reviewer #1 (Public review):

      Summary:

      Sullivan and colleagues examined the modulation of reflexive visuomotor responses during collaboration between pairs of participants performing a joint reaching movement to a target. In their experiments, the players jointly controlled a cursor that they had to move towards narrow or wide targets. In each experimental block, each participant had a different type of target they had to move the joint cursor to. During the experiment, the authors used lateral perturbation of the cursor to test participants' fast feedback responses to the different target types. The authors suggest participants integrate the target type and related cost of their partner into their own movements, which suggests that visuomotor gains are affected by the partner's task.

      Strengths:

      The topic of the manuscript is very interesting, and the authors are using well-established methodology to test their hypothesis. They combine experimental studies with optimal control models to further support their work. Overall, the manuscript is very timely and shows important findings - that the feedback responses reflect both our and our partner's tasks.

      Weaknesses:

      However, in the current version of the manuscript, I believe the results could also be interpreted differently, which suggests that the authors should provide further support for their hypothesis and conclusions.

      Major Comments:

      (1) Results of the relevant conditions:

      In addition to the authors' explanation regarding the results, it is also possible that the results represent a simple modulation of the reflexive response to a scaled version of cursor movement. That is, when the cursor is partially controlled by a partner, which also contributes to reducing movement error, it can also be interpreted by the sensorimotor system as a scaling of hand-to-cursor movement. In this case, the reflexes are modulated according to a scaling factor (how much do I need to move to bring the cursor to the target). I believe that a single-agent simulation of an OFC model with a scaling factor in the lateral direction can generate the same predictions as those presented by the authors in this study. In other words, maybe the controller has learned about the nature of the perturbation in each specific context, that in some conditions I need to control strongly, whereas in others I do not (without having any model of the partner). I suggest that the authors demonstrate how they can distinguish their interpretation of the results from other explanations.

      (2) The effect of the partner target:

      The authors presented both self and partner targets together. While the effect of each target type, presented separately, is known, it is unclear how presenting both simultaneously affects individual response. That is, does a small target with a background of the wide target affect the reflexive response in the case of a single participant moving? The results of Experiment 2, comparing the case of partner- and self-relevant targets versus partner-irrelevant and self-relevant targets, may suggest that the system acted based on the relevant target, regardless of the presence and instructions regarding the self-target.

      (3) Experiment instructions:

      It is unclear what the general instructions were for the participants and whether the instructions provided set the proposed weighted cost, which could be altered with different instructions.

      (4) Some work has shown that the gain of visuomotor feedback responses reflects the time to target and that this is updated online after a perturbation (Cesonis & Franklin, 2020, eNeuro; Cesonis and Franklin, 2021, NBDT; also related to Crevecoeur et al., 2013, J Neurophysiol). These models would predict different feedback gains depending on the distance remaining to the target for the participant and the time to correct for the jump, which is directly affected by the small or large targets. Could this time to target explain the results instead? I don't believe that this is the case, but the authors should try to rule out other interpretations. This is maybe a minor point, but perhaps more important is the location (& time remaining) for each participant at the time of the jump. It appears from the figures that this might be affected by the condition (given the change in movement lengths - see Figure 3 B & C). If this is the case, then could some of the feedback gain be related to these parameters and not the model of the partner, as suggested? Some evidence to rule this out would be a good addition to the paper - perhaps the distance of each partner at the time of the perturbation, for example. In addition, please analyze the synchrony of the two partners' movements.

    2. Reviewer #2 (Public review):

      Summary:

      Sullivan and colleagues studied the fast, involuntary, sensorimotor feedback control in interpersonal coordination. Using a cleverly designed joint-reaching experiment that separately manipulated the accuracy demands for a pair of participants, they demonstrated that the rapid visuomotor feedback response of a human participant to a sudden visual perturbation is modulated by his/her partner's control policy and cost. The behavioral results are well-matched with the predictions of the optimal feedback control framework implemented with the dynamic game theory model. Overall, the study provides an important and novel set of results on the fast, involuntary feedback response in human motor control, in the context of interpersonal coordination.

      Review:

      Sullivan and colleagues investigated whether fast, involuntary sensorimotor feedback control is modulated by the partner's state (e.g., cost and control policy) during interpersonal coordination. They asked a pair of participants to make a reaching movement to control a cursor and hit a target, where the cursor's position was a combination of each participant's hand position. To examine fast visuomotor feedback response, the authors applied a sudden shift in either the cursor (experiment 1) or the target (experiment 2) position in the middle of movement. To test the involvement of partner's information in the feedback response, they independently manipulated the accuracy demand for each participant by varying the lateral length of the target (i.e., a wider/narrower target has a lower/higher demand for correction when movement is perturbed). Because participants could also see their partner's target, they could theoretically take this information (e.g., whether their partner would correct, whether their correction would help their partner, etc.) into account when responding to the sudden visual shift. Computationally, the task structure can be handled using dynamic game theory, and the partner's feedback control policy and cost function are integrated into the optimal feedback control framework. As predicted by the model, the authors demonstrated that the rapid visuomotor feedback response to a sudden visual perturbation is modulated by the partner's control policy and cost. When their partner's target was narrow, they made rapid feedback corrections even when their own target was wide (no need for correction), suggesting integration of their partner's cost function. Similarly, they made corrections to a lesser degree when both targets were narrower than when the partner's target was wider, suggesting that the feedback correction takes the partner's correction (i.e., feedback control policy) into account.

      The strength of the current paper lies in the combination of clever behavioral experiments that independently manipulate each participant's accuracy demand and a sophisticated computational approach that integrates optimal feedback control and dynamic game theory. Both the experimental design and the data analysis are sound. While the main claim is well-supported by the results, the only current weakness is the lack of discussion of limitations and of an alternative explanation. Adding these points will further strengthen the paper.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript aims to test the idea that visual recognition (of faces) is hierarchically organized in the human ventral occipital-temporal cortex (VOTC). The paper proposes that if VOTC has a hierarchical organization, this should be seen in two independent features of the VOTC signal. First, hierarchy assumes that signals along the hierarchy increase in representational complexity. Second, hierarchy assumes a progressive increase in the onset time of the earliest neural response at each level of the hierarchy. To test these predictions, the authors extract high-frequency broadband signals from iEEG electrodes in a very large sample of patients (N=140). They find that face selectivity in these signals is distributed across the VOTC with increasing posterior-anterior face selectivity, hence providing evidence for the first prediction. However, they also find broadband activity to occur concurrently, therefore challenging the view of a serial hierarchy.

      Strengths:

      (1) The hypothesis (that VOTC is hierarchically organized) and predictions (that hierarchy predicts increases in representational complexity and increases in onset time) were clearly described.

      (2) The number of subjects sampled (140) is extremely large for iEEG studies that typically involve <10 subjects. Also, 444 face selective recording contacts provide a very nice sampling of the areas of interest.

      Weaknesses:

      (1) A control analysis where areas have known differences in response onset should be performed to increase confidence that the proposed analyses would reveal expected results when a difference in response onset was present across areas. From Figure 3, it can be seen that many electrodes are placed in earlier visual areas (V1-V3) that have previously been shown to have earlier broadband responses to visual images compared to VOTC (e.g. Martin et al., 2019, JNeurosci https://doi.org/10.1523/JNEUROSCI.1889-18.2018). The same analyses as in Figures 4 and 5 should be used comparing VOTC to early visual areas to confirm that the analyses would detect that V1-V3 have earlier onsets compared to VOTC.

      (2) It is unclear why correlating mean timeseries helps understand how much variance is shared between regions (Figure 4). Any variance between images is lost when averaging time series across all images, and this metric thus overestimates the variance shared between areas. Moreover, the finding that correlating time domain signals across VOTC areas does not differ from correlating signals within an area could be driven by this averaging. For example, if the same analysis was done on electrodes in left and right V1 when half of the images had contrast in the left hemifield and the other half had contrast in the right hemifield, the average signals may correlate extremely well, while this correlation falls apart on a trial-by-trial basis. These analyses therefore need to be evaluated on a trial-by-trial basis.

      (3) Previous studies on visual processing in VOTC have shown that evoked potentials are more predictive of the onset of visual stimuli than broadband activity (e.g. Miller et al., 2016, PLOS CB, https://doi.org/10.1371/journal.pcbi.1004660). Testing the prediction from a hierarchical representation that signals along the VOTC increase in onset time should therefore include an evaluation of evoked potential onsets in addition to broadband signals.

      (4) Testing the second prediction, that the onset time of processing increases along the VOTC posterior to anterior path, is difficult using the iEEG broadband signal, because from a signal processing perspective, broadband signals are inherently temporally inaccurate, given that they are filtered. Any filtering in the signal introduces a certain level of temporal smoothing. The manuscript should clearly describe the level of temporal smoothing for the filter settings used.

      (5) The onsets of neural activity in VOTC are surprisingly early: around 80-100 ms. This is earlier than what has previously been reported. For example, the cited Quian Quiroga et al. (2023) found single neuron responses to have the earliest onset around 125 ms (their Figure 3). Similarly, the cited Jacques et al., 2016b and Kadipasaoglu et al., 2017 papers also observe broadband onsets in VOTC after 100 ms. Understanding the temporal smoothing in the broadband signal, as well as showing that typical evoked potentials have latencies comparable to other work, would increase confidence that latencies are not underestimated due to factors in the analysis pipeline.

      (6) Understanding the extent to which neural processing in the VOTC is hierarchical is essential for building models of vision that capture processing in the human brain, and the data provides novel insight into these processes.
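
      The concern in points (4) and (5), that temporal smoothing can make onsets appear earlier than they are, can be illustrated with a minimal sketch. It assumes a hypothetical step response at 100 ms sampled at 1 kHz and a symmetric (acausal) Gaussian kernel as a stand-in for whatever filtering the broadband extraction actually applies; the kernel width, the 5% threshold, and all function names are illustrative assumptions, not the authors' settings.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Normalized symmetric Gaussian kernel (sigma and half_width in samples)."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_zero_phase(signal, kernel):
    """Convolve with a symmetric kernel, padding the edges with the end values."""
    half = len(kernel) // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp index at the edges
            acc += w * signal[j]
        smoothed.append(acc)
    return smoothed


def first_crossing(signal, threshold):
    """Index of the first sample above threshold, or None if there is none."""
    for i, value in enumerate(signal):
        if value > threshold:
            return i
    return None


# Hypothetical response: silent until 100 ms, then a sustained step (1 sample = 1 ms).
true_onset_ms = 100
response = [0.0] * true_onset_ms + [1.0] * 200

# Assumed smoothing: Gaussian with a 20 ms standard deviation.
kernel = gaussian_kernel(sigma=20, half_width=80)
smoothed = smooth_zero_phase(response, kernel)

onset_raw = first_crossing(response, 0.05)
onset_smoothed = first_crossing(smoothed, 0.05)
print(f"raw onset: {onset_raw} ms, onset after smoothing: {onset_smoothed} ms")
```

      Because the kernel is symmetric, part of the response is spread backwards in time, so a threshold-based onset on the smoothed trace precedes the true onset. Reporting the effective smoothing width of the actual filter would let readers judge how large this bias could be in the data.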

      For additional context, a schematic figure of the hierarchical view and a more parallel system described in the paragraph on models of visual recognition (lines 553) would help the reader interpret and understand the implications of the paper.
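
      The left/right V1 example in point (2) can be made concrete with a small simulation. The sketch below uses simulated data only and is not the authors' analysis: two hypothetical sites respond on alternating trials, with an assumed Gaussian-shaped evoked response and assumed noise level, and the correlation of trial-averaged time series is compared with the mean of within-trial correlations.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_two_sites(n_trials=200, n_time=100, noise_sd=0.2, seed=0):
    """Per-trial time series for two hypothetical sites A and B.

    Site A responds only on even trials and site B only on odd trials,
    mimicking stimuli that alternate between hemifields.
    """
    rng = random.Random(seed)
    # Assumed evoked time course: a transient bump peaking at sample 40.
    evoked = [math.exp(-(((t - 40) / 10) ** 2)) for t in range(n_time)]
    site_a, site_b = [], []
    for trial in range(n_trials):
        gain_a = 1.0 if trial % 2 == 0 else 0.0
        gain_b = 1.0 - gain_a
        site_a.append([gain_a * e + rng.gauss(0, noise_sd) for e in evoked])
        site_b.append([gain_b * e + rng.gauss(0, noise_sd) for e in evoked])
    return site_a, site_b


def mean_timeseries(trials):
    """Average across trials at each time point."""
    return [sum(column) / len(column) for column in zip(*trials)]


site_a, site_b = simulate_two_sites()

# Correlation of the trial-averaged time series.
r_average = pearson(mean_timeseries(site_a), mean_timeseries(site_b))

# Mean of the correlations computed within each trial.
r_trials = [pearson(a, b) for a, b in zip(site_a, site_b)]
r_trial_mean = sum(r_trials) / len(r_trials)

print(f"averaged: {r_average:.2f}, trial-by-trial: {r_trial_mean:.2f}")
```

      In this toy case the two sites never respond on the same trial, yet their averaged time series are almost perfectly correlated, which is why a trial-by-trial evaluation is requested.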

    2. Reviewer #2 (Public review):

      Summary:

      This very ambitious project addresses one of the core questions in visual processing related to the underlying anatomical and functional architecture. Using a large sample of rare and high-quality EEG recordings in humans, the authors assess whether face-selectivity is organised along a posterior-anterior gradient, with selectivity and timing increasing from posterior to anterior regions. The evidence suggests that it is the case for selectivity, but the data are more mixed about the temporal organisation, which the authors use to conclude that the classic temporal hierarchy described in textbooks might be questioned, at least when it comes to face processing.

      Strengths:

      A huge amount of work went into collecting this highly valuable dataset of rare intracranial EEG recordings in humans. The data alone are valuable, assuming they are shared in an easily accessible and documented format. Currently, the OSF repository linked in the article is empty, so no assessment of the data can be made. The topic is important, and a key question in the field is addressed. The EEG methodology is strong, relying on a well-established and high SNR SSVEP method. The method is particularly well-suited to clinical populations, leading to interpretable data in a few minutes of recordings. The authors have attempted to quantify the data in many different ways and provided various estimates of selectivity and timing, with matching measures of uncertainty. Non-parametric confidence intervals and comparisons are provided. Collectively, the various analyses and rich illustrations provide superficially convincing evidence in favour of the conclusions.

      Weaknesses:

      (1) The work was not pre-registered, and there is no sample size justification, whether for participants or trials/sequences. So a statistical reviewer should assess the sensitivity of the analyses to different approaches.

      (2) Frequentist NHST is used to claim lack of effects, which is inappropriate, see for instance:

      Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology, 31(4), 337-350. https://doi.org/10.1007/s10654-016-0149-3

      Rouder, J. N., Morey, R. D., Verhagen, J., Province, J. M., & Wagenmakers, E.-J. (2016). Is There a Free Lunch in Inference? Topics in Cognitive Science, 8(3), 520-547. https://doi.org/10.1111/tops.12214

      (3) In the frequentist realm, demonstrating similar effects between groups requires equivalence testing, with bounds (minimum effect sizes of interest) that should be pre-registered:

      Campbell, H., & Gustafson, P. (2024). The Bayes factor, HDI-ROPE, and frequentist equivalence tests can all be reverse engineered-Almost exactly-From one another: Reply to Linde et al. (2021). Psychological Methods, 29(3), 613-623. https://doi.org/10.1037/met0000507

      Riesthuis, P. (2024). Simulation-Based Power Analyses for the Smallest Effect Size of Interest: A Confidence-Interval Approach for Minimum-Effect and Equivalence Testing. Advances in Methods and Practices in Psychological Science, 7(2), 25152459241240722. https://doi.org/10.1177/25152459241240722

      (4) The lack of consideration for sample sizes, the lack of pre-registration, and the lack of a method to support the null (a cornerstone of this project to demonstrate equivalent onsets between areas), suggest that the work is exploratory. This is a strength: we need rich datasets to explore, test tools and generate new hypotheses. I strongly recommend embracing the exploration philosophy, and removing all inferential statistics: instead, provide even more detailed graphical representations (include onset distributions) and share the data immediately with all the pre-processing and analysis code.

      (5) Even if the work was pre-registered, it would be very difficult to calculate p-values conditional on all the uncertainty around the number of participants, the number of contacts and the number of trials, as they are random variables, and sampling distributions of key inferences should be integrated over these unknown sources of variability. The difficulty of calculating/interpreting p-values that are conditional on so many pre-processing stages and sources of uncertainty is traditionally swept under the rug, but nevertheless well documented:

      Kruschke, J.K. (2013) Bayesian estimation supersedes the t test. J Exp Psychol Gen, 142, 573-603. https://pubmed.ncbi.nlm.nih.gov/22774788/

      Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14(5), 779-804. https://doi.org/10.3758/BF03194105 https://link.springer.com/article/10.3758/BF03194105

      (6) Currently, there is no convincing evidence in the article to clearly support the main claims.

      Bootstrap confidence intervals were used to provide measures of uncertainty. However, the bootstrapping did not take the structure of the data into account, collapsing across important dependencies in that nested structure: participants > hemispheres > contacts > conditions > trials.

      Ignoring data dependencies and the uncertainty from trials could lead to a distorted CI. Sampling contacts with replacement is inappropriate because it breaks the structure of the data, mixing degrees of freedom across different levels of analysis. The key rule of the bootstrap is to follow the data acquisition process, and therefore, sampling participants with replacement should come first. In a hierarchical bootstrap, the process can be repeated at nested levels, so that for each resampled participant, then contacts are resampled (if treated as a random variable), then trials/sequences are resampled, keeping paired measurements together (hemispheres, and typically contacts in a standard EEG experiment with fixed montage). The same hierarchical resampling should be applied to all measurements and inferences to capture all sources of variability. Selectivity and timing should be quantified at each contact after resampling of trials/sequences before integrating across hemispheres and participants using appropriate and justified summary measures.

      The authors already recognise part of the problem, as they provide within-participant analyses. This is a very good step, inasmuch as it addresses the issue of mixing-up degrees of freedom across levels, but unfortunately these analyses are plagued with small sample sizes, making claims about the lack of differences even more problematic--classic lack of evidence == evidence of absence fallacy. In addition, there seem to be discrepancies between the mean and CI in some cases: 15 [-20, 20]; 8 [-24, 24].

      (7) Three other issues related to onsets:

      (a) FDR correction typically doesn't allow localisation claims, similarly to cluster inferences:

      Winkler, A. M., Taylor, P. A., Nichols, T. E., & Rorden, C. (2024). False Discovery Rate and Localizing Power (No. arXiv:2401.03554). arXiv. https://doi.org/10.48550/arXiv.2401.03554

      Rousselet, G. A. (2025). Using cluster-based permutation tests to estimate MEG/EEG onsets: How bad is it? European Journal of Neuroscience, 61(1), e16618. https://doi.org/10.1111/ejn.16618

      (b) Percentile bootstrap confidence intervals are inaccurate when applied to means. Alternatively, use a bootstrap-t method, or use the percentile bootstrap in conjunction with a robust measure of central tendency, such as a trimmed mean.

      Rousselet, G. A., Pernet, C. R., & Wilcox, R. R. (2021). The Percentile Bootstrap: A Primer With Step-by-Step Instructions in R. Advances in Methods and Practices in Psychological Science, 4(1), 2515245920911881. https://doi.org/10.1177/2515245920911881

      (c) Defining onsets based on an arbitrary "at least 30 ms" rule is not recommended:

      Piai, V., Dahlslätt, K., & Maris, E. (2015). Statistically comparing EEG/MEG waveforms through successive significant univariate tests: How bad can it be? Psychophysiology, 52(3), 440-443. https://doi.org/10.1111/psyp.12335

      (8) Figure 5 and matching analyses: There are much better tools than correlations to estimate connectivity and directionality. See for instance:

      Ince, R. A. A., Giordano, B. L., Kayser, C., Rousselet, G. A., Gross, J., & Schyns, P. G. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573. https://doi.org/10.1002/hbm.23471

      (9) Pearson correlation is sensitive to other features of the data than an association, and is maximally sensitive to linear associations. Interpretation is difficult without seeing matching scatterplots and getting confirmation from alternative robust methods.
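
      The hierarchical bootstrap described in point (6), combined with the trimmed mean suggested in point (7b), can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescription for the authors' data: the nested list layout (participants > contacts > trials), the toy data generator, the 20% trimming, and all names are hypothetical, and contacts are treated as a random variable. Paired measurements such as hemispheres are not handled here and would need to be kept together during resampling.

```python
import random


def mean(values):
    return sum(values) / len(values)


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the given proportion of values from each tail."""
    ordered = sorted(values)
    cut = int(proportion * len(ordered))
    return mean(ordered[cut:len(ordered) - cut])


def resample(items, rng):
    """Sample len(items) elements with replacement."""
    return [rng.choice(items) for _ in items]


def hierarchical_bootstrap(data, n_boot=500, seed=1):
    """Percentile CI of the trimmed mean, resampling at every nested level.

    data: list of participants, each a list of contacts,
    each a list of single-trial measurements.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        participant_values = []
        for contacts in resample(data, rng):          # level 1: participants
            contact_values = []
            for trials in resample(contacts, rng):    # level 2: contacts
                # level 3: trials, summarized per contact after resampling
                contact_values.append(mean(resample(trials, rng)))
            participant_values.append(mean(contact_values))
        estimates.append(trimmed_mean(participant_values))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper


def make_toy_data(n_participants=12, seed=0):
    """Simulated measurements with participant, contact, and trial variability."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_participants):
        participant_mean = rng.gauss(100, 10)
        contacts = []
        for _ in range(rng.randint(3, 6)):
            contact_mean = participant_mean + rng.gauss(0, 5)
            contacts.append([rng.gauss(contact_mean, 15) for _ in range(20)])
        data.append(contacts)
    return data


ci_low, ci_high = hierarchical_bootstrap(make_toy_data())
print(f"95% CI of the trimmed mean: [{ci_low:.1f}, {ci_high:.1f}]")
```

      Resampling participants first follows the data acquisition process, so the width of the interval reflects between-participant variability rather than the much larger number of contacts or trials.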

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors elegantly combined latent variable models (i.e., HMM, GPFA and dynamical system models) with a calcium imaging observation model (i.e., latent Poisson spiking and autoregressive calcium dynamics (AR)).
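
      For readers unfamiliar with this kind of observation model, the generative chain from latent variable to spikes to fluorescence can be sketched as below. This is a minimal illustration and not the authors' implementation: the exponential link, the first-order autoregressive (AR(1)) calcium decay, the Gaussian measurement noise, and all parameter values and names are assumptions chosen for the example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson count with Knuth's method (adequate for small rates)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_fluorescence(latent, bias=-2.0, decay=0.95, amplitude=1.0,
                          noise_sd=0.1, seed=0):
    """Simulate spikes, calcium, and fluorescence for one neuron.

    latent -> Poisson spikes with rate exp(latent + bias) per time bin
    spikes -> calcium via c[t] = decay * c[t-1] + amplitude * s[t]
    calcium -> fluorescence by adding Gaussian noise
    """
    rng = random.Random(seed)
    spikes, calcium, fluorescence = [], [], []
    previous = 0.0
    for x in latent:
        s = sample_poisson(math.exp(x + bias), rng)
        c = decay * previous + amplitude * s
        spikes.append(s)
        calcium.append(c)
        fluorescence.append(c + rng.gauss(0.0, noise_sd))
        previous = c
    return spikes, calcium, fluorescence


# Hypothetical one-dimensional latent: a slow sinusoid over 500 time bins.
latent = [math.sin(2 * math.pi * t / 100) for t in range(500)]
spikes, calcium, fluorescence = simulate_fluorescence(latent)
print(f"{sum(spikes)} spikes, peak calcium {max(calcium):.2f}")
```

      Fitting a latent variable model through this chain means inferring the latent directly from the fluorescence, rather than first deconvolving spikes and then fitting the latent model to the spike estimates.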

      Strengths:

      Integrating a calcium observation model into existing latent variable models significantly improves the inference of latent neural states compared to existing approaches such as spike deconvolution or Gaussian assumptions. The authors also provide open-source access to their method for direct application to calcium imaging data analysis.

      Weaknesses:

      As acknowledged by the authors, their method is dependent on the quality of calcium trace extraction from fluorescence videos. It should be noted that this limitation applies to alternative strategies.

      While the contribution of this study should prove useful for researchers using calcium imaging, the novelty is limited, as it consists of an integration of the calcium imaging model from Ganmor et al. 2016 with existing LVM frameworks.

    2. Reviewer #2 (Public review):

      Summary:

      This compelling study proposes a framework to implement latent variable models using population level calcium imaging data. The study incorporates autoregressive dynamics and latent Poisson spiking to improve inference of latent states across different model classes including HMMs, Gaussian Process Factor Analysis and nonlinear dynamical systems models. This approach allows for a more seamless integration of existing methods typically used with spiking data to apply on calcium imaging data. The authors test the model on piriform cortex recordings as well as a biophysical simulator to validate their methods. This approach promises to have wide usability for neuroscientists using large population level calcium imaging.

      Strengths:

      The strengths of this study are the flexibility in the choice of models and relatively easy adaptation to user-specific use cases.

      Weaknesses:

      The weakness of the study lies in its limited validation of biological calcium imaging data. Calcium dynamics in a task-specific context in a sensory brain region might be very different from slower dynamics in a region of integration. The biophysical properties of the data would also be dependent on the SNR of the imaging platform and the generation of calcium indicator being used.

    3. Reviewer #3 (Public review):

      Summary:

      S. Keeley & collaborators propose a computational approach to infer time-varying latent variables directly from calcium traces (for instance, obtained with 2p imaging) without the need for deconvolving the traces into spike trains in a preliminary, independent step. Their approach rests on 1 of 3 families of latent models: GPFA, HMM and dynamical systems - which they augment with an observation model that maps latent variables to fluorescence traces. They validate their approach on simulated and real data, showing that the approach improves latent variable inference and model fitting, compared to more traditional approaches (although not directly compared with the 2-step one; see below). They provide a GitHub repository with code to fit their models (which I have not tested).

      Strengths:

      The approach is sound and well-motivated. The authors are specialists in latent variable models. The manuscript is succinct, well-written, and the figures are clear. I particularly liked the diversity of latent models considered, in particular latent models with continuous (GPFA) vs. discrete (HMM) dynamics, which are useful for characterizing different types of neural computations. The validation on both simulated and real data is convincing.

      Weaknesses:

      The main weakness that I see is that the approach is tested only on a single real dataset (odor response dataset). The other model fits are obtained from simulated data. While the results are convincing, it would be useful to see the approach tested on other datasets, for instance, datasets with different brain areas, different behavioral conditions, or different calcium indicators. This would help assess the generality of the approach and its robustness to different experimental conditions.

      The other points below mostly pertain to clarifications and possible extensions of the approach, and to simple model recovery experiments that would help quantify the advantage of the proposed approach over more traditional ones.

      I have a question related to interpretability and diagnosis of model fits. One advantage of the two-step approach, (1) deconvolution => (2) latent variable inference, is that one can inspect the quality of the deconvolution step independently from the latent variable inference step. In the proposed approach, it seems more difficult to diagnose potential problems with model fitting. For instance, if the inferred latent variables are not interpretable, how can one determine whether this is due to a poor choice of latent model (e.g., HMM with too few states), or a poor fit of the observation model (e.g., wrong parameters for the calcium dynamics)? Are there any diagnostic tools that could help identify potential problems with model fitting?

      Could the authors comment on whether their approach allows for instance to compare different forms of latent models (e.g., HMM vs. GPFA) in terms of model evidence, cross-validated log-likelihood or other model comparison metrics? This would be useful to quantitatively determine which type of latent dynamics is more appropriate for a given dataset.

      The HMM part reveals a pretty large number of states, with one state being interpretable (evoked response). Shouldn't we expect a simpler scenario, with 2 states? I know this is a difficult question that is more general and common with HMM approaches, but it would be useful to discuss this point. For instance, would a hierarchical HMM (with a smaller number of "super-states") be more appropriate here?

      While it certainly makes sense that models accounting for the full transformation of latent => spikes => fluorescence data should outperform the two-step (1) deconvolution => (2) latent variable inference approach, the amount of improvement is not clear. A direct comparison (e.g., w/ parameter & model recovery metrics) between the two approaches on simulated data would be useful to quantify the advantage of the proposed approach over more traditional ones.

      It would be useful to discuss the possible extension of the approach to other types of data that are related to neural activity but have different observation models, e.g., voltage imaging, or neuromodulator sensors (e.g., GRAB-NE, dLight, etc). Do the authors see any specific challenges that would arise in these cases and that would need to be addressed in the future (other than changing the Poisson spiking part)?

    1. Reviewer #1 (Public review):

      I read this paper with great interest based on my experience in insect sciences. I have some minor comments (and recommendations) that I believe the authors should address.

      (1) The paper has an original biological question that is overly broad and mechanistically ambitious. The central biological question, namely how CLas infection enhances fecundity of Diaphorina citri via dopamine signaling, is clearly stated and well motivated by previous literature. However, my advice to the authors is that, while the general question is clear, the manuscript attempts to answer multiple mechanistic layers simultaneously. As a result, I feel that the biological narrative becomes diffuse, especially in later sections where DA, miRNA regulation, AKH signaling, and JH signaling are all proposed as parts of a single linear cascade. In summary, my key concern is that the paper often moves from correlation to causal hierarchy without fully disentangling whether these pathways act sequentially, in parallel, or redundantly. A more explicitly framed primary hypothesis (e.g., "DA-DcDop2 is necessary and sufficient for CLas-induced fecundity") may improve conceptual clarity.

      (2) On the novelty of the data, I feel they are moderately novel, with substantial confirmatory components. If I am correct, the novel contributions include the identification of DcDop2 as the DA receptor responsive to CLas infection in D. citri, the discovery that miR-31a directly targets DcDop2, which is supported by luciferase assays and RIP, and thirdly, the integration of dopamine signaling into the already-described CLas-AKH-JH-fecundity framework. My advice to the authors is to focus more on the manuscript's novelty, which lies more in pathway integration than in discovering fundamentally new biological phenomena. This is appropriate for a mechanistic paper, but should be framed as an extension of existing models rather than a paradigm shift.

      (3) On the conclusions, I recommend that the authors modify their statements a little. I feel that there are some overstated or insufficiently supported claims. First, the assertion that CLas "hijacks" the DA-DcDop2-miR-31a-AKH-JH cascade implies direct pathogen manipulation, but no CLas-derived effector or mechanism is identified. Second, the model suggests a linear signaling hierarchy, but the data largely show correlation and partial dependency rather than strict epistasis. Third, the term "mutualistic interaction" may be too strong, as host fitness costs outside fecundity (e.g., longevity, immunity) are not evaluated. In conclusion, I confirm that the data support a functional association, but mechanistic causality and evolutionary interpretation are somewhat overstated.

    2. Reviewer #2 (Public review):

      Summary:

      Nian and colleagues comprehensively apply metabolomics, molecular, and genetic approaches to demonstrate that CLas hijacks the DA/DcDop2-miR-31a-AKH-JH signaling cascade to enhance lipid metabolism and fecundity in D. citri, while concurrently promoting its own replication.

      Strengths:

      These findings provide solid evidence of a mutualistic interaction between CLas proliferation and ovarian development in the insect host. This insight significantly advances our understanding of the molecular interplay between plant pathogens and vector insects, and offers novel targets and strategies for HLB field management.

      Weaknesses:

      While the article investigates the involvement of dopamine signaling and specific microRNAs in enhancing fecundity and pathogen proliferation, it still needs to provide a detailed mechanistic understanding of these interactions. The precise molecular pathways and feedback mechanisms by which CLas manipulates dopamine signaling in Diaphorina citri remain unclear.

    1. Reviewer #1 (Public review):

      Summary:

      Large language models (LLMs) have been developed rapidly in recent years and are already contributing to progress across scientific fields. The manuscript tries to address a specific question: whether LLMs can accurately infer signaling networks from gene lists. However, the evaluation is inadequate due to four major weaknesses described below. Despite these limitations, the authors conclude that current general-purpose LLMs lack adequate accuracy, which is already widely recognized. Its key contribution should instead be to provide concrete recommendations for the development of specialized LLMs for this task, which is completely absent. Developing such specific LLMs would be highly valuable, as they could substantially reduce the time required by researchers to analyze signaling networks.

      Strengths:

      The manuscript raises a good question: whether current LLMs can accurately generate signaling networks from gene lists.

      Weaknesses:

      (1) The authors evaluate LLM performance using only three signaling networks: "hypertrophy", "fibroblast", and "mechanosignaling". Given the large number of well-established signaling pathways available, this is not a comprehensive assessment. Moreover, the analysis need not be restricted to signaling networks. Other network types, including metabolic and transcriptional regulatory networks, are already accessible in well-known databases such as KEGG, Reactome, BioCyc, WikiPathways, and Pathway Commons. Including these additional networks would substantially strengthen the evaluation.

      (2) In LLM evaluation, the authors use the gene lists that exactly match those in their "ground truth" networks, thereby fixing the set of nodes and evaluating only the predicted edges. However, in practical research, the relevant genes or nodes are not fully known. A more realistic assessment would therefore include gene lists with both genes present in the ground-truth network and additional genes absent from it, to evaluate the ability of the LLM to exclude irrelevant genes.

      (3) The authors report only the recall/sensitivity of the LLM, without assessing specificity. In practical applications, if an LLM generates a large number of incorrect interactions that greatly exceed the correct ones, researchers may be misled or may lose confidence in the LLM output. Therefore, a comprehensive evaluation must include both sensitivity and specificity. Furthermore, it would be informative to check whether some of the "false positives" might in fact represent biologically plausible interactions that are absent from the manually curated "ground truth". Manually generated "ground truth" can overlook genuine interactions, and the ability of LLMs to recover such missing edges could be particularly valuable. This may even represent one of the most important potential contributions of LLMs.

      (4) It is widely known that applying differential equation models to highly complex biological networks, such as the three networks in the manuscript, is meaningless, because these systems involve a large number of parameters whose values can drastically alter the results. As John von Neumann is reported to have said: "with four parameters I can fit an elephant, and with five I can make him wiggle his trunk." Thus, the evaluation of LLMs on "logic-based differential equation models" does not make much sense.
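
      The sensitivity/specificity evaluation requested in point (3) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes directed edges without self-loops over a fixed node set, and the node names and edge lists are hypothetical placeholders.

```python
from itertools import permutations


def edge_confusion(nodes, truth_edges, predicted_edges):
    """Count TP, FP, FN, TN over every possible directed edge (no self-loops)."""
    candidates = set(permutations(nodes, 2))  # all ordered node pairs
    truth = set(truth_edges) & candidates
    pred = set(predicted_edges) & candidates  # drop edges outside the node set
    tp = len(pred & truth)
    fp = len(pred - truth)
    fn = len(truth - pred)
    tn = len(candidates) - tp - fp - fn
    return tp, fp, fn, tn


def edge_metrics(nodes, truth_edges, predicted_edges):
    """Recall (sensitivity), specificity, and precision for predicted edges."""
    tp, fp, fn, tn = edge_confusion(nodes, truth_edges, predicted_edges)
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical toy network: placeholder node names, not real genes.
nodes = ["A", "B", "C", "D"]
truth = [("A", "B"), ("B", "C"), ("C", "D")]
pred = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "A")]
metrics = edge_metrics(nodes, truth, pred)
```

      Because curated signaling networks are sparse, true negatives dominate and specificity tends to look high even when many predicted edges are wrong. Reporting precision alongside it gives a more direct view of how many LLM-proposed interactions are unsupported by the curated network.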

    2. Reviewer #2 (Public review):

      Summary:

      The authors evaluate whether commonly used LLMs (ChatGPT, Claude and Gemini) can reconstruct signalling networks and predict effects of network perturbations, and propose a pipeline for benchmarking future models. Across three phenotypes (hypertrophy, fibroblast signalling, and mechanosignalling), LLMs capture upstream ligand-receptor interactions and conserved crosstalk but fail to recover downstream transcriptional programmes. Logic-based simulations show that LLM-derived networks underperform compared to manually curated models. The authors also propose that their pipeline can be used for benchmarking future models aimed at reconstructing signalling networks.

      Strength:

      The authors compare the outcomes from three LLMs with three manually curated and validated models. Additionally, they have investigated gene network reconstruction in the context of three distinct phenotypes. Using logic-based modelling, the authors assessed how LLM-derived networks predict perturbation effects, providing functional validation beyond network overlap.

      Weaknesses:

      The authors have used legacy models for all three LLMs, and the study would benefit from testing the current versions of the LLMs (ChatGPT 5.2, Claude 4.5 and Gemini 2.5). Additional metrics such as node coverage, node invention, direction accuracy and sign accuracy would be useful to make robust comparisons across models.
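
      One possible way to compute the additional metrics is sketched below. The definitions are assumptions made for illustration (the review does not define them), and the node names, edge signs, and the function name `network_metrics` are hypothetical. Edges are stored as a mapping from (source, target) to a sign, +1 for activation and -1 for inhibition.

```python
def network_metrics(truth, pred):
    """Compare a predicted signed, directed network against a curated one.

    Assumed definitions:
      node_coverage      - fraction of curated nodes present in the prediction
      node_invention     - fraction of predicted nodes absent from the curated set
      direction_accuracy - among predicted edges whose node pair is linked in the
                           curated network (either orientation), the fraction
                           with the curated orientation
      sign_accuracy      - among correctly oriented edges, the fraction whose
                           sign matches the curated sign
    """
    truth_nodes = {n for edge in truth for n in edge}
    pred_nodes = {n for edge in pred for n in edge}
    linked_pairs = {frozenset(edge) for edge in truth}

    shared = [e for e in pred if frozenset(e) in linked_pairs]
    oriented = [e for e in shared if e in truth]
    signed = [e for e in oriented if pred[e] == truth[e]]

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "node_coverage": ratio(len(truth_nodes & pred_nodes), len(truth_nodes)),
        "node_invention": ratio(len(pred_nodes - truth_nodes), len(pred_nodes)),
        "direction_accuracy": ratio(len(oriented), len(shared)),
        "sign_accuracy": ratio(len(signed), len(oriented)),
    }


# Hypothetical toy networks with placeholder node names.
truth = {("L", "R"): 1, ("R", "K"): 1, ("K", "T"): -1}
pred = {("L", "R"): 1, ("K", "R"): 1, ("K", "T"): 1, ("R", "X"): 1}
m = network_metrics(truth, pred)
```

      Conditioning direction accuracy on recovered node pairs, and sign accuracy on correctly oriented edges, keeps the three error types separate so that a model is not penalized twice for the same missing edge.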

    1. Reviewer #1 (Public review):

      Summary:

      The authors use methylphenidate (MPH) administration after learning a Pavlovian-to-instrumental transfer (PIT) task to parse decision making from instrumental influences. While the main pharmacological effects were null, individual differences in working memory ability moderated the tendency of MPH to boost cognitive control in order to override PIT-biased instrumental learning. Importantly, this working memory moderator had symmetrical effects in appetitive and aversive conditions, and these patterns replicated within each valence condition across different values of gain/loss (Fig S1c), suggesting a reliable effect that generalizes across instances of Pavlovian influence.

      Strengths:

      The idea of using pharmacological challenge after learning but prior to transfer is a novel technique that highlights the influence of catecholamines on the expression of learning under Pavlovian bias, and importantly it dissociates this decision feature from the learning of stimulus-outcome or action-outcome pairings.

      Comments on revisions:

      I have no further recommendations or concerns.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Geurts et al. investigated the effects of the catecholamine reuptake inhibitor methylphenidate (MPH) on value-based decision making using a combination of aversive and appetitive Pavlovian-to-Instrumental Transfer (PIT) in a human cohort. Using an elegant behavioural design, they showed valence- and action-specific effects of Pavlovian cues on instrumental responses. Initial analyses showed no effect of MPH on these processes. However, the authors performed a more in-depth analysis and demonstrated that MPH actually modulates PIT in an action-specific manner, depending on individual working memory capacities. The authors interpret this as an effect on cognitive control of Pavlovian biasing of actions and decision-making rather than an invigoration of motivational biases.

      Strengths:

      A major strength of this study is its experimental design. The elegant combination of appetitive and aversive Pavlovian learning with approach/avoidance instrumental actions allows the authors to precisely investigate the differential modulation of value-based decision making, depending on the context and environmental stimuli. Importantly, MPH was only administered after Pavlovian and instrumental learning, restricting the effect to PIT performance only. Finally, the use of a placebo-controlled crossover design allows within-subject comparisons between the PIT effect under placebo and MPH and the investigation of the relationships between working memory abilities, PIT and MPH effects.

      Weaknesses:

      Previous weaknesses regarding the neurobiological circuits underlying such effects and the possible role of dopamine vs noradrenaline have been clearly discussed in the new version of the manuscript.

      Comments on revisions:

      The authors answered my previous points. The changes to the manuscript clearly improve the clarity of the results and the strength of the study.

    1. Reviewer #1 (Public review):

      This manuscript discusses, from a theoretical point of view, the mechanisms underlying the formation of specialized or mixed factories. To investigate this, a chromatin polymer model was developed to mimic the chromatin binding-unbinding dynamics of various complexes of transcription factors (TFs).

      The model revealed that both specialized (i.e., demixed) and mixed clusters can emerge spontaneously, with the type of cluster formed primarily determined by cluster size. Non-specific interactions between chromatin and proteins were identified as the main factor promoting mixing, with these interactions becoming increasingly significant as clusters grow larger.

      These findings, observed in both simple polymer models and more realistic representations of human chromosomes, reconcile previously conflicting experimental results. Additionally, the introduction of different types of TFs was shown to strongly influence the emergence of transcriptional networks, offering a framework to study transcriptional changes resulting from gene editing or naturally occurring mutations.

      Overall I think this is an interesting paper discussing a valuable model of how chromosome 3D organisation is linked to transcription.

      Comments on revisions: It's a good paper.

    2. Reviewer #2 (Public review):

      Summary:

      With this report, I suggest what are in my opinion crucial additions to the otherwise very interesting and credible research manuscript "Cluster size determines morphology of transcription factories in human cells".

      Strengths:

      The manuscript in itself is technically sound, the chosen simulation methods are completely appropriate, the figures are well-prepared, and the text is mostly well-written, save for a few typos. The conclusions are valid and would represent a valuable conceptual contribution to the field of clustering, 3D genome organization and gene regulation related to transcription factories, which continues to be an area of most active investigation.

      Weaknesses:

      However, I find that the connection to concrete biological data is weak. This holds especially given that the data needed to critically assess the applicability of the derived cross-over with factory size are, in fact, available for analysis, and the experiments suggested in the Discussion section have actually been done and their results can be exploited. In my judgement, unless these additional analyses are added to a level that crucial predictions on TF demixing and transcriptional bursting upon TU clustering can be tested, the paper is better suited to a theoretical biophysics venue than to a biology journal such as eLife.

      Comments on revisions:

      The authors have addressed my comments with exemplary diligence, which has clarified all my major concerns. In all cases, either the relevant work was added, or it was explained in the form of a convincing argument why the suggested modifications were not implemented or not possible to implement.

      As a discretionary suggestion, the authors might consider using a title that even more directly highlights the, in my opinion, main take-away of this work. This is not because anything is incorrect about the current title, simply an even more to-the-point title might attract more readers. I would suggest something along the lines of

      "Cluster size-dependent demixing drives specialization of transcription factories"

      Overall, I congratulate the authors on their excellent work and appreciate the opportunity to engage with this manuscript during a very insightful review process.

    3. Reviewer #3 (Public review):

      Summary:

      In this work, the authors present a chromatin polymer model with some specific pattern of transcription units (TUs) and diffusing TFs; they simulate the model and study TF clustering, mixing, gene expression activity, and their correlations. First, the authors designed a toy polymer with colored beads of a random type, placed periodically (every 30 beads, or 90kb). These colored beads are considered transcription units (TUs). Same-colored TUs attract each other, mediated by similarly colored diffusing beads considered as TFs. This led to clustering (condensation of beads) and correlated (or anti-correlated) "gene expression" patterns. Beyond the toy model, when the authors introduce TUs in a specific pattern, it leads to the emergence of specialized and mixed clusters of different TFs. Human chromatin models with a realistic distribution of TUs also lead to the mixing of TFs when cluster size is large.

      Strengths:

      This is a valuable polymer model for chromatin with a specific pattern of TUs and diffusing TF-like beads. Simulation of the model tests many interesting ideas. The simulation study is convincing and the results provide solid evidence showing the emergence of mixed and demixed TF clusters within the assumptions of the model.

    1. Reviewer #1 (Public review):

      In this manuscript, Aghabi et al. present a comprehensive characterization of ZFT, a metal transporter located at the plasma membrane of the eukaryotic parasite Toxoplasma gondii. The authors provide convincing evidence that ZFT plays a crucial role in parasite fitness, as demonstrated by the generation of a conditional knock-down mutant cell line, which exhibits a marked impact on mitochondrial respiration, a process dependent on several iron-containing proteins. Consistent with previous reports, the authors also show that disruption of mitochondrial metabolism leads to conversion into the persistent bradyzoite stage.

      The study then employed advanced techniques, such as inductively coupled plasma-mass spectrometry (ICP-MS) and X-ray fluorescence microscopy (XFM), to demonstrate that ZFT depletion results in reduced parasite-associated metals, particularly iron and zinc. Additionally, the authors show that ZFT expression is modulated by the availability of these metals, although defects in the transporter could not be compensated by exogenous addition of iron or zinc. Finally, the authors used heterologous expression of ZFT in Xenopus oocytes and yeast mutants, highlighting the dual substrate specificity of the transporter. The ability of ZFT to transport both iron and zinc is thus supported by two experimental approaches in heterologous systems. First, expression of Toxoplasma ZFT compensates for a lack of zinc transport in yeast, demonstrating that ZFT can transport zinc. Second, assays in the Xenopus oocyte model show that ZFT can transport iron. Furthermore, phenotypic analyses suggest defects in iron availability upon ZFT depletion, particularly with regard to Fe-S mitochondrial proteins and mitochondrial function.

      Overall, the manuscript provides a solid, well-rounded argument for ZFT's role in metal transport, using a combination of complementary approaches. The converging evidence, including changes in metal concentrations upon ZFT depletion, data on metal transport obtained in heterologous systems, and phenotypic changes linked to iron deficiency, presents a convincing case. Given that metal acquisition remains largely uncharacterized in Toxoplasma, this manuscript provides an important first step in identifying a metal transporter in these parasites, and the data presented are generally convincing and insightful.

      Comments on revisions:

      The revised manuscript has successfully addressed all of the key points raised in the initial review. Notably, the metal transport experiments in Xenopus oocytes now provide compelling evidence supporting the role of ZFT function. I congratulate the authors on their efforts and have no further concerns to raise.

    2. Reviewer #2 (Public review):

      Summary:

      The intracellular pathogen Toxoplasma gondii scavenges metal ions such as iron and zinc to support its replication; however, mechanistic studies of iron and zinc uptake are limited. This study investigates the function of a putative iron and zinc transporter, ZFT. In this paper, the authors provide evidence that ZFT mediates iron and zinc uptake by examining the regulation of ZFT expression by iron and zinc levels, the impact of altered ZFT expression on iron sensitivity, and the effects of ZFT depletion on intracellular iron and zinc levels in the parasite. The effects of ZFT depletion on parasite growth are also investigated, showing the importance of ZFT function for the parasite.

      Strengths:

      A key strength of the study is the use of multiple complementary approaches to demonstrate that ZFT is involved in iron and zinc uptake. The heterologous expression of ZFT in a Xenopus oocyte system, where ZFT was shown to transport iron and zinc, is an important addition to the study. The authors also build on their finding that loss of ZFT impairs parasite growth by showing that ZFT depletion induces stage conversion and leads to defects in both the apicoplast and mitochondrion.

      Weaknesses:

      The inclusion of the data showing iron and zinc transport when ZFT is expressed in a Xenopus oocyte system alleviated one of the main weaknesses of the original paper - the lack of direct biochemical evidence that ZFT acted as an iron transporter.

    3. Reviewer #3 (Public review):

      Summary:

      Aghabi et al. set out to characterize a T. gondii transmembrane protein with a ZIP domain, termed ZFT. The authors investigate the consequences of ZFT downregulation and overexpression for parasite fitness. Downregulation of ZFT causes defects in the parasite's endosymbiotic organelles, the apicoplast and the mitochondrion. Specifically, lack of ZFT causes a decrease in mitochondrial respiration, consistent with its role as an iron transporter. This impact on the mitochondria appears to trigger partial differentiation to bradyzoites. The authors furthermore demonstrate that expression of TgZFT can rescue a yeast mutant lacking its zinc transporter and perform an array of direct metal ion measurements including X-ray fluorescence microscopy and inductively coupled plasma mass spectrometry (ICP-MS). These reveal reduced metal ions in parasites depleted in ZFT. In the manuscript's revision, the authors performed additional transport assays in Xenopus oocytes, providing further evidence for the transporter trafficking iron. Overall, the data by Aghabi et al. convincingly support that ZFT is a major metal ion transporter in T. gondii, importing iron and zinc for diverse essential processes.

      Strengths:

      This study's strength lies in the thorough characterization of the transporter. The authors combine a number of techniques to measure the impact of ZFT depletion, ranging from the direct measurement of metal ions to determining the consequences for the parasite's metabolism (mitochondrial respiration), as well as performing a yeast mutant complementation and transport assays in Xenopus oocytes expressing the T. gondii protein. This work is very thorough and clearly presented, leaving little doubt about this protein's function.

      Weaknesses:

      None. The authors have addressed all my previous queries/ concerns.

    1. Reviewer #1 (Public review):

      Summary and Strengths:

      This manuscript presents a thoughtful and well-executed analysis of how S. aureus adapts to disulfide stress using a redox-sensitive regulator, Spx, as a lynchpin to coordinate nutrient uptake, redox balance, and growth. The work is strengthened by a systematic and complementary experimental approach that combines genetic, biochemical, and physiological measurements. The authors carefully test alternative explanations and build a coherent model linking stress sensing to downstream metabolic consequences. Several results, particularly those connecting cysteine uptake to growth defects, provide convincing support for the proposed trade-off. Overall, the authors largely achieve their aims, and the evidence generally supports the central conclusions. The conceptual framework and experimental approaches should be of broad interest to researchers studying S. aureus physiology and pathogenesis and to those studying bacterial stress responses and metabolic trade-offs.

      Weaknesses:

      Clarifying several interpretive points would further strengthen confidence in the proposed model. Some conclusions rely on data presentations or experimental designs that are not immediately clear to the reader. In particular, aspects of the protein stability analysis, global regulatory comparisons, and assays linking cysteine uptake to iron limitation would benefit from clearer justification and more precise interpretation. In addition, certain conclusions could be more carefully framed to reflect partial rather than complete rescue effects.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript titled "Activation of the Spx redox sensor counters cysteine-driven Fe(II) depletion under disulfide stress" by Hall and colleagues describes that an active redox switch is required for surviving under the diamide-induced disulfide stress. Furthermore, the SpxC10A mutant exhibits transcriptional dysregulation of genes involved in thiol maintenance and disulfide repair. The authors further demonstrate a role for Spx in regulating the uptake of L-cysteine, which otherwise leads to the chelation of intracellular iron and thus the repression of growth.

      Strengths:

      The authors demonstrate that the SpxC10A mutant accumulates high levels of thiols, leading to the chelation of intracellular iron and subsequent repression of the SpxC10A mutant's growth.

      Weaknesses:

      The authors did not show a direct regulation of L-cysteine uptake through CymR.

    3. Reviewer #3 (Public review):

      Summary:

      The paper from Hall et al. reports the effects of an altered function spx allele on the physiology of S. aureus. Since Spx is essential in this organism, the authors compare WT with a spx C10A allele that retains Spx functions that are independent of the formation of a C10-C13 disulfide. However, the major role of Spx in maintaining disulfide homeostasis in this organism appears to be reduced by this mutation, including a reduction (relative to WT) in the DIA-induction of thioredoxin, thioredoxin reductase, and BSH biosynthesis and reduction enzymes.

      Strengths:

      Based on a wide range of studies, the authors develop a model in which Spx is required for adaptation to disulfide stress, and this adaptation involves (in part) induction of both cystine/Cys uptake and the Fur regulon. Overall, the results are compelling, but further efforts to clarify the presentation will aid readers in being able to follow this very complicated story.

      Weaknesses:

      (1) More details are needed on how relative growth is defined and calculated (e.g., line 145 and Figure 1C). The raw data (growth curves) should be included when reporting relative growth so that readers can see what changed (lag, growth rate, final OD?). Later in the paper, the authors refer to "the diamide-induced growth delay of the spxC10A mutant" (line 379), but this is not apparent from the presented data.

      (2) Are the spx C10A, spx C13A, and spx C10A,C13A all really equivalent? In all cases, the Spx protein is presumably made (as confirmed for C10A in panel 1D). However, the only evidence to suggest that they are equivalent is the similar growth effects in panel 1C, and (as noted above), this data presentation can mask differences in how the mutations affect protein activity.

      (3) Figure 1D and Figure 1 Supplement 2 report results related to the effect of diamide treatment on protein half-life (t1/2). Only single results are shown for both panels, and the conclusions do not seem to be statistically robust. For example, Figure 1 Supplement 2 concludes that Spx C10A has a t1/2 of 3.38 min (this should be labeled correctly in the Figure legend as the red line) and WT Spx a t1/2 of 8.69 min. However, Figure 1D suggests that the protein levels at time 0 may not be equivalent, and this is lost in the data processing. Indeed, there are significant differences in Spx levels between time 0 - and + DIA, which is curious. Further, the authors' conclusion relies very heavily on line-fitting that includes a final point that has very low signal intensity (as judged from Figure 1D) and therefore is likely the least reliable of all the data. It might be worth showing curve fitting for multiple gels. Regardless of the overfitting of the data, the general conclusion that Spx is partially stabilized against proteolysis by ClpXP, and that the C10A mutant is reduced in stabilization, is probably correct.

      (4) Figure 2 concludes that despite differences in the mRNA profiles between WT and spx C10A after 15 min. of DIA treatment, the overall level of responsiveness of the bacillithiol pool is unchanged. The authors find it "surprising" that the BSH pool responds normally despite some differences in gene expression. This is not surprising. The major events visualized in panel 2D are the chemical oxidation of BSH to BSSB and, presumably, the re-reduction by Bdr(YpdA). While it is seen that BSH synthesis (bshC) and ypdA expression may be less induced by DIA in the C10A mutant (2C), there is no evidence that the basal levels are different prior to stress. Therefore, the chemical oxidation and enzymatic re-reduction might be expected to occur at similar rates, as observed.

      (5) Line 215. For the reason stated above, there is no reason to invoke Cys uptake as needed for the reduction of BSSB. Further, since CySS (presumably an abbreviation for cystine) is imported, this itself can contribute to disulfide stress.

      (6) Line 235. Following on the above point, "diamide-induced disulfide stress increased L-CySS uptake in the spxC10A mutant to re-establish the BSH redox equilibrium." This is counterintuitive since L-CySS is itself a disulfide and is thought to be reduced to 2 L-Cys in cells by BSH (leading to an increase in BSSB, not a reduction). Is there a known cystine reductase? Could cystine or L-Cys be affecting gene regulation (e.g., through CymR or Spx or ?)? Cystine can also lead to mixed disulfide formation (e.g., could it modify Spx on C13?).

      (7) Line 247. "a functional Spx redox switch allows S. aureus to avoid this trade-off and maintain thiol homeostasis without excessive L-CySS uptake." Can the authors expand on how this is thought to work? Does Spx normally affect cystine uptake? I thought this was CymR? I am not following the logic here.

      (8) Line 258. "The fur mutant, which is known to accumulate iron...". My understanding is that fur mutant strains typically have higher bioavailable (free) Fe pools. This is seen in E. coli, for example, using EPR methods. However, they also often have lower total Fe due to the iron-sparing response, which represses the expression of abundant, Fe-rich proteins. Please provide a reference that supports this statement that in S. aureus fur mutants have higher total iron per cell.

      (9) Figure 4. For the reasons stated above (point 1), it is hard to interpret data presented only as "Rel. Growth". Perhaps growth curve data could be included in a supplement.

      (10) The interpretation of Figure 4 is complicated. It is not clear that there is necessarily a change in bioavailable Fe pools, although it does seem clear that Fe homeostasis is perturbed. It has been shown that one strong effect of DIA on B. subtilis physiology is to oxidize the BSH pool to BSSB (as shown also here), and this leads to a mobilization of Zn (buffered by BSH). Elevated Zn pools can inactivate some Fe(II)-dependent enzymes, which could account for the rescue by Fe(II) supplementation. Zn(II) can also dysregulate PerR and likely Fur regulons.
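
      The concern in point (3) about the low-intensity final time point can be illustrated with a small sketch. It assumes single-exponential decay fitted by ordinary least squares on log-transformed band intensities, which may differ from the authors' fitting procedure; the time points and intensities are synthetic, not values from the manuscript.

```python
import math


def fit_half_life(times, intensities):
    """Estimate t1/2 from a least-squares line through ln(intensity) vs time."""
    ys = [math.log(v) for v in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx  # equals -k for I(t) = I0 * exp(-k * t)
    if slope >= 0:
        raise ValueError("no decay detected; half-life is undefined")
    return math.log(2) / -slope


# Synthetic data: ideal decay with an 8 min half-life (hypothetical values).
times = [0, 2, 4, 8, 16]
clean = [100 * 0.5 ** (t / 8) for t in times]

# Same data, but the faint final band is under-read by a factor of two.
noisy = clean[:-1] + [clean[-1] * 0.5]

t_half_all = fit_half_life(times, noisy)             # final point included
t_half_trim = fit_half_life(times[:-1], noisy[:-1])  # final point dropped
```

      In this synthetic case, a single mis-measured late point shortens the estimated half-life noticeably, while the fit without it recovers the 8 min used to generate the data. Repeating such a fit across replicate gels, with and without the last time point, would show how robust the reported t1/2 values are.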

    1. Reviewer #1 (Public review):

      Summary:

      This study uses a novel DNA origami nanospring to measure the stall force and other mechanical parameters of the kinesin-3 family member, KIF1A, using light microscopy. The key is to use SNAP tags to tether a defined nanospring between a motor-dead mutant of KIF5B and the KIF1A to be interrogated. The mutant KIF5B binds tightly to a subunit of the microtubule without stepping, thus creating resistance to the processive advancement of the active KIF1A. The nanospring (NS) is conjugated with 124 Cy3 dyes, which allows it to be imaged by fluorescence microscopy. Acoustic force spectroscopy was used to measure the relationship between the extension of the NS and force as a calibration. Two different fitting methods are described to measure the length of the extension of the NS from its initial diffraction-limited spot. By measuring the extension of the NS during an experiment, the authors can determine the stall force. The attachment duration of the active motor is measured from the suppression of lateral movement that occurs when the KIF1A is attached and moving. There are numerous advantages of this technology for the study of single molecules of kinesin over previous studies using optical tweezers. First, it can be done using simple fluorescence microscopy and does not require the level of sophistication and expense needed to construct an optical tweezer apparatus. Second, the force that is experienced by the moving KIF1A is parallel to the plane of the microtubule. This regime can be achieved using a dual-beam optical tweezer set-up, but in the more commonly used single-beam set-up, much of the force experienced by the kinesin is perpendicular to the microtubule. Recent studies have shown markedly different mechanical behaviors of kinesin when interrogated by the two different optical tweezer configurations. The data in the current manuscript are consistent with those obtained using the dual-beam optical tweezer set-up. In addition, the authors study the mechanical behavior of several mutants of KIF1A that are associated with KIF1A-associated neurological disorder (KAND).

      Strengths:

      The technique should be cheaper and less technically challenging than optical tweezer microscopy to measure the mechanical parameters of molecular motors. The method is described in sufficient detail to allow its use in other labs. It should have a higher throughput than other methods.

      Weaknesses:

      The experimenter does not get a "real-time" view of the data as it is collected, which you get from the screen of an optical tweezer set-up. Rather, you have to put the data through the fitting routines to determine the length of the nanospring in order to generate the graphs of extension (force) vs time. No attempts were made to analyze the periods where the motor is actually moving to determine step-size or force-velocity relationships.

      Comments on revisions:

      I am satisfied with the revision made by the authors in response to my first round of criticisms.

    2. Reviewer #2 (Public review):

      Summary:

      This work is important in my view because it complements other single-molecule mechanics approaches, in particular optical trapping, which inevitably exerts off-axis loads. The nanospring method has its own weaknesses (individual steps cannot be seen), but it brings new clarity to our picture of KIF1A and will influence future thinking on the kinesin-3 family and on kinesins in general.

      Strengths:

      By tethering single copies of the kinesin-3 dimer under test via a DNA nanospring to a strong binding mutant dimer of kinesin-1, the forces developed and experienced by the motor are constrained into a single axis, parallel to the microtubule axis. The method is imaging-based which should improve accessibility. In principle, at least, several single-motor molecules can be simultaneously tested. The arrangement ensures that only single molecules can contribute. Controls establish that the DNA nanospring is not itself interacting appreciably with the microtubule. Forces are convincingly calibrated and reading the length of the nanospring by fitting to the oblate fluorescent spot is carefully validated. The excursions of the wild type KIF1A leucine zipper-stabilised dimer are compared with those of neuropathic KIF1A mutants. These mutants can walk to a stall plateau, but the force is much reduced. The forces from mutant/WT heterodimers are also reduced.

      Weaknesses:

      The tethered nanospring method has some weaknesses: it only allows the stall force to be measured when a stall plateau is reached, and thermal noise means that individual steps are not apparent. The nanospring does not behave like a Hookean spring; instead, a linearly increasing force is reported by exponentially smaller extensions of the nanospring under tension. The estimated stall force for KIF1A (3.8 pN) is in line with measurements made using three-bead optical trapping, but those earlier measurements were of a limiting termination (detachment) force rather than of a stall plateau.

      Comments on revisions:

      The authors have successfully addressed my previous criticisms.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Blanco-Ameijeiras et al. present an organoid-based model of the caudal neural tube that builds upon established principles from embryonic development and prior organoid work. By systematically testing and refining signaling conditions, the authors generate caudal progenitor populations that self-organize into neuroepithelia with molecular features consistent with secondary neurulation. Bulk-RNA sequencing supports the emergence of caudal neural identities, and the authors further examine cellular features such as apico-basal polarity and interkinetic nuclear migration. Finally, they provide evidence for a conserved, YAP-dependent mechanism of tube formation specific to secondary neurulation. The manuscript provides valuable methodological resources, including troubleshooting guidance that will be especially useful for the field. While this work represents a significant advance toward modeling human spinal cord development - particularly the process of secondary neurulation - the claims of complete caudalization and full AP-axis representation require additional experimental support and clarification.

      Strengths:

      (1) Methodological clarity and transparency: The first figure and accompanying text provide an exemplary explanation of protocol optimization and troubleshooting. This transparency - showing approaches that failed as well as those that succeeded - sets a high standard for reproducibility and will be highly beneficial to laboratories aiming to adopt or build upon this model.

      (2) Testing across multiple cell lines: Multiple hPSC and hiPSC lines were evaluated, strengthening the robustness and generalizability of the reported protocol.

      (3) Biological relevance: The focus on secondary neurulation fills a notable gap in current human organoid models of spinal cord development. The identification of YAP-dependent mechanisms in tube formation is a valuable insight with potential translational relevance.

      (4) Resource creation: The detailed parameters and signaling regimes will serve as a resource for the spinal cord and organoid communities.

      Weaknesses:

      (1) The manuscript over-interprets bulk RNA-seq data to make strong claims on the organoid AP patterning and caudalization. Bulk sequencing provides population-level averages and cannot confirm that individual organoids represent discrete AP levels. To support claims of generating every AP identity, the authors must perform staining or in situ hybridization for HOX genes on individual organoids. Further, the current interpretation of CDX2 as marking "very distal" identity is inaccurate in vitro; CDX2 marks caudal progenitors across the spinal cord axis. The language should be revised accordingly.

      (2) The claim of being the first organoid system to model secondary neurulation overlooks prior work showing HOXC9 in human organoids (Xue et al., Nature 2024; Libby et al., Development 2021), which would reflect the beginning of secondary neurulation. While this system may indeed be the first isolated secondary neurulation organoid model that expresses HOXD9/10 - a meaningful advance - bulk RNA-seq alone is insufficient to support the exclusivity of this claim. Additional single-organoid-level spatial analyses (via immunofluorescence or in situ hybridisation) and frequency quantification of regional identities are required to fully characterize the system.

      (3) Similarly, as written, there are overstatements taken from the bulk RNA sequencing to determine dorsal-ventral identity. Although dorsal markers are present, the dataset also contains ventral-associated genes (PAX6, SP8, NKX6-1, NKX6-2, PRDM12). To claim a "dorsal-only" identity, the authors should perform PAX7 immunostaining to demonstrate dorsalization of the entire organoid tissue.

      (4) The studies identifying YAP as a key driver of lumen fusion in Figure 6 are important and should be extended to the apical organoid system to demonstrate that this is truly a feature of secondary neurulation.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Blanco-Ameijeiras and colleagues present the use of stem cells to create human spinal cord organoids that recapitulate anterior-posterior identity, with a large focus on posterior fates. In particular, the authors show robust specification of a transcriptional landscape that reflects certain aspects of anterior-posterior spinal cord development.

      Recapitulation of spinal cord development is essential to understand the fundamentals of developmental defects in a systematic manner. This work provides a broad approach to test certain aspects of neural tube morphogenesis, particularly posterior and dorsal identities. Perhaps the shorter protocol is an interesting upgrade for current standards, and the mechanical interpretation provides good proof of concept work that aligns with the need to better understand neural tube mechanobiology.

      Strengths:

      The manuscript addresses a major gap by focusing on posterior spinal cord identity and secondary neurulation, a phase that is less well captured by existing neural tube organoid models (although some do recapitulate that). The manuscript situates the approach within vertebrate development and human embryology.

      Morphometric quantifications are well described and provide a dynamic, cell-level interpretation, which is a true strength of the work. This matters for developing metrics that can later be used to compare modulations and pathway disruption.

      The protocols are well described and documented.

      Weaknesses:

      Some key data lack proper quantification to robustly support the claims. For example, it is not clear how many organoids in total are counted in Figure 1D to derive the % of organoids expressing certain markers (e.g. SOX2 or BRA).

      Some claims are overstated. In the manuscript, the organoids show primarily dorsal and posterior identities under the current conditions, yet the discussion sometimes reads as if a more complete dorsoventral recapitulation is achieved. Therefore, one can either demonstrate ventral patterning (e.g., SHH / FOXA2) or reduce the claims about spinal cord identity, which, given the results, are more specific to a particular region.

      The mention of anterior organoids seems to distract the reader from the important work, which primarily focuses on the posterior identity. Further, it is unclear why SOX2 identity is reduced by Day 7 in Figure 1D. Since SOX2 in the manuscript is considered a neural marker (although it is also a pluripotency marker, along with NANOG, etc.), a further explanation should be provided. The authors should also test for the presence of PAX6, which is one of the earliest neuroectoderm markers in humans (Zhang X. et al., Cell Stem Cell 2010).

      The authors position the work as a substantial addition to the field. The work is very much welcomed; however, some claims align with an interpretation that leads the readers to understand a novelty that is beyond the work presented here. For example, in certain instances in the intro, the manuscript conveys that this work consists of the first recapitulation of spinal cord fates anterior or posterior, while other works (Rifes P. Nature Cell Biology 2020, Xue X. Nature 2024) recapitulate dorsoventral and anterior-posterior patterning and identity (albeit not of secondary neurulation) through controlled gradients of WNT and RA activity. To clearly position the importance of this work, the intro should focus on secondary neurulation and posterior identities.

      In a similar fashion, the claim that "Importantly though, to our knowledge these are the first neural organoids exhibiting a robust spinal cord transcriptome identity" is difficult to understand, given that other neural tube organoid systems (including those with spinal cord identities) have been exhaustively profiled at the single-cell level (Rifes P., Xue X., Abdel Fattah A.). Further explanation is therefore needed.

      The mechanical angle is important and adds to the large body of research that traces NT morphogenesis to mechanics. However, the YAP localization images can be much improved. Lower magnification images are needed to show the entire organoid to robustly convince the reader of the correct and varying localization of the YAP protein. The authors should also check for YAP-associated genes in their bulk RNA sequencing.

      The quantification of the YAP analysis in a total of 23 and 18 cells in the two conditions and in 7 organoids is by no means enough to draw a conclusion about YAP localization, and an increase in the number of cells is needed. Moreover, the use of dasatinib as an inhibitor for YAP is great, but there is no evidence shown that in this culture system, the inhibitor actually inhibits YAP. As such, IF images are required to confirm cytosolic YAP. Additionally, the authors can try other inhibitors (such as verteporfin) since most inhibitors are broadband.

      Given the mechanically oriented conclusions, other relevant works should be mentioned: posteriorized and ventralized neural tube organoids have been generated using RA and SHH activation and mechanically stimulated via actuation, for example in work from the Ranga lab (Nature Comm. 2021/2023). Although not strictly related to YAP, the molecular profiling, mechanical stimulation, lumen measurements, and NTD-like phenotype obtained with mutated PCP genes reported there are relevant, since the current work adds important aspects through its YAP analysis.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Blanco-Ameijeiras and collaborators describe the 3D differentiation of human pluripotent stem cells into the posterior spinal cord. The authors first test the exposure of different combinations of extrinsic signals to generate human neural organoids with distinct antero-posterior identities, as shown by bulk transcriptome analysis. They show that neural organoids, whether anterior or posterior, display tissue architecture, organisation and dynamics resembling the in vivo situation. Increasing the size of initial cell aggregates leads to the formation of a single lumen through a multi-lumen stage and a process of cell intercalation, mimicking the situation that they recently described for chick secondary neurulation (Gonzalez-Gobartt et al. Dev Cell. 2021 PMID: 33878300). The authors go on to show that, as in chick, YAP is involved in the resolution of multiple lumens into a single lumen. They conclude that their human organoid approach faithfully models human secondary neurulation, which may be instrumental in unravelling the mechanisms of human neural tube defects.

      Strengths:

      Overall, this is an important study demonstrating that lumen formation in human spinal organoids recapitulates key aspects of secondary neurulation observed in animal models. This organoid approach may be instrumental in unravelling the mechanisms of human neural tube defects.

      Weaknesses:

      The significance of the findings is tempered by several limitations. While the authors show convincing evidence that organoids undergo lumen formation with similar morphological, cellular and molecular features as seen in chick in their previous work (Gonzalez-Gobartt et al. Dev Cell. 2021 PMID: 33878300), whether this is linked to their caudal spinal cord identity is unclear.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors have performed tissue-specific ribosome pulldown to identify gene expression (translatome) differences in the anterior vs posterior cells of the C. elegans intestine. They have performed this analysis in fed and fasted states of the animal. The data generated will be very useful to the C. elegans community, and the role of pyruvate shown in this study will result in interesting follow-up investigations.

      However, several strong claims made in the study are solely based on in silico predictions and are not supported by experimental evidence.

      Strengths:

      Several studies in the past have predicted different functions of the anterior (INT1) vs posterior (INT2-9) epithelial cells of the C. elegans intestine based on their anatomy and ultrastructure, but detailed characterization of differences in gene expression between these cell types (and whether indeed these are different 'cell types') was lacking prior to this study. The genes and drivers identified to be exclusively expressed in the anterior vs posterior segments of the intestine will be very helpful to selectively modulate different parts of the C. elegans intestine in future studies.

      Another strength of this study is the careful experimental design to test how the anterior vs posterior cell types of the intestine respond differently to food deprivation and recovery after return to food. These comparisons between 'states' of a cell in different physiological conditions are difficult to pick up in single-cell analyses due to low sequencing depth, which can fail to identify subtle modulation of gene expression.

      The TRAP-associated bulk RNA-seq approach used in this study is more suitable for such comparisons and provides additional information on post-transcriptional regulation during metabolic stress.

      A key finding of this study is that pyruvate levels modulate the translation state of anterior intestinal cells during fasting. Characterization of pyruvate metabolism genes, especially of the enzymes involved in its mitochondrial breakdown, provides novel insights into how gut epithelial cells respond to the acute absence of food.

      Weaknesses:

      Unlike previous TRAP-seq studies (PMID: 30580965, 36044259, 36977417) that reported sequencing data for both input and IP samples, this study only reports the sequencing data for IP samples. Since biochemical pulldowns are variable across replicates, it is difficult to know if the observed differences between different conditions are due to biological factors or differences in IP efficiency. More importantly, since two different TRAP lines were utilized in this study and a large proportion of the results focus on the differences between the translational profiles of INT1 vs INT2-9 cells, it is essential to know if the IP worked with similar efficiency for both TRAP strains that likely have different expression levels of the HA-tagged ribosomal protein. One way to estimate this would be to perform qRT-PCR of genes that are known to be enriched in all intestinal cells and determine whether their fold-enrichment over housekeeping genes (normalized to input) is similar in INT1 vs INT2-9 TRAP strains and across the fed vs fasted conditions. The authors, in fact, mention variability across biological replicates, due to which certain replicates were excluded from their WGCNA analysis.
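
      The qRT-PCR check suggested above can be sketched as a standard ΔΔCt calculation. The function name, the Ct values, and the assumption of 100% amplification efficiency (a factor of 2 per cycle) are illustrative only and do not come from the manuscript; this is a minimal sketch, not the authors' analysis.

```python
def trap_fold_enrichment(ct_ip_target, ct_input_target, ct_ip_ref, ct_input_ref):
    """Fold-enrichment of a target gene over a housekeeping (reference) gene
    in the IP sample, normalized to input, using the ddCt method.

    Assumes ~100% amplification efficiency, i.e. one cycle = 2-fold change.
    """
    d_target = ct_ip_target - ct_input_target  # IP vs input, intestinal gene
    d_ref = ct_ip_ref - ct_input_ref           # IP vs input, housekeeping gene
    return 2.0 ** -(d_target - d_ref)


# Hypothetical Ct values for a pan-intestinal gene in each TRAP strain.
int1_enrichment = trap_fold_enrichment(22.0, 25.0, 24.0, 23.0)
int2_9_enrichment = trap_fold_enrichment(21.0, 25.0, 24.0, 23.0)

# A ratio far from 1 would suggest unequal IP efficiency between the strains.
efficiency_ratio = int1_enrichment / int2_9_enrichment
print(int1_enrichment, int2_9_enrichment, efficiency_ratio)
```

      With these made-up values the INT2-9 pulldown would be twice as enriched as the INT1 pulldown, which is the kind of imbalance that could be mistaken for a biological difference if input samples are not reported.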

      It appears that GFP expression is also detectable in INT2 (in addition to strong expression in INT1 in Fig.1A). Compared to INT3-9, which looks red, INT2 cells appear yellow, suggesting that the expression patterns of the two TRAP drivers are not mutually exclusive, which changes the interpretation of many of the results described in the study.

      Some parts of the study overemphasize the differences between the INT1 vs INT2-9 cell types, which is a biased representation of the results. For example, the authors specifically point out that 270 genes are differentially expressed in opposite directions in INT1 vs INT2-9 cell types during acute (30 min) fasting without mentioning the 1,268 genes that are differentially expressed in the same direction. They also do not mention here that 96% of the genes are differentially expressed in the same direction in INT1 and INT2-9 cell types after prolonged (180 min) fasting, suggesting that the divergent translational responses of these cell types are only observed in the first 30 minutes of food deprivation. Similar results have also been reported for the effect of fasting on locomotory and feeding behaviors, where 30 min of fasting produces more variable effects, which become more consistent after longer periods of fasting (PMID: 36083280). Hence, the effects of brief food deprivation should be interpreted with caution.
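
      To put the two gene counts quoted above on the same scale, the short calculation below expresses them as fractions. It assumes, for illustration only, that the 270 and 1,268 genes together make up the full set of genes differentially expressed in both cell types at 30 min; the variable names are hypothetical.

```python
# Gene counts quoted in the review for acute (30 min) fasting.
opposite_direction = 270   # changed in opposite directions in INT1 vs INT2-9
same_direction = 1268      # changed in the same direction in both cell types

# Assumption: these two groups partition the shared differentially expressed set.
total_shared = opposite_direction + same_direction
frac_opposite = opposite_direction / total_shared
frac_same = same_direction / total_shared

print(f"opposite: {frac_opposite:.1%}, same: {frac_same:.1%}")
```

      Under that assumption, roughly one gene in six responds divergently at 30 min, compared with the 4% implied by the 96% concordance reported after 180 min.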

      Many of the interpretations of this study primarily rely on pathway enrichment analyses, which are based on the known function of genes. The function of uncharacterized genes that were found to be differentially expressed in INT1 vs INT2-9 cell types, e.g., the ShKT proteins, was not explored in this study. In addition, overreliance on pathway enrichment tools (instead of functional validation) has resulted in several conflicting findings. For example, one of the main messages of this study is that INT1 cells specialize in immune and stress response in response to fasting, which relies on pathway analysis in Figs 5E and 5F. However, pathway analysis at a different time point (shown in Figure S5A) indicates that INT2-9 cells show a much stronger increase in translation of stress and pathogen-responsive genes compared to INT1 cells. Hence, some of the results should be interpreted as different translational effects in INT1 vs INT2-9 cells after different lengths of food deprivation, without making broad claims about selective pathways being affected only in specific cell types.

      The authors have compared their TRAP-seq results with genes enriched in the anterior and posterior intestine clusters from a previously published whole-animal adult scRNA dataset (PMID: 37352352). They claim that their TRAP-seq results are in agreement with the findings of the scRNA study. However, among the 10 genes from the 'posterior intestine' scRNA cluster in Fig.S1E, six are downregulated in the INT1 vs INT2-9 comparison, while four are upregulated. Hence, there is no clear agreement between the two studies in terms of the top enriched genes in the anterior vs posterior intestine, which should be considered for cross-study comparisons in the future.
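
      One way to make the "six versus four" observation concrete is an exact two-sided sign test against a 50/50 split. This is only a sketch: it treats the ten genes as independent, the function name is hypothetical, and ten genes are too few for a firm conclusion either way.

```python
from math import comb


def two_sided_sign_test(k, n):
    """Exact two-sided binomial (sign) test p-value for k successes in n trials
    under the null hypothesis that each direction is equally likely (p = 0.5)."""
    probs = [comb(n, i) / 2 ** n for i in range(n + 1)]
    p_observed = probs[k]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    return sum(p for p in probs if p <= p_observed + 1e-12)


# 6 of the 10 'posterior intestine' genes are lower in INT1 vs INT2-9.
p_value = two_sided_sign_test(6, 10)
print(round(p_value, 3))
```

      A p-value this large means a 6:4 split is fully compatible with chance, which is consistent with the reviewer's reading that the two datasets do not clearly agree on these genes.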

      The authors describe in the manuscript that they have performed INT1-specific RNAi for two C-type lectin genes that are upregulated during fasting. Due to a recent expansion of C-type lectin genes in C. elegans, there is a high chance of off-target effects of RNAi that is designed for members of this gene family. More trustworthy results could have been obtained using CRISPR-based loss-of-function alleles for these genes, one of which is publicly available. Also, the authors do not provide any explanation for why knockdown of these stress-response genes, which are activated in INT1 cells in response to food deprivation, results in improved resistance to pathogens. This, in fact, suggests a role of INT1 cells in increasing pathogen susceptibility, and not pathogen resistance, during food deprivation.

      Many of the studies in this field (e.g., references 2-4 in this article) have investigated the effects of food deprivation ranging from 4 hr to 24 hr, which results in activation of starvation responses in C. elegans. In contrast, the authors have used shorter time periods of fasting (30 min and 180 min), and most of their follow-up experiments have used 30 min of food deprivation. Previous work has shown that the effects of food deprivation can either accumulate over time (i.e., the effect gets stronger with longer food deprivation) or can be transient (i.e., only observed briefly after removal of food and not observed during long-term food deprivation). Starvation-induced transcription factors such as DAF-16/FoxO and HLH-30 show strong translocation to the nucleus only after 30 min of fasting. Though gene expression changes in all stages of food deprivation are of biological relevance, the authors have missed the opportunity to explore whether increased INS-7 secretion from the anterior intestine is dependent on these starvation-induced transcription factors (which can be easily tested using loss-of-function alleles) or is due to other fast-acting regulatory mechanisms induced due to the absence of food contents in the gut lumen. A previous study (PMID: 40991693) has shown that DAF-16 activation during prolonged starvation shuts down insulin peptide secretion from the intestinal epithelial cells. Hence, it is not clear if increased INS-7 secretion is only a feature of short-term food deprivation or is also a signature of long-term starvation (e.g., at 8 hr or 16 hr timepoints). Since most of the INS-7 secretion data in this study are for 30 min of fasting, it remains unknown whether the discovered regulators of INS-7 secretion can be generalized for extended food deprivation that triggers major metabolic changes, such as fat loss (e.g., conditions shown in Figure 1D).

      Two previous studies (PMID: 18025456, 40991693) have shown a strong reduction in the expression of ins-7 in the anterior intestine using GFP-based reporters (both promoter fusions and endogenous CRISPR-generated) and in whole-animal RNA-seq data from starved animals. These results are in contrast to the increased INS-7 secretion from INT1 cells during fasting that is reported in this study. The authors here have reported that INS-7 translation is higher in INT1 compared to INT2-9 during fed, acute fasted, and chronic fasted conditions, but they have not shown whether INS-7 translation is upregulated during acute and chronic fasting in INT1 cells in their TRAP-seq analysis. Knowing whether increased INS-7 secretion during acute fasting is due to increased transcription, translation, or secretion of INS-7 is crucial to resolve the discrepancy between these studies.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors set out to understand whether the discrete segments of the C. elegans intestine were specialized to carry out distinct functions during an animal's exposure and adaptation to a fast-changing nutrient environment. To achieve this, the authors used a method called translating ribosome affinity purification (TRAP), which provides a snapshot of which genes are being translated into proteins (and therefore functionally prioritized by the animal) under different fasting and re-feeding conditions. By expressing the TRAP constructs in two distinct segments of the intestine (INT1 and INT2-9), the authors were able to identify how these segments responded to changing nutrient availability.

      Already under steady-state nutrient conditions, the authors found that INT1 and INT2-9 appeared to have different 'tasks', with INT1 expressing more immune- and stress-response-related genes. Exposing animals to different regimens of starvation and refeeding also showed marked differences between the intestinal segments, and the gene expression patterns in INT1 were consistent with INT1 cells playing an integrative role in linking nutrient cues to the secretion of insulin molecules that regulate fat metabolism with food intake. In summary, the data presented catalogue, for the first time, gene expression differences between two areas of the intestine suspected to play different roles and, through clever experiments, link these gene expression changes to responses to nutrient availability.

      Strengths:

      The data presented catalogue - for the first time and in a careful manner - gene expression differences between two areas of the intestine. They strongly support the presence of intriguing differences between two areas of the intestine in immune, metabolic, and stress-response regulation, and link these gene expression changes to the responses of these regions to nutrient availability.

      Weaknesses:

      The conclusions of this paper are mostly well-supported by data, but the relevance of the changing gene expression patterns could be better clarified and extended in the discussion.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, Liu and colleagues utilize TRAP-seq to profile the repertoire of actively translated mRNAs in different intestinal cell types (anterior INT1 vs. posterior INT2-9 cells) in C. elegans. A key goal of this study was to identify transcripts differentially expressed/translated between these intestinal cell subtypes in the context of animals being well fed or subjected to acute (30 minutes) or chronic (3 hours) starvation, followed by refeeding.

      The authors identify a number of differentially expressed genes across all of the conditions tested. They then provide an initial survey of the landscape of translatome changes through Weighted Gene Co-expression Network Analysis (WGCNA), and some high-level functional surveys via Gene Ontology (GO) term analysis and protein domain analysis. The authors validate the enriched expression patterns of some of their identified candidate genes using fluorescent promoter fusion reporters, confirming INT1-specific expression. The authors further implicate several other candidate genes in pathogen avoidance and in the response to nutritional cues by knocking them down specifically in INT1 cells by RNAi. Finally, the authors identify pyruvate as a major nutrient signal coming from the bacterial diet that suppresses the release of a key insulin peptide (INS-7), and identify some of the genes expressed in INT1 that are required for this response.

      Strengths:

      (1) Good use of and justification for TRAP-seq, because scRNA-seq would be difficult under the varied conditions used (starvation, refeeding).

      (2) The manuscript is generally clear to read, and the data are generally well-presented with good supporting data that includes replicates, sample sizes, error measurements, and associated statistics.

      (3) The dataset will be an interesting resource to mine for future studies focusing on mechanisms of how particular intestinal cell types respond to different environmental signals.

      Weaknesses:

      (1) A limitation of TRAP-seq, although powerful, is that only relative comparisons can be made between genotypes/conditions to identify differentially-expressed genes, rather than assessing whether a given gene is expressed at a certain level in a cell type under a certain condition. This limitation is due to the non-specific association of sticky RNA species with the beads during the immunoprecipitation step. This is a minor point, however, and the authors do a nice job of focusing their analysis on differentially expressed transcripts in the current study.

      (2) Another limitation of the current study is that the experiments testing the role of candidate genes identified by the profiling experiments do not delve deeper into providing a mechanistic understanding of the phenotypes being studied. At present, the results are thus viewed more as a genomics-based screen with some limited follow-up on interesting hits. However, this reviewer appreciates that, when placed in the context of the work presented, a presentation of the profiling data along with some validation is an excellent starting point for future mechanistic studies elaborating on these interesting candidates.

      Appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The main goal of the study was to survey the dynamic responses at the level of actively translated mRNAs of the INT1 vs INT2-9 cells in response to metabolic challenge.

      Overall, the authors use established methods to perform their genome-wide analysis, and the set of differentially regulated genes is enriched for expected molecular functions and forms coherent networks in anticipated pathways.

      The validation experiments (promoter::GFP fusion reporters, INT1-specific knockdowns of highly regulated genes) further corroborate the quality of the TRAP-seq datasets generated.

      I have a few points for the authors that would further strengthen this work:

      (1) The authors rightfully focus on the top differentially-regulated candidates, but it's unclear at present how far down their fold change list would lead to expression pattern validations. It would be useful to test a few more promoter::GFP fusion reporters at different enrichment/fold-change/statistical cutoffs.

      (2) Although the INT1-specific RNAi provides a convenient strategy for rapidly perturbing and testing genes of interest for phenotypes, independently validating the knockdowns with genetic mutants or, alternatively (if the genes are essential), degron alleles would strengthen the conclusions.

      Impact:

      The TRAP-seq data and list of differentially-expressed candidate genes will form an interesting set of high-priority candidates to study for their role in the reception and transduction of nutritional cues in response to food status and pathogens. These data will thus benefit the C. elegans community of researchers studying the mechanisms governing these phenomena.

    1. Reviewer #1 (Public review):

      Summary:

      The authors Hall et al. establish a purification method for snake venom metalloproteinases (SVMPs). By generating a generic approach to purify this divergent class of recombinant proteins, they enhance the field's accessibility to larger quantities of SVMPs with confirmed activity and, for some, characterized kinetics. In some cases, the recombinant protein displayed comparable substrate specificity and substrate recognition compared to the native enzyme, providing convincing evidence of the authors' successful recombinant expression strategy. Beyond describing their route towards protein purification, they further provide evidence for self-activation upon Zn2+ incubation. They further provide insights on how to design high-throughput screening (HTS) methods for drug discovery and outline future perspectives for the in-depth characterization of these enzyme classes to enable the development of novel biomedical applications.

      Strengths:

      The study is well-presented and structured in a compelling way. The purification strategy results in highly pure protein products, well characterized by size exclusion chromatography and SDS-PAGE and confirmed by mass spectrometry analysis. Further, a significant portion of the manuscript focuses on enzyme activity, thereby validating function. Particularly convincing is the comparability between recombinant vs. native enzymes; this is successfully exemplified by insulin B digestion. By testing the fluorogenic substrate, the authors provide evidence that their production method of recombinant protein can open up possibilities in HTS. Since their purification method can be applied to three structurally variable SVMP classes, this demonstrates the robust nature of the approach.

      Weaknesses:

      The universal applicability of the approach could be emphasized more clearly. The potential for this generic protocol for recombinant SVMP zymogen production to be adapted to other SVMPs is somewhat obscured by the detailed optimization steps. A general schematic overview would strengthen the manuscript, presented as a final model, to illustrate how this strategy can be extended to other targets with similar features. Such a schematic might, for example, outline the propeptide fusion design, including its tags, relevant optimizations during expression, lysis, purification (e.g., strategies for metal ion removal and maintenance of protease inactivity), as well as the controllable auto-activation.

      The product obtained from the purification protocol appears to be a heterogeneous mixture of self-activated and intact protein species. The protocol would benefit from improved control over the self-activation process. The Methods section does not indicate whether any attempt was made to remove residual metal ions during purification; such ions could cause premature activation. Additionally, it has not been discussed whether the shift to pH 8 in the purification process is necessary from the initial steps onwards, given that a lower pH would be expected to maintain enzyme latency.

      The characterization of PIII activity using the fluorogenic peptide effectively links the project to its broader implications for drug design. However, the absence of comparable solutions for PI and PII classes limits the overall scope and impact of the finding.

      Overall, the authors successfully purified active SVMP proteins of all three structurally diverse classes in high quality and provided convincing evidence throughout the manuscript to support their claims. The described method will be of use for a broader community working with self-activating and cytotoxic proteases.

    2. Reviewer #2 (Public review):

      Summary:

      The aim of the study by Hall et al. was to establish a generic method for the production of Snake Venom Metalloproteases (SVMPs). These have been difficult to purify in the mg quantities required for mechanistic, biochemical, and structural studies.

      Strengths:

      The authors have successfully applied the MultiBac system and describe with a high level of detail the downstream purification methods applied to purify the SVMP PI, PII, and PIII. The paper carefully presents the non-successful approaches taken (such as expression of mature proteins, the use of protease inhibitors, prodomain segments, and co-expression of disulfide-isomerases) before establishing the construct and expression conditions required. The authors finally convincingly describe various activity assays to demonstrate the activity of the purified enzymes in a variety of established SVMP assays.

      Weaknesses:

      The manuscript suffers from a lack of bottoming out and stringent scientific procedures in the methodology and the characterization of the generated enzymes.

      As an example, a further characterization of the generated protein fragments in Figure 3 by intact mass spectrometry would have aided in accurate mass determination rather than relying on SEC elution volumes against a standard. Protein shape and charge can affect migration in SEC. Also, the analysis of N-linked glycosylation demonstrates some reactivity of PIII to PNGase F, but fails to conclude whether one or more sites are occupied, or whether other types of glycosylation are present. Again, intact mass experiments would have resolved such issues.

      The activity assays in Figure 4 are not performed consistently, with kinetic assays and degradation assays performed for some, but not all, enzymes, and there is no Echis ocellatus comparison in Figure 4h. Overall, whilst not affecting the main conclusion, this leaves the reader with an impression of preliminary data being presented. For consistency, application of the same assays to all enzymes (high-grade purified) would have provided the reader with a fuller picture.

      Overall, the data presented demonstrates a very credible path for the production of active SVMPs for further downstream characterization. The generality of the approach to SVMPs from different snakes remains to be demonstrated by the community, but if generally applicable, the method will enable numerous studies with the aim of either utilizing SVMPs as therapeutic agents or to enable the generation of specific anti-venom reagents, such as antibodies or small molecule inhibitors.

    3. Reviewer #3 (Public review):

      Summary:

      The presented study describes the long journey towards the expression of members of the SVMP toxin family from snake venom, which are toxins of major importance in a snakebite scenario. Because their functional analysis has so far relied on challenging isolation, heterologous expression of the toxins offers a potential solution to some major obstacles hindering a better understanding of toxin pathophysiology. Through a series of laborious and elegantly crafted experiments, including the reporting of various failed attempts, the authors establish the expression of all three SVMP subtypes and prove their activity in bioassays. The expression is carried out as naturally occurring zymogens that autocleave upon exposure to zinc, which is a novel modus operandi for yielding fusion proteins and also sheds some new light on the potential mechanism that snakes use to activate enzymatic toxins from zymogenic preforms.

      Strengths:

      The manuscript draws from an extensive portfolio of well-reasoned and hypothesis-driven experiments that lead to a stepwise solution. The wet-lab data generated is outstanding, although not all experiments along this rocky road to victory were successful. A major strength of the paper is that, translationally speaking, it opens up novel routes for biodiscovery since a first reliable platform for expression of an understudied, yet potent toxin class is established. The discovered strategy to pursue expression as zymogens could see broad application in venom biotechnology, where several toxin types are pending successful expression. The work further provides better insights into how snake toxins are processed.

      Weaknesses:

      The manuscript contains several chapters reporting failed experiments, which makes it difficult to follow in places. The reporting of experimental details, especially sample sizes and replicates, could be optimised. At the time of writing, it remains unclear whether the glycosylations detected on the PIII SVMP could have an impact on the bioactivities measured, which is a major aspect, and future follow-ups should clarify this. Finally, the work, albeit of critical importance, would benefit from a more down-to-earth evaluation of its findings, as various persistent obstacles still need to be overcome.

      Major comments to the manuscript:

      (1) Lines 148-149: "indicating that expressing inactivated SVMPs could be a viable, although inefficient, approach". I think this text serves a good purpose to express some thoughts on the nature of how the current draft is set up. It is quite established that various proteases cause extreme viability losses to their expression host (whether due to toxicity, but surely also because of metabolic burden), which is why their expression as inactive fusion proteins is the default strategy in all cases I have thus far seen. I believe that, especially in venom studies, this is of importance given the increased toxicity often targeting cellular integrity, and especially here, because Echis are known to feed on arthropods at younger life history stages, making it very likely that some venom components are especially active against insects and other invertebrates. With that in mind, I would argue that exploring their production in inactive form is the obvious strategy one would come up with and not really the conclusion of a series of (well-conducted and scientifically sound!) experiments. For me, the insight of inactive expression is largely confirmatory of what is established, unless I miss something in the authors' rationale. If yes, it would be important to clarify that in the online version.

      (2) Line 173: Here, AlphaFold 3 was used, whereas in previous sections (e.g., line 153, line 210), it was AlphaFold 2. I suggest using one release across the manuscript.

      (3) Line 252-254: I fully agree, the PIII SVMP is glycosylated. Glycosylation is an important mediator of snake venom activity, and several works have described its importance in the field. This raises the question of which glycosylations have been introduced into the SVMP here, and whether they match those found in snakes. This is important as insects use thousands of N- and O-glycosylations to modulate the activity of their proteome, many of which are specific to insects. If some of these were integrated into the SVMP, this could have an impact on downstream bioassays and also antigenicity (the surface would be somewhat different from natural toxins, causing different selection).

      (4) General comment for the bioassays: It would be good to specify the replicates again and report the data, including standard deviations.

      Discussion:

      I think the data generated in the study is very valuable and will be instrumental for pushing the frontiers in SVMP research, but still I would like to see a bit of modesty in their discussion. As I have pointed out above, it is unclear which effect the glycosylations may have (i.e., are the glycosylations found reminiscent of natural ones?), despite their being functionally important. Also, yes, isolation of SVMPs is challenging, but the reality is that their expression is equally challenging, as evidenced by the heaps of presented negative data (with which I have no problems, I think reporting such is actually important). So far, the "generic" protocol has been used to express one member per structural class of Echis SVMP, but no evidence is provided that it would work equally well on other members from taxonomically more distant snakes (e.g., the pIII known from Naja oxiana). It is very likely, but at the time of writing, purely speculative. Lastly, the reality is also that the expression in insect cells can only be carried out by highly specialized labs (even in the expression world, as most laboratories work with bacterial or fungal hosts), whereas the isolation can be attempted in most venom labs. That said, production in insect cells also has economic repercussions as it will be very challenging to generate yields that are economically viable versus other systems, which is pivotal because the authors talk about bioprospecting and the toxins used in snakebite agent research. Again, I believe the paper is highly important and excellently crafted, but I think especially the discussion should see some refinement to address the drawbacks and to evaluate the paper's findings with more modesty.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Ampartzidis et al. report the establishment of an iPSC-derived neuroepithelial model to examine how mutations from spina bifida patients disrupt fundamental cellular properties that underlie neural tube closure. The authors utilize an adherent neural induction protocol that relies on dual SMAD inhibition to differentiate three previously established iPSC lines with different origins and reprogramming methods. The analysis is comprehensive and outstanding, demonstrating reproducible differentiation, apical-basal elongation, and apical constriction over an 8-day period among the 3 lines. In inhibitor studies, it is shown that apical constriction is dependent on ROCK and generates tension, which can be measured using an annular laser ablation assay. Since this pathway is dependent on PCP signaling, which is also implicated in neural tube defects, the authors investigated whether VANGL2 is required by generating 2 lines with a pathogenic patient-derived sequence variant. Both lines showed reduced apical constriction and reduced tension in the laser ablation assays. The authors then established lines obtained from amniocentesis, including 2 control and 2 spina bifida patient-derived lines. These remarkably exhibited different defects. One line showed defects in apical-basal elongation, while the other showed defects in neural differentiation. Both lines were sequenced to identify candidate variants in genes implicated in NTDs. While no smoking gun was found in the line that disrupts neural differentiation (as is often the case with NTDs), compound heterozygous MED24 variants were found in the patient whose cells were defective in apical-basal elongation. Since MED24 has been linked to this phenotype, this finding is especially significant.

      Some methodological details needed to evaluate the rigor and reproducibility of the study are missing.

      Major Comments:

      It is mentioned throughout the manuscript that 3 plates were evaluated per line. I believe these are independently differentiated plates. This detail is critical concerning rigor and reproducibility. This should be clearly stated in the Methods section and in the first description of the experimental system in the Results section for Figure 1.

      For the patient-specific lines - how many lines were derived per patient?

      Was the Vangl2 variant introduced by prime editing? Base editing? The details of the methods are sparse.

      Significance:

      This paper is significant not only for verifying the cell behaviors necessary for neural tube closure in a human iPSC model, but also for establishing a robust assay for the functional testing of NTD-associated sequence variants. This will not only demonstrate that sequence variants result in loss of function but also determine which cellular behaviors are disrupted.

    2. Reviewer #2 (Public review):

      Summary:

      The authors' work focuses on studying cell morphological changes during differentiation of hPSCs into neural progenitors in a 2D monolayer setting. The authors use genetic mutations in VANGL2 and patient-derived iPSCs to show that (1) human phenotypes can be captured in the 2D differentiation assay, and (2) VANGL2 in humans is required for neural contraction, which is consistent with previous studies in animal models. The results are solid and convincing, the data are quantitative, and the manuscript is well written. The 2D model they present successfully addresses the questions posed in the manuscript. However, the broad impact of the model may be limited, as it does not contain NNE cells and does not exhibit tissue folding or tube closure, as seen in neural tube formation. Patient-derived lines are derived from amniotic fluid cells, and the experiments are performed before birth, which I find to be a remarkable achievement, showing the future of precision medicine.

      Major comments:

      (1) Figure 1. The authors use F-actin to segment cell areas. Perhaps this could be done more accurately with ZO-1, as F-actin cables can cross the surface of a single cell. In any case, the authors need to show a measure of segmentation precision: segmented image vs. raw image plus a nuclear marker (DAPI, H2B-GFP), so we can check that the number of segmented cells matches the number of nuclei.

      (2) Lines 156-166. The authors claim that changes in gene expression precede morphological changes. I am not convinced this is supported by their data. Fig. 1g (epithelial thickness) and Fig. 1k (PAX6 expression) seem to have similar dynamics. The authors can perform a cross-correlation between the two plots to see which Δt gives maximum correlation. If Δt < 0, then it would suggest that gene expression precedes morphology, as they claim. Fig. 1j shows that NANOG drops before the morphological changes, but loss of NANOG is not specific to neural differentiation and therefore should not be related to the observed morphological changes.

      (3) Figure 2d. The laser ablation experiment in the presence of ROCK inhibitor is clear, as I can easily see the cell outlines before and after the experiment. In the absence of ROCK inhibitor, the cell edges are blurry, and I am not convinced the outline that the authors drew is really the cell boundary. Perhaps the authors can try to ablate a larger cell patch so that the change in area is more defined.

      (4) Figure 2d. Do the cells become thicker after recoil?

      (5) Figure 3. The authors mention their previous study in which they show that Vangl2 is not cell-autonomously required for neural closure. It will be interesting to study whether this is also the case in the present human model by using mosaic cultures.

      (6) Lines 403-415. The authors report poor neural induction and neuronal differentiation in GOSB2. As far as I understand, this phenotype does not represent the in vivo situation. Thus, it is not clear to what extent the in vitro 2D model describes the human patient.

      (7) The experimental feat to derive cell lines from amniotic fluid and to perform experiments before birth is, in my view, heroic. However, I do not feel I learned much from the in vitro assays. There are many genetic changes that may cause the in vivo phenotype in the patient. The authors focus on MED24, but there is not enough convincing evidence that this is the key gene. I would like to suggest overexpression of MED24 as a rescue experiment, but I am not sure this is a single-gene phenotype. In addition, the fact that one patient line does not differentiate properly leads me to think that the patient lines do not strengthen the manuscript, and that perhaps additional clean mutations might contribute more.
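
      As an illustration of the cross-correlation suggested in comment (2), the following is a minimal sketch, not the authors' analysis. It assumes the two time courses (e.g., PAX6 expression and epithelial thickness) are sampled at the same evenly spaced timepoints; the function name `lagged_correlation`, the sign convention, and the synthetic sigmoid curves are all assumptions made for illustration. The convention is chosen so that a negative best lag means expression changes before morphology, matching the Δt < 0 reading above.

```python
import numpy as np

def lagged_correlation(expression, morphology, max_lag):
    """Pearson r between expression[t + lag] and morphology[t] for each integer lag.

    With this (assumed) convention, a best lag < 0 means the expression
    curve leads the morphology curve by |lag| samples.
    """
    x = np.asarray(expression, dtype=float)
    y = np.asarray(morphology, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("both series must be 1-D and of equal length")
    n = len(x)
    if max_lag >= n - 2:
        raise ValueError("max_lag leaves too few overlapping samples")
    results = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            xs, ys = x[:n + lag], y[-lag:]   # pair x[t + lag] with y[t]
        else:
            xs, ys = x[lag:], y[:n - lag]
        results[lag] = np.corrcoef(xs, ys)[0, 1]
    return results

# Synthetic illustration only (not data from the manuscript):
# morphology follows the same sigmoid as expression, two samples later.
t = np.arange(20)
expression = 1.0 / (1.0 + np.exp(-(t - 8)))
morphology = 1.0 / (1.0 + np.exp(-(t - 10)))

r_by_lag = lagged_correlation(expression, morphology, max_lag=4)
best = max(r_by_lag, key=r_by_lag.get)
print("lag with maximum correlation:", best)
```

      With only a handful of timepoints over an 8-day differentiation, the lag resolution is limited to the sampling interval, so any such estimate should be read as indicative rather than conclusive.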

      Significance:

      This study establishes a quantitative, reproducible 2D human iPSC-to-neural-progenitor platform for analyzing cell-shape dynamics during differentiation. Using VANGL2 mutations and patient-derived iPSCs, the work shows that (1) human phenotypes can be captured in a 2D differentiation assay and (2) VANGL2 is required for neural contraction (apical constriction), consistent with animal studies. The results are solid, the data are quantitative, and the manuscript is well written. Although the planar system lacks non-neural ectoderm and does not exhibit tissue folding or tube closure, it provides a tractable baseline for mechanistic dissection and genotype-phenotype mapping. The derivation of patient lines from amniotic fluid and execution of experiments before birth is a remarkable demonstration that points toward precision-medicine applications, while motivating rescue strategies and additional clean genetic models. However, overall, I did not learn anything substantively new from this manuscript; the conclusions largely corroborate prior observations rather than extend them. In addition, the model was unsuccessful in one of the two patient-derived lines, which limits generalizability and weakens claims of patient-specific predictive value.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript by Ampartzidis et al. significantly extends the human induced pluripotent stem cell system originally characterized by the same group as a tool for examining cellular remodeling during differentiation stages consistent with those of human neural tube closure (Ampartzidis et al., 2023). Given that there are no direct ways to analyze cellular activity in human neural tube closure in vivo, this model represents an important platform for investigating neural tube defects which are a common and deleterious human developmental disease. Here, the authors carefully test whether this system is robust and reproducible when using hiPSC cells from different donors and pluripotency induction methods and find that despite all these variables the cellular remodeling programs that occur during early neural differentiation are statistically equivalent, suggesting that this system is a useful experimental substrate. Additionally, the carefully selected donor populations suggest these aspects of human neural tube closure are likely to be robust to sexual dimorphism and to reasonable levels of human genetic background variation, though more fully testing that proposition would require significant effort and be beyond the scope of the current work. Subsequent to this careful characterization, the authors next tested whether this system could be used to derive specific insights into cell remodeling during early neural differentiation. First, they used a reverse genetics approach to knock in a human point mutation in the critical regulator of planar cell polarity and apical constriction, Vangl2. Despite being identified in a patient, this R353C variant has not been directly functionally tested in a human system. The authors find that this variant, despite showing normal expression and phospho-regulation, leads to defects consistent with a failure in apical constriction, a key cell behavior required to drive curvature change during cranial closure. 
Finally, the authors test the utility of their hiPSC platform to understand human patient-specific defects by differentiating cells derived from two clinical spina bifida patients. The authors identify that one of these patients is likely to have a significant defect in fully establishing early proneural identity as well as defects in apicobasal thickening. While early remodeling occurs normally in the other patient, the authors observe significant defects in later neuronal induction and maturation. In addition, using whole exome sequencing the authors identify candidate variant loci that could underlie these defects.

      Major comments:

      (1) One of my few concerns with this work is that the relative constriction of the apical surface with respect to the basal surface is not directly quantified for any of the experiments. This worry is slightly compounded by the 3D reconstructions in Figure 1h, and the observation that overall cell volume is reduced and cell height increased simultaneously to area loss. Additionally, the net impact of apical constriction in tissues in vivo is to create local or global curvature change, but all the images in the paper suggest that the differentiated neural tissues are an uncurved monolayer even missing local buckles. I understand that these cells are grown on flat adherent surfaces limiting global curvature change, but is there evidence of localized buckling in the monolayer? While I believe, along with the authors, that their phenotypes are likely failures in apical constriction, I think they should work to strengthen this conclusion. I think the easiest way (and hopefully using data they already have) would be to directly compare apical area to basal area on a cell-wise basis for some number of cells. Given the heterogeneity of cells, perhaps 30-50 cells per condition/line/mutant would be good? I am open to other approaches; this just seems like it may not require additional experiments.

      (2) Another slight experimental concern I have regards the difference in laser ablation experiments detailed in Figure 3h-i from those of Figure 2d-e. It seems like WT recoil values in 3h-i are more variable and of a lower average than the earlier experiments, and given that it appears significance is reached mainly by the impact of the lower values, can the authors explain if this variability is expected to be due to heterogeneity in the tissue, i.e. some areas have higher local tension? If so, would that correspond with more local apical constriction?

      Significance:

      Overall, I am enthusiastic about this work and believe it represents a significant step forward in the effort to establish precision medicine approaches for diagnoses of the patient-specific causative cellular defects underlying human neural tube closure defects. This work systematizes an important and novel tool to examine the cellular basis of neural tube defects. While other hiPSC models of neural tube closure capture some tissue level dynamics, which this model does not, they require complex microfluidic approaches and have limited accessibility to direct imaging of cell remodeling. Comparatively, the relative simplicity of the reported model and the work demonstrating its tractability as a patient-specific and reverse genetic platform make it unique and attractive. This work will be of interest to a broad cross section of basic scientists interested in the cellular basis of tissue remodeling and/or the early events of nervous system development as well as clinical scientists interested in modeling the consequences of patient specific human genetic deficits identified in neural tube defect pregnancies.

    1. Reviewer #1 (Public review):

      Summary:

      This work addresses a key question in cell signalling: how does the membrane composition affect the behaviour of a membrane signalling protein? Understanding this is important, not just to understand basic biological function but because membrane composition is highly altered in diseases such as cancer and neurodegenerative disease. Although parts of this question have been addressed on fragments of the target membrane protein, EGFR, used here, Srinivasan et al. harness a unique tool, membrane nanodiscs, which allow them to probe full length EGFR in vitro in great detail with cutting-edge fluorescent tools. They find interesting impacts on EGFR conformation in differently charged and fluid membranes, explaining previously identified signalling phenotypes.

      Strengths:

      The nanodisc system enables full length EGFR to be studied in vitro and in a membrane with varying lipid and cholesterol concentrations. The authors combine this with single-molecule FRET utilising multiple pairs of fluorophores at different places on the protein to probe different conformational changes in response to EGF binding under different anionic lipid and cholesterol concentrations. They further support their findings using molecular dynamics simulations which help uncover the full atomistic detail of the conformations they observe.

      Weaknesses:

      Much of the interpretation of the results comes down to a bimodal model of an 'open' and 'closed' state between the intracellular tail of the protein and the membrane. Some of the data looks like a bimodal model is appropriate but not all. The authors have justified this bimodal model statistically and although adding a third component is a better fit, I agree with the authors that it cannot be justified statistically, given the data. Further work beyond the scope of this study would be needed to try to define further states.

    2. Reviewer #2 (Public review):

      Summary:

      Nanodiscs and synthesized EGFR are co-assembled directly in cell-free reactions. Nanodiscs containing membranes with different lipid compositions are obtained by providing liposomes with corresponding lipid mixtures in the reaction. The authors focus on the effects of lipid charge and fluidity on EGFR activity.

      Strengths:

      The authors implement a variety of complementary techniques to analyze data and to verify results. They further provide a new pipeline to study lipid effects on membrane protein function. The manuscript describes a comprehensive study on the analysis of membrane protein function in context of different lipid environments.

      Weaknesses:

      As the implemented strategy is relatively new, some uncertainties in the interpretation of the data consequently remain. However, using state-of-the-art techniques, the authors support their results by appropriate data and sufficient controls in the revised manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study identifies a mechanism responsible for the accumulation of the MET receptor in invadopodia, following stimulation of Triple-negative breast cancer (TNBC) cells with HGF. HGF-driven accumulation and activation of MET in invadopodia causes the degradation of the extracellular matrix, promoting cancer cell invasion, a process here investigated using gelatin-degradation and spheroid invasion assays.

      Mechanistically, HGF stimulates the recycling of MET from RAB14-positive endosomes to invadopodia, increasing their formation. At invadopodia, MET induces matrix degradation via direct binding with the metalloprotease MT1-MMP. The delivery of MET from the recycling compartment to invadopodia is mediated by RCP, which facilitates the colocalization of MET to RAB14 endosomes. In this compartment, HGF induces the recruitment of the motor protein KIF16B, promoting the tubulation of the RAB14-MET recycling endosomes to the cell surface. This pathway is critical for the HGF-driven invasive properties of TNBC cells, as it is impaired upon silencing of RAB14.

      Strengths:

      The study is well-organized and executed using state-of-the-art technology. The effects of MET recycling in the formation of functional invadopodia are carefully studied, taking advantage of mutant forms of the receptor that are degradation-resistant or endocytosis-defective.

      Data analyses are rigorous, and appropriate controls are used in most of the assays to assess the specificity of the scored effects. Overall, the quality of the research is high.

      The conclusions are well-supported by the results, and the data and methodology are of interest for a wide audience of cell biologists.

      Weaknesses:

      The role of the MET receptor in invadopodia formation and cancer cell dissemination has been intensively studied in many settings, including triple-negative breast cancer cells. The novelty of the present study mostly consists of the detailed molecular description of the underlying mechanism based on HGF-driven MET recycling. The question of whether the identified pathway is specific for TNBC cells or represents a general mechanism of HGF-mediated invasion detectable in other cancer cells is not addressed or at least discussed.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Khamari and colleagues investigate how HGF-MET signaling and the intracellular trafficking of the MET receptor tyrosine kinase influence invadopodia formation and invasion in triple-negative breast cancer (TNBC) cells. They show that HGF stimulation enhances both the number of invadopodia and their proteolytic activity. Mechanistically, the authors demonstrate that HGF-induced, RAB4- and RCP-RAB14-KIF16B-dependent recycling routes deliver MET to the cell surface specifically at sites where invadopodia form. Moreover, they report that MET physically interacts with MT1-MMP, a key transmembrane metalloproteinase required for invadopodia function, and that these two proteins co-traffic to invadopodia upon HGF stimulation.

      Although the HGF-MET axis has previously been implicated in invadopodia regulation (e.g., by Rajadurai et al., Journal of Cell Science 2012), studies directly linking ligand-induced MET trafficking with the spatial regulation of MT1-MMP localization and activity have been lacking.

      Overall, the manuscript addresses a relevant and timely topic and provides several novel insights. However, some sections require clearer and more concise writing (details below). In addition, the quality, reliability, and robustness of several data sets need to be improved.

      Strengths:

      A key strength of the study is the novel demonstration that HGF-mediated, RAB4- and RAB14-dependent recycling of MET delivers this receptor, together with MT1-MMP, to invadopodia, highlighting a previously unrecognized mechanism regulating the formation and proteolytic function of these invasive structures. Another strong point is the breadth of experimental approaches used and the substantial amount of supporting data. The authors also include an appropriate number of biological replicates and analyze a sufficiently large number of cells in their imaging experiments, as clearly described in the figure legends.

      Weaknesses:

      (1) Inappropriate stimulation times for endocytosis and recycling assays.

      The experiments examining MET endocytosis and recycling following HGF stimulation appear to use inappropriate incubation times. After ligand binding, RTKs typically undergo endocytosis within minutes and reach maximal endosomal accumulation within 5-15 minutes. Although continuous stimulation allows repeated rounds of internalization, the temporal dynamics of MET trafficking should be examined across shorter time points, ideally up to 1 hour (e.g., 15, 30, and 60 minutes). The authors used 2-, 3-, or 6-hour HGF stimulation, which, in my opinion, is far too long to study ligand-induced RTK trafficking.

      (2) Low efficiency of MET silencing in Figure S1I.

      The very low MET knockdown efficiency shown in Figure S1I raises concerns. Given the potential off-target effects of a single shRNA and the insufficient silencing level, it is difficult to conclude whether the reduction in invadopodia number in Figure 1F is genuinely MET-dependent. The authors later used siRNA-mediated silencing (Figure S5C), which was more effective. Why was this siRNA not used to generate the data in Figure 1F? Why did the authors rely on the inefficient shRNA C#3?

      (3) Missing information on incubation times and inconsistencies in MET protein levels.

      The figure legends do not indicate how long the cells were incubated with HGF or the MET inhibitor PHA665752 prior to immunoblotting. This information is crucial, particularly because both HGF and PHA665752 cause a substantial decrease in the total MET protein level. Notably, such a decrease is absent in MDA-MB-231 cells treated with HGF in the presence of cycloheximide (Figure S2F). The authors should comment on these inconsistencies.

      Additionally, the MET bands in Figure S1J appear different from those in Figure S1C, and MET phosphorylation seems already high under basal conditions, with no further increase upon stimulation (Figure S1J). The authors should address these issues.

      (4) Insufficient representation and randomization of microscopic data.

      For microscopy, only single representative cells are shown, rather than full fields containing multiple cells. This is particularly problematic for invadopodia analysis, as only a subset of cells forms these structures. The authors should explain how they ensured that image acquisition and quantification were randomized and unbiased. The graphs should also include the percentage of cells forming invadopodia, a standard metric in the field. Furthermore, some images include altered cells - for example, multinucleated cells - which do not accurately represent the general cell population.

      (5) Use of a single siRNA/shRNA per target.

      As noted earlier, using only one siRNA or shRNA carries the risk of off-target effects. For every experiment involving gene silencing (MET, RAB4, RAB14, RCP, MT1-MMP), at least two independent siRNAs/shRNAs should be used to validate the phenotype.

      (6) Insufficient controls for antibody specificity.

      The specificity of MET, p-MET, and MT1-MMP staining should be demonstrated in cells with effective gene silencing. This is an essential control for immunofluorescence assays.

      (7) Inadequate demonstration of MET recycling.

      MET recycling should be directly demonstrated using the same approaches applied to study MT1-MMP recycling. The current analysis - based solely on vesicles near the plasma membrane - is insufficient to conclude that MET is recycled back to the cell surface.

      (8) Insufficient evidence for MET-MT1-MMP interaction.

      The interaction between MET and MT1-MMP should be validated by immunoprecipitation of endogenous proteins, particularly since both are endogenously expressed in the studied cell lines.

      (9) Inconsistent use of cell lines and lack of justification.

      The authors use two TNBC cell lines: MDA-MB-231 and BT-549, without providing a rationale for this choice. Some assays are performed in MDA-MB-231 and shown in the main figures, whereas others use BT-549, creating unnecessary inconsistency. A clearer, more coherent strategy is needed (e.g., present all main findings in MDA-MB-231 and confirm key results in BT-549 in supplementary figures).

      (10) Inconsistency in invadopodia numbers under identical conditions.

      The number of invadopodia formed in Figure 1E is markedly lower than in Figure 1C, despite identical conditions. The authors should explain this discrepancy.

      (11) Questionable colocalization in some images.

      In some figures - for example, Figure 2G - the dots indicated by arrows do not convincingly show colocalization. The authors should clarify or reanalyze these data.

      (12) Abstract, Introduction, and Discussion require substantial rewriting.

      (a) The abstract should be accessible to a broader audience and should avoid using abbreviations and protein names without context.

      (b) The introduction should better describe the cellular processes and proteins investigated in this study.

      (c) The discussion currently reads more like an extended summary of results. It lacks deeper interpretation, comparison with existing literature, and consideration of the broader implications of the findings.

    1. Reviewer #1 (Public review):

      Summary:

      The authors report the structure of the human CTF18-RFC complex bound to PCNA. Similar structures (and more) have been reported by the O'Donnell and Li labs. This study should add to our understanding of CTF18-RFC in DNA replication and clamp loaders in general. However, there are numerous major issues that I recommend the authors fix.

      Strengths:

      The structures reported are strong and useful for comparison with other clamp loader structures that have been reported lately.

    2. Reviewer #2 (Public review):

      Summary

      Briola and co-authors have performed a structural analysis of the human CTF18 clamp loader bound to PCNA. The authors purified the complexes and formed a complex in solution. They used cryo-EM to determine the structure to high resolution. The complex assumed an auto-inhibited conformation, where DNA binding is blocked, which is of regulatory importance and suggests that additional factors could be required to support PCNA loading on DNA. The authors carefully analysed the structure and compared it to RFC and related structures.

      Strength & Weakness

      Their overall analysis is of high quality, and they identified, among other things, a human-specific beta-hairpin in Ctf18 that flexibly tethers Ctf18 to Rfc2-5. Indeed, deletion of the beta-hairpin resulted in reduced complex stability and a reduced rate of primer extension in assays with Pol ε. Moreover, the authors identify that the Ctf18 ATP-binding domain assumes a more flexible organisation.

      The data are discussed accurately and relevantly, which provides an important framework for rationalising the results.

      All in all, this is a high-quality manuscript that identifies a key intermediate in CTF18-dependent clamp loading.

    3. Reviewer #3 (Public review):

      Summary:

      CTF18-RFC is an alternative eukaryotic PCNA sliding clamp loader which is thought to specialize in loading PCNA on the leading strand. Eukaryotic clamp loaders (RFC complexes) have an interchangeable large subunit which is responsible for their specialized functions. The authors show that the CTF18 large subunit has several features responsible for its weaker PCNA loading activity, and that the resulting weakened stability of the complex is compensated by a novel beta hairpin backside hook. The authors show this hook is required for the optimal stability and activity of the complex.

      Relevance:

      The structural findings are important for understanding RFC enzymology and novel ways that the widespread class of AAA ATPases can be adapted to specialized functions. A better understanding of CTF18-RFC function will also provide clarity into aspects of DNA replication, cohesion establishment and the DNA damage response.

      Strengths:

      The cryo-EM structures are of high quality enabling accurate modelling of the complex and providing a strong basis for analyzing differences and similarities with other RFC complexes. They use complementary pre-steady state FRET and polymerase primer extension assays to investigate the role of a unique structural element in CTF18.

      Weaknesses:

      The manuscript would have benefited from a more detailed biochemical analysis using mutagenesis and assays to tease apart the functional relevance of the many differences with the canonical RFC complex.

      Overall appraisal:

      Overall, the work presented here is solid and important. The data is sufficient to support the stated conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      GPCRs affect the EV-miRNA cargoes

      Strengths:

      Novel idea of GPCRs-mediated control of EV loading of miRNAs

      Weaknesses:

      The incomplete findings fail to connect the observed changes to, or show evidence of, any directly related physiological parameters. Mechanistic detail is completely lacking.

      Comments on revisions:

      The revised version of the manuscript falls short of the required standard by lacking additional experiments. Some of the conditions for acceptability could have been met only through clarifying uncertainties via further experiments, which, unfortunately, have not been conducted.

    2. Reviewer #2 (Public review):

      Summary:

      This study examines how activating specific G protein-coupled receptors (GPCRs) affects the microRNA (miRNA) profiles within extracellular vesicles (EVs). The authors seek to identify whether different GPCRs produce unique EV miRNA signatures and what these signatures could indicate about downstream cellular and pathological processes.

      Methods:

      Used U2OS human osteosarcoma cells, which naturally express multiple GPCR types.

      Stimulated four distinct GPCRs (ADORA1, HRH1, FZD4, ACKR3) using selective agonists.

      Isolated EVs from culture media and characterized them via size exclusion chromatography, immunoblotting, and microscopy.

      Employed qPCR-based miRNA profiling and bioinformatics analyses (e.g., KEGG, PPI networks) to interpret expression changes.

      Key Findings:

      No significant change in EV quantity or size following GPCR activation.

      Each GPCR triggered a distinct EV miRNA expression profile.

      miRNAs differentially expressed post-stimulation were linked to pathways involved in cancer, insulin resistance, neurodegenerative diseases, and other physiological/pathological processes.

      miRNAs such as miR-550a-5p, miR-502-3p, miR-137, and miR-422a emerged as major regulators following specific receptor activation.

      Conclusions:

      The study offers evidence that GPCR activation can regulate intercellular communication through miRNAs encapsulated within extracellular vesicles (EVs). This finding paves the way for innovative drug-targeting strategies and enhances understanding of drug side effects that are mediated via GPCR-related EV signaling.

      Strengths:

      Innovative concept: The idea of linking GPCR signaling to EV miRNA content is novel and mechanistically important.

      Robust methodology: The use of multiple validation methods (biochemical, biophysical, and statistical) lends credibility to the findings.

      Relevance: GPCRs are major drug targets, and understanding off-target or systemic effects via EVs is highly valuable for pharmacology and medicine.

      Weaknesses:

      Sample Size & Scope: The analysis included only four GPCRs. Expanding to more receptor types or additional cell lines would enhance the study's applicability.

      Exploratory Nature: This study is primarily descriptive and computational. It lacks functional validation, such as assessing phenotypic effects in recipient cells, which is acknowledged as a future step.

      EV heterogeneity: The authors recognize that they did not distinguish EV subpopulations, potentially confounding the origin and function of miRNAs.

      Comments on revisions:

      All the comments have been taken into account. I wish the authors success in their future research.

    1. Reviewer #1 (Public review):

      Significance:

      While most MAVEs measure overall function (which is a complex integration of biochemical properties, including stability), VAMP-seq-type measurements more strongly isolate stability effects in a cellular context. This work seeks to create a simple model for predicting the effect of a mutation on the "abundance" measurement of VAMP-seq.

      Public Review:

      Of course, there is always another layer of the onion: VAMP-seq measures contributions from isolated thermodynamic stability, stability conferred by binding partners (small molecule and protein), synthesis/degradation balance (especially important in "degron" motifs), etc. Here the authors' goal is to create simple models that can act as a baseline for two main reasons:

      (1) how to tell when adding more information would be helpful for a global model;

      (2) how to detect when a residue/mutation has an unusual profile indicative of an unbalanced contribution from one of the factors listed above.

      As such, the authors state that this manuscript is not intended to be a state-of-the-art method in variant effect prediction, but rather a direction towards considering static structural information for the VAMP-seq effects. At its core, the method is a fairly traditional asymmetric substitution matrix (I was surprised not to see a comparison to BLOSUM in the manuscript) and shows that a subdivision by burial makes the model much more predictive. Despite only having 6 datasets, they show predictive power even when the matrices are based on a smaller number. Another success is rationalizing the VAMP-seq results on relevant oligomeric states.
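      To make the idea of a burial-split, asymmetric substitution matrix concrete, the following is a minimal sketch and not the authors' implementation. It assumes each variant is recorded as a tuple of wild-type residue, mutant residue, a buried/exposed flag, and an abundance score; the function names (`fit_matrices`, `predict`) and the toy scores are hypothetical and invented purely for illustration.

```python
from collections import defaultdict

def fit_matrices(records):
    """Average abundance scores per (context, wild type, mutant) entry.

    records: iterable of (wt, mut, buried, score) tuples (hypothetical layout).
    Returns a dict keyed by (context, wt, mut), so wt->mut and mut->wt
    are stored separately and the matrix is asymmetric.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for wt, mut, buried, score in records:
        key = ("buried" if buried else "exposed", wt, mut)
        sums[key][0] += score
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def predict(matrices, wt, mut, buried):
    """Look up the mean score for a substitution; None if never observed."""
    return matrices.get(("buried" if buried else "exposed", wt, mut))

# Toy, made-up scores (1.0 = wild-type-like abundance, 0.0 = fully lost).
toy_records = [
    ("L", "D", True, 0.0),
    ("L", "D", True, 0.5),
    ("L", "D", False, 0.75),
    ("D", "L", False, 1.0),
]

matrices = fit_matrices(toy_records)
buried_LD = predict(matrices, "L", "D", buried=True)
exposed_LD = predict(matrices, "L", "D", buried=False)
exposed_DL = predict(matrices, "D", "L", buried=False)
unseen = predict(matrices, "G", "P", buried=True)
```

      In this toy setup the same L-to-D substitution receives a different score in the buried and exposed matrices, and L-to-D differs from D-to-L, which is the sense in which the matrix is both context-specific and asymmetric.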

      Comments on revision:

      We have no further comments on this manuscript.

    2. Reviewer #3 (Public review):

      "Effects of residue substitutions on the cellular abundance of proteins" by Schulze and Lindorff-Larsen revisits the classical concept of structure-aware protein substitution matrices through the scope of modern protein structure modelling approaches and comprehensive phenotypic readouts from multiplex assays of variant effects (MAVEs). The authors explore 6 unique protein MAVE datasets based on protein abundance through the lens of protein structural information (residue solvent accessibility, secondary structure type) to derive combinations of context-specific substitution matrices that predict variant impact on protein abundance. They are clear to outline that the aim of the study is not to produce a new best abundance predictor, but to showcase the degree of prediction afforded simply by utilizing structural information.

      Both the derived matrices and the underlying 'training' data are comprehensively evaluated. The authors convincingly demonstrate that taking structural solvent accessibility contexts into account leads to more accurate performance than either a structure-unaware matrix, secondary structure-based matrix, or matrices combining both solvent accessibility and secondary structure. The capacity for the approach to produce generalizable matrices is explored through training data combinations, highlighting factors such as the variable quality of the experimental MAVE data and the biochemical differences between the protein targets themselves, which can lead to limitations. Despite this, the authors demonstrate their simple matrix approach is generally on par with dedicated protein stability predictors in abundance effect evaluation, and even outperforms them in a niche of solvent accessible surface mutations, revealing their matrices provide orthogonal abundance-specific signal. More importantly, the authors further develop this concept to creatively show their matrices can be used to identify surface residues that have buried-like substitution profiles, which are shown to correspond to protein interface residues, post-translational modification sites, functional residues or putative degrons.

      The paper makes a strong and well-supported main point, demonstrating the widespread utility of the authors' approach, empowered through protein structural information and cutting edge MAVE datasets. This work creatively utilizes a simple concept to produce a highly interpretable tool for protein abundance prediction (and beyond), which is inspiring in the age of impenetrable machine learning models.

    1. Reviewer #1 (Public review):

      Summary:

      The authors clearly demonstrate that overexpressed Dcp-1, but not Drice, is activated without canonical apoptosome components. Using TurboID-based proximity labeling, they revealed distinct proximal proteomes, among which Sirtuin 1, an Atg8a deacetylase that promotes autophagy, was specifically required for Dcp-1 activation. Additionally, they show that autophagy-related genes, including Bcl-2 family members Debcl and Buffy, are required for Dcp-1 activation.

      Using AlphaFold3 structure-based prediction, they identified that Bruce, an autophagy-regulated inhibitor of apoptosis, acts as a Dcp-1-specific regulator outside the apoptosome-mediated pathway. Finally, they show that Bruce suppresses wing tissue growth. These findings indicate that non-lethal Dcp-1 activity is governed by the autophagy-Bruce axis, enabling distinct non-lethal functions independent of cell death.

      Strengths:

      This is an excellent paper with very good structure, excellent quality data and analysis.

      Weaknesses:

      This reviewer did not identify any weaknesses or recommendations for revision.

    2. Reviewer #2 (Public review):

      Summary:

      The Drosophila executioner caspase Dcp-1 has established roles in cell death, autophagy, and imaginal disc growth. This study reports previously unrecognized factors that work together with Dcp-1. Specifically, the authors performed a TurboID-based proximity labeling experiment to identify factors associated with Dcp-1 and Drice. Dcp-1-specific interactors were further examined for their genetic interaction. The authors report autophagy-related genes, including Debcl and Buffy, to be required for Dcp-1 activation. In addition, the authors present evidence of an interaction between Bruce and Dcp-1. Bruce expression blocks the Dcp-1 overexpression phenotype. Inhibition of effector caspases or overexpression of Bruce commonly reduced wing growth, suggesting a relationship between the two proteins.

      Strengths:

      On the positive side, the study identifies new Dcp-1-interacting proteins and provides a functional link between Dcp-1 and Sirt1, Fkbp59, Debcl, Buffy, Atg2, and Atg8a.

      Weaknesses:

      The data supporting the Dcp-1/Bruce interaction are not strong, even though the title of this manuscript highlights Bruce. For example, the authors' turboID data does not support Dcp-1/Bruce interaction. The case for the interaction is based on a single experiment that overexpresses a truncated Bruce transgene in S2 cells.

    3. Reviewer #3 (Public review):

      Summary:

      The present paper by Shinoda et al. from the Miura group builds upon findings reported in an earlier study by the same team (Shinoda et al., PNAS, 2019), which identified a non-apoptotic role for the Drosophila executioner caspase Dcp-1 in promoting wing tissue growth. That earlier work attributed this function primarily to Dcp-1 and to Decay, a caspase structurally related to executioner caspases, but not to DrICE, the principal apoptotic executioner caspase. The authors further proposed that this non-apoptotic caspase activity operates independently of the initiator caspase Dronc.

      In the current study, the authors both corroborate aspects of their previous findings and extend the investigation to mechanisms regulating Dcp-1 in this context. They identify roles for the giant IAP Bruce, two BCL-2 family members, and autophagy-related components in modulating non-apoptotic Dcp-1 activity. Moreover, they show that Bruce binds to a BIR-like peptide exposed upon Dcp-1 cleavage, but not to DrICE. The study further suggests that low levels of Dcp-1 activity promote wing tissue growth, whereas excessive activity induces cell death, as evidenced by impaired wing development following Dcp-1 overexpression. Overall, the manuscript provides several intriguing insights into the non-apoptotic regulation of the comparatively weak apoptotic executioner caspase Dcp-1 and complements the group's earlier work. However, several concerns remain regarding certain interpretations of the data and the experimental rigour of some of the results.

      Strengths:

      A major strength of the work is its systematic genetic and biochemical approaches, which combine tissue-specific manipulation with protein interaction mapping to explore how Dcp-1 is regulated. The identification of several regulatory factors, including an inhibitor of cell death protein and components linked to autophagy, provides a coherent framework for understanding how Dcp-1 activity might be tuned.

      Weaknesses:

      The evidence supporting some key claims remains incomplete. In particular, the form of cell death induced when Dcp-1 is overexpressed is not clearly established, and additional tests would be needed to distinguish between the different cell death types.

      Likely impact:

      The study contributes to a growing body of work showing that proteins traditionally associated with cell death can have broader roles in tissue development. This conceptual advance is likely to be of interest to researchers studying growth control and tissue maintenance.

      Specific points:

      (1) Nature of the wing ablation phenotype

      A central concern is whether the wing ablation phenotype observed upon Dcp-1 overexpression truly reflects apoptotic cell death. The authors show in Figure 1c that nuclei in cells overexpressing Dcp-1, but not DrICE, zymogens are highly condensed, which is suggestive of apoptosis. However, it is equally plausible that this phenotype reflects a form of non-apoptotic, Dcp-1-dependent cell death (e.g. autophagy-dependent cell death). This distinction could be readily addressed using TUNEL labelling and direct caspase activity assays. The latter would be particularly informative, as it remains unclear whether zymogen Dcp-1 is capable of cleaving standard effector caspase reporters in vivo. Does the anti-cleaved Dcp-1 antibody detect Dcp-1 activation following overexpression of the Dcp-1 zymogen?

      (2) Role of Decay

      In their earlier study, the authors identified Decay as another caspase influencing wing growth, albeit more modestly than Dcp-1. It is therefore unclear why this line of investigation was not pursued further in the current work. This omission is notable, as Decay is not implicated in apoptosis and, to date, no substantial physiological function has been assigned to this caspase in any system. At a minimum, this point should be discussed explicitly.

      (3) Figure 2: Proximity labelling analysis

      The authors use TurboID-mediated proximity labelling to reveal distinct Dcp-1- and DrICE-associated proteomes across tissues, with a particular focus on the wing disc. They further demonstrate that RNAi-mediated knockdown of the Dcp-1-associated proteins Sirt1 and Fkbp59 suppresses the wing ablation phenotype induced by Dcp-1 overexpression, suggesting that these factors are required for Dcp-1 activity. However, it should be clarified whether Bruce was identified as a Dcp-1 interactor in the proximity labelling dataset, given its proposed central regulatory role. In addition, further discussion of Fkbp59, its known functions and how it might mechanistically influence Dcp-1 activity would be valuable.

      (4) Figure 3: Autophagy-related factors

      Given that Sirt1 is known to promote autophagy, the authors next examine autophagy-related proteins and identify roles for Atg2, Atg8a, Debcl, and Buffy in Dcp-1 activation. Notably, these proteins do not promote cell death in the Hid-induced canonical apoptotic pathway. However, it is important to determine whether knockdown of Debcl, Buffy, Atg2, or Atg8a alone affects wing development in the absence of Dcp-1 overexpression, to exclude the possibility that these perturbations independently impair wing formation.

      (5) Evidence for canonical autophagy

      The involvement of autophagy would be more convincingly demonstrated by testing additional core autophagy genes, such as Atg7, Atg5, and Atg12, as well as performing a combined knockdown of Atg8a and Atg8b. Moreover, direct assessment of autophagy at the cellular level using established genetic reporters would substantially strengthen the conclusions.

      (6) Figures 4-5: Functional consequences

      It would be informative to determine whether Synr, Debcl, or Buffy influence wing size on their own and whether their overexpression enhances wing growth.

      (7) Terminology and interpretation of cell death

      Taken together, the results suggest that Dcp-1 zymogen overexpression induces a form of non-apoptotic cell death, potentially autophagy-dependent or related. The reviewer does not understand the authors' insistence on referring to this process as apoptosis. The authors should be more cautious in their terminology: there is no canonical versus non-canonical apoptosis; there is simply apoptosis. Without stronger evidence, these effects should not be described as apoptotic cell death.

    1. Reviewer #1 (Public review):

      Summary:

      The authors aimed to overcome a major technical limitation in pancreatic slice research - the inefficient viral transduction of dense, enzyme-active human pancreas tissue - while maintaining tissue integrity and physiological responsiveness. They developed a modified culture and infection protocol that incorporates gentle orbital agitation, removal of protease inhibitors, and physiological temperature during adenoviral transduction. This method increased transduction efficiency by approximately threefold without impairing insulin secretion or calcium signaling responses.

      Strengths:

      The study's major strengths are its clear methodological innovation, experiment optimization, and multiparametric validation. The authors provide compelling evidence that their approach enhances the expression of genetically encoded calcium indicators (GCaMP6m) and integrators (CaMPARI2), preserving both endocrine and exocrine cell functionality. The demonstration of targeted biosensor expression in β-cells and multiplexed imaging of redox and calcium dynamics highlights the versatility of the system. The CaMPARI2-based approach is particularly impactful, as it decouples maximum calcium response assessment from real-time imaging, thereby increasing throughput and reducing bias. The authors successfully apply the technique to samples from non-diabetic, T1D, and T2D donors, revealing disease-relevant alterations in β-cell calcium responses consistent with known physiological dysfunctions. The analysis of islet size versus calcium response further underscores the utility of this platform for probing structure-function relationships in situ.

      Weaknesses:

      The primary limitations are the lack of a live/dead assessment to differentiate viability-related effects from methodological improvements, the lack of quantification of absolute transduction efficiency (relative efficiency is clearly increased, but the absolute efficiency is not shown), and the lack of IF confirmation of cell-specific transduction efficiency. These limitations, however, do not detract from the overall strength of the technical advance.

      Overall, this work offers a convincing and practical advance for the diabetes and islet biology community. It substantially improves the toolkit available for live human pancreas studies and will likely catalyze further mechanistic investigations of islet heterogeneity, disease progression, and therapeutic response.

    2. Reviewer #2 (Public review):

      (1) The photoconversion protocol requires a more detailed and quantitative discussion. The current description ("5 s pulses for 5 min, leading to 2.5 min of total light delivery") is too brief to evaluate whether the chosen illumination parameters maintain the CaMPARI2 signal within its linear dynamic range. Because CaMPARI2 photoconversion reflects the time integral of 405 nm photoconverting light exposure in the presence of intracellular [Ca²⁺], the red/green fluorescence ratio is directly proportional to cumulative illumination time until saturation occurs. Previous characterization (PMID: 30361563) shows that photoconversion is approximately linear over the first 0-80 s of 405 nm exposure, after which red fluorescence plateaus. The total exposure used here (=150 s) may therefore exceed the linear regime, potentially obscuring differences between cells with moderate versus strong Ca²⁺ activity. The authors should (i) justify the selected illumination parameters, (ii) provide evidence that the chosen conditions remain within the linear response range for the specific optical setup, (iii) discuss how overexposure might affect quantitative interpretation of red/green ratios and comparisons between experimental groups. Inclusion of calibration data would substantially strengthen the methodological rigor and reproducibility of the study.
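      The arithmetic behind this concern can be laid out in a short sketch. It uses only the numbers quoted in this comment (5 s pulses, a 5 min protocol, 2.5 min of total light, and an approximately linear range up to about 80 s); the function name and the assumption that pulses are evenly spaced are illustrative and not taken from the manuscript.

```python
import math

# Values quoted in the review comment above.
PULSE_ON_S = 5.0            # duration of one 405 nm pulse
PROTOCOL_S = 5 * 60.0       # total protocol length (5 min)
TOTAL_LIGHT_S = 2.5 * 60.0  # cumulative light delivery (2.5 min)
LINEAR_LIMIT_S = 80.0       # approximate end of the linear range cited by the reviewer

def exposure_summary(pulse_on_s, protocol_s, total_light_s, linear_limit_s):
    """Summarize how the cumulative exposure compares with the linear range."""
    return {
        # Number of pulses implied by the total light delivery.
        "n_pulses": total_light_s / pulse_on_s,
        # Fraction of the protocol during which the light is on.
        "duty_fraction": total_light_s / protocol_s,
        # Seconds of exposure delivered beyond the linear range.
        "excess_s": max(0.0, total_light_s - linear_limit_s),
        # Whole pulses that fit inside the linear range.
        "pulses_in_linear_range": math.floor(linear_limit_s / pulse_on_s),
    }

summary = exposure_summary(PULSE_ON_S, PROTOCOL_S, TOTAL_LIGHT_S, LINEAR_LIMIT_S)
print(summary)
```

      Under these numbers, roughly 16 of the 30 pulses fall within the cited linear range and about 70 s of light is delivered beyond it, which is why calibration data for the specific optical setup would be informative.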

      (2) For Figure 8a (middle panels), the data points for 16G and KCl overlap, raising the possibility that the signal may already be saturated at 16G. The authors should comment on the potential for CaMPARI2 saturation at 16G, and clarify whether this affects the interpretation of the KCl results "At maximal stimulation by KCl, there was no size-function correlation (R = 0.15, p = 0.14)."

      (3) The term "calcium activity" is used throughout the manuscript but remains vague. Pancreatic islets typically display a biphasic Ca²⁺ response to high glucose (an initial sustained peak followed by repetitive oscillations), and these phases differ in both kinetics and physiological meaning. Ca²⁺ responses are usually quantified using parameters such as rise time, amplitude, and duration for the initial peak, and amplitude, frequency, burst duration, and duty cycle for the oscillatory phase. The authors should clarify how "calcium activity" is defined in their analyses and discuss the appropriateness of directly comparing Ca²⁺ signals with distinct temporal patterns.

      (4) Because the CaMPARI2 red/green ratio reflects the time-integral of 405 nm photoconverting light exposure in the presence of Ca²⁺, two Ca²⁺ responses with the same duty cycle but different amplitudes could, in principle, yield the same red/green ratios. This raises an important question regarding how well the CaMPARI2 signal distinguishes differences in Ca²⁺ amplitude versus time spent above threshold. The authors should directly relate single-cell Ca²⁺ traces to corresponding red/green ratios to demonstrate the extent to which CaMPARI2 photoconversion truly reflects "Ca²⁺ activity." Such validation would clarify whether the metric is sensitive to variations in oscillation amplitude, duty cycle, or both, and would strengthen the interpretation of CaMPARI2-based functional comparisons.
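      The ambiguity raised in this point can be illustrated with a toy integrator. This is a deliberately simplified sketch, not a model of real CaMPARI2 kinetics: it assumes square-wave Ca²⁺ oscillations in arbitrary units, continuous illumination, and two hypothetical conversion rules (an all-or-none threshold rule and an amplitude-proportional rule). All names and parameter values are invented for illustration.

```python
def square_wave(amplitude, period_steps, on_steps, n_steps):
    """Ca2+ trace (arbitrary units): 'amplitude' for on_steps of every period, else 0."""
    return [amplitude if (i % period_steps) < on_steps else 0.0 for i in range(n_steps)]

def integrate_conversion(trace, rate_fn, dt=1.0):
    """Accumulate photoconversion over the trace under constant illumination."""
    return sum(rate_fn(ca) * dt for ca in trace)

THRESHOLD = 0.5  # hypothetical Ca2+ level above which conversion proceeds

def threshold_rate(ca):
    """All-or-none rule: conversion depends only on time spent above threshold."""
    return 1.0 if ca >= THRESHOLD else 0.0

def linear_rate(ca):
    """Proportional rule: conversion scales with Ca2+ amplitude."""
    return ca

# Two oscillating cells with the same duty cycle (40%) but different amplitudes.
low = square_wave(amplitude=1.0, period_steps=10, on_steps=4, n_steps=150)
high = square_wave(amplitude=3.0, period_steps=10, on_steps=4, n_steps=150)

thr_low = integrate_conversion(low, threshold_rate)
thr_high = integrate_conversion(high, threshold_rate)
lin_low = integrate_conversion(low, linear_rate)
lin_high = integrate_conversion(high, linear_rate)

print("threshold rule:", thr_low, thr_high)
print("linear rule:   ", lin_low, lin_high)
```

      Under the threshold rule the two cells are indistinguishable, whereas under the proportional rule the higher-amplitude cell accumulates three times more signal. Relating single-cell Ca²⁺ traces to measured red/green ratios, as requested above, would indicate which regime better describes the actual reporter.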

    3. Reviewer #3 (Public review):

      Summary:

      Lazimi and coworkers present an updated experimental protocol by which viral vectors can be used with live pancreas slices in order to efficiently transduce fluorescent protein biosensors. This is of high importance, given that live human pancreas slices provide a means to study islet function while maintaining the architecture of the local environment. Thus, efficiently delivering a wide range of fluorescent protein biosensors provides expanded capabilities to study the human islet and its dysfunction in type 1 and type 2 diabetes. The authors demonstrate the improved transduction provided by their revised protocol, which includes orbital culture, while retaining or, in some cases, improving cell viability, hormone release, and Ca2+ responses. Further, the authors demonstrate how a 'Ca2+ integrator', CaMPARI2, can be used to profile the Ca2+ response of large numbers of cells and islets, to capture the variability in islet responses in healthy and diabetic cases.

      Strengths:

      The data presented are generally robust, and the methods are well described, such that this protocol could be repeated by other investigators. All findings are representative of multiple donors. Importantly, the data is highly novel.

      Weaknesses:

      The main weaknesses of the manuscript are a lack of technical detail on how data are presented and analyzed, and unaddressed caveats in how certain data related to islet size are interpreted.

    1. Reviewer #1 (Public review):

      Liu, Li, Ge, and colleagues use whole genome sequence data to estimate the recombination landscape of domesticated chickens and their wild ancestor, Red Junglefowl. They compare landscapes estimated using the deep learning method RelERNN (Adrion et al. 2020) to understand the consequences of domestication for the evolution of recombination. The authors build on previous work in tomato, maize, and other domesticated species to examine how recombination rate and patterning evolve under the demography and selection pressures of domestication. They do so by comparing estimates of local recombination rates across chromosomes and populations, asking if/how well certain sequence and chromatin-based predictors predict recombination rate, and testing for an association between recombination rate and the proportion of introgressed ancestry from Red Junglefowl.

      This study provides evidence for the hypothesis that recombination evolves rapidly in domesticated lineages -- so much so that we see little hotspot sharing between breeds in the present-day! Strengths of the paper include the collection/analysis of data from several domesticated sub-populations and efforts to control for demography and structure in the inference of recombination landscapes (given the challenges of some methods under non-equilibrium demography: https://academic.oup.com/mbe/article/35/2/335/4555533). It is also reassuring to see patterns that have been thoroughly established (e.g., the negative relationship between recombination rate and chromosome size) validated.

      However, I have concerns about the data and methodology.

      (1) My main concern is that the demographic and recombination rate estimates inferred using ~20 whole genomes are likely quite variable and, without quantification of the uncertainty or systematic assessment of the possible biases in the methodology, it is difficult to have confidence in analyses which make use of the RelERNN landscapes.

      (a) Similar studies in rye (https://academic.oup.com/mbe/article/39/6/msac131/6605708) and tomato (https://academic.oup.com/mbe/article/39/1/msab287/6379725) used data from far more individuals (916 individuals split up into populations of size 50 for rye, >75 samples for tomato) to infer recombination maps and conduct downstream analyses. Studies in human genetics make use of an even greater number! The evidence (Lines 189-196 of the main text) that the sample size is sufficient to capture fine-scale variation in recombination is weak. In particular, correlations between the true and estimated recombination rate are based on *equilibrium* demography at sample sizes of 5, 10, and 20, yet they are used to draw the inference that "20 samples per population are sufficient to reconstruct their recombination landscapes" under the *non-equilibrium* demography (inferred using SMC+).

      (b) RelERNN learns the recombination landscape by using several signatures (the decay of linkage disequilibrium and, as described in https://academic.oup.com/genetics/advance-article-abstract/doi/10.1093/genetics/iyaf108/8157390, choppiness of the allele frequency spectrum) left in present-day genomes. Both signatures depend strongly on local SNP density. It does not seem the effect of SNP density on the inferred recombination rate is examined, despite the potential for correlated noise in inferred recombination rate (in SNP-sparse regions of the genome) to confound downstream inference.

      (c) It is unclear if the demographic histories for chickens (Figure S6) broadly match what has been previously estimated from whole-genome data, or if a large class of demographic models is compatible with the data (i.e., confidence intervals for the demographic histories are quite large). In Figure S6, the bottlenecks are somewhat weak and affect only a couple of the groups, despite the history of domestication and the expectation that effective sizes vary more widely. The groups affected (LX and WL) are those that have the weakest correlations between recombination rate under the equilibrium and non-equilibrium demographic models.

      (2) The authors test for the effects of chromatin modifications, GC content, etc., using correlations between local recombination rate and each feature individually. However, joint inference of the effects under a GLM (the distribution of recombination rates is probably better described by, e.g., a Gamma distribution) would permit more straightforward causal inference, given, e.g., the potential effects of chromatin marks on deleterious mutation accumulation. I recognize this likely would not change the direction or significance of the effects in question, but it is worth noting because some readers will want to learn from the effect sizes, and causes and effects are difficult to disentangle without a multivariate approach.
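
      The difference between feature-by-feature correlations and a joint GLM can be sketched on simulated data. The example below is a minimal illustration, not the analysis requested of the authors: the predictors `gc` and `mark`, the effect sizes, and the Gamma shape are all hypothetical, and the Gamma GLM with log link is fitted with a small hand-written IRLS routine so that the block needs only the Python standard library. Here `mark` has no direct effect on the simulated rate but is correlated with `gc`.

```python
import math
import random

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_gamma_glm(rows, y, n_iter=25):
    """Gamma GLM with log link via IRLS; rows must start with an intercept 1.0."""
    p = len(rows[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(n_iter):
        eta = [sum(b * v for b, v in zip(beta, row)) for row in rows]
        mu = [math.exp(e) for e in eta]
        # With a log link the Gamma IRLS weights are constant, so each step
        # is ordinary least squares on the working response z.
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
        xtz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(p)]
        beta = solve_linear(xtx, xtz)
    return beta

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

# Simulated windows: the rate depends on gc only; mark is merely correlated with gc.
rng = random.Random(1)
n = 2000
gc = [rng.gauss(0, 1) for _ in range(n)]
mark = [0.8 * g + 0.6 * rng.gauss(0, 1) for g in gc]
shape = 2.0
rate = [rng.gammavariate(shape, math.exp(0.5 + 0.4 * g) / shape) for g in gc]

beta = fit_gamma_glm([[1.0, g, m] for g, m in zip(gc, mark)], rate)
marginal = pearson(mark, rate)

print("marginal correlation of mark with rate:", round(marginal, 2))
print("joint GLM coefficients (intercept, gc, mark):", [round(b, 2) for b in beta])
```

      In this simulation the marginal correlation of `mark` with the rate is clearly positive, while its coefficient in the joint model is close to zero. In practice a maintained GLM implementation would be preferable to the hand-written fit used here.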

      Overall:

      Previous work on recombination landscape evolution in birds (namely, the zebra finch and long-tailed finch; Singhal & Leffler 2015) has shown that many hotspots, i.e., small stretches of the genome that experience rates of crossing over that are much higher than the genome-wide average, are conserved over tens of millions of years of evolution. Work in tomato, maize, rye, and other flowering plants with histories of domestication has shown that hotspots can be dynamic. The results of Liu, Li, Ge, and colleagues complement those analyses and will, therefore, be of interest to those working on the evolution of recombination. Additionally, the finding that minor parent ancestry is negatively associated with recombination is an interesting exception to an otherwise general rule in evolutionary biology. Finally, it is quite exciting to see recombination maps inferred using RelERNN, and in a demography-aware fashion!

      That all said, it is difficult to have certainty in the results due to the relatively limited sample size for each of the populations, the lack of control for SNP density, the uncertainty in both recombination maps and demographic histories, and the lack of a joint modelling framework to carefully tease apart effects that are reported in isolation.

    2. Reviewer #2 (Public review):

      Summary:

      Liu et al. use whole genome sequencing data from several strains of chicken as well as a subspecies of the chicken wild ancestor to study the impact of domestication on the recombination landscape. They analyze these data using several machine-learning/AI based methods, using simulation to partially inform their analysis. The authors claim to find substantial deviations in the fine-scale recombination landscape between breeds, and surprising patterns between recombination and introgression/selection. However, there are substantial inconsistencies between the authors' findings and the current understanding in the field, supported by indirect evidence that is hard to interpret at best.

      Strengths:

      The data produced by the authors of this and a previous paper is well-suited to answer the questions that they pose. The authors use simulations to support some decisions made in analyzing this data, which partially alleviates some potential questions, and could be extended to address additional concerns. Should further analysis support the claims currently made regarding hotspot turnover and introgression frequency vs. recombination rate, these findings would indeed be striking observations at odds with current understanding in the field.

      Weaknesses:

      I have several major concerns regarding the ability of the analyses to support the claims in this paper, summarized below.

      Substantial deviations from field-standard benchmarks in the estimated recombination landscape appear to have been disregarded, particularly with regard to the WL breed.

      o For example, the number of detected hotspots per subspecies ranges from maybe 500 to over 100,000 based on Figure 2A. While the mean is indeed comparable to estimates from other species (lines 315-317), this characterization masks that each recombination map has far too few or too many hotspots to be biologically accurate (at least without substantial corroboration from more direct analyses). As such, statements about hotspot overlap between breeds and hotspot conservation cannot be taken at face value. Authors might consider using alternative methods to detect hotspots, assessing their power to detect hotspots in each breed, and evaluating hotspot overlap between breeds with respect to random expectation.

      o Furthermore, the authors consider the recombination landscape at promoters (Figure S10) and H3K4me3 sites (Figure 2C) and find that levels are slightly elevated, but the magnitude of the elevation (negligible to ~1.5x) is substantially lower than that of any other species studied to date without PRDM9. The magnitude of elevation for both comparisons is especially small for WL, which suggests that the recombination estimates for this breed are particularly noisy, and yet this breed is the focus of the introgression analysis.

      Introgression and strong selection can both be thought of as changing the local Ne along the genome. Estimating recombination from patterns of LD most directly estimates rho (the population recombination rate, 4*Ne*r), and disentangling local changes in Ne from local changes in r is non-trivial. Furthermore, selective sweeps, particularly easy-to-detect hard sweeps, are often characterized by having very little genetic variation. Estimating recombination rate from patterns of LD in regions with very little variation seems particularly challenging, and could bias results such as in Figure S15. The authors do not discuss the implications of these challenges for their analyses, which seems particularly relevant for their analyses of introgression and selection with recombination, as well as comparisons between WL (which the authors report to have undergone more selection and introgression) with other breeds. Authors should quantify their ability/power to detect recombination rates and hotspots under these conditions using simulation - some of these simulations are already mentioned in the paper, but are not analyzed in this way. Also useful would be quantifying the impact of simulated bottlenecks on estimates of recombination rate.
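
      The confounding between local Ne and r described above follows directly from rho = 4*Ne*r and can be made concrete with a short calculation. The numbers below (genome-wide Ne, per-bp rate, and a four-fold local reduction in Ne within a swept region) are hypothetical and serve only to show the arithmetic.

```python
def rho(ne, r):
    """Population-scaled recombination rate: rho = 4 * Ne * r."""
    return 4.0 * ne * r

def naive_r(rho_hat, assumed_ne):
    """Per-generation rate recovered from rho when one Ne is assumed genome-wide."""
    return rho_hat / (4.0 * assumed_ne)

genome_ne = 10_000   # hypothetical genome-wide effective size
true_r = 1e-8        # hypothetical per-bp rate, identical in both windows

rho_neutral = rho(genome_ne, true_r)
rho_swept = rho(genome_ne * 0.25, true_r)   # assumed four-fold local drop in Ne

r_neutral = naive_r(rho_neutral, genome_ne)
r_swept = naive_r(rho_swept, genome_ne)

print("inferred r, neutral window:", r_neutral)
print("inferred r, swept window:  ", r_swept)
```

      Although r is identical in the two windows, converting rho with a single genome-wide Ne makes the swept window look like a region of four-fold lower recombination, which is why simulations under sweeps and introgression are needed to interpret the reported associations.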

      In many analyses (e.g. hotspot and coldspot overlap, histone mark analysis), authors appear to use 1000 randomly selected regions of the same length as a control. If this characterization is accurate, authors should match the number of control regions to the number of features that they're comparing to. A more careful analysis might also select random regions from the same chromosome, match for GC content where appropriate, etc.
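
      A matched-control comparison of the kind suggested above can be sketched as a permutation test. The example is a minimal sketch on made-up intervals: each control set contains the same number of regions, with the same lengths, placed at random on the same chromosome. Matching for GC content is not implemented, and the function names, coordinates, and chromosome length are hypothetical.

```python
import random

def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b overlap."""
    return a[0] < b[1] and b[0] < a[1]

def count_overlapping(features, targets):
    """Number of features that overlap at least one target interval."""
    return sum(any(overlaps(f, t) for t in targets) for f in features)

def matched_random_set(features, chrom_len, rng):
    """One control set: same count and same lengths, random positions."""
    controls = []
    for start, end in features:
        length = end - start
        s = rng.randrange(0, chrom_len - length + 1)
        controls.append((s, s + length))
    return controls

def permutation_p(features, targets, chrom_len, n_perm=1000, seed=0):
    """Observed overlap count and one-sided empirical p-value."""
    rng = random.Random(seed)
    observed = count_overlapping(features, targets)
    hits = sum(
        count_overlapping(matched_random_set(features, chrom_len, rng), targets) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)

# Made-up example: 20 hotspots in breed B, 20 in breed A, 10 of them shared.
chrom_len = 1_000_000
breed_b = [(i * 50_000, i * 50_000 + 2_000) for i in range(20)]
breed_a = ([(i * 50_000 + 500, i * 50_000 + 2_500) for i in range(10)]
           + [(i * 50_000 + 20_000, i * 50_000 + 22_000) for i in range(10, 20)])

observed, p_value = permutation_p(breed_a, breed_b, chrom_len)
print("overlapping hotspots:", observed, "empirical p:", round(p_value, 4))
```

      Because the number of control regions equals the number of features in every permutation, the observed overlap count and the null distribution are directly comparable.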

      Authors provide very little detail about the number/locations of coldspots or selective sweeps: how many were detected in each subspecies? Does the fraction of hotspots and coldspots which overlap selective sweeps vary between species? It is unclear whether the numbers in the text (lines 356-364) represent a single breed or an analysis across breeds.

    1. Reviewer #1 (Public review):

      Summary:

      These authors have developed a method to induce MI or MII arrest. While this was previously possible in MI, the advantage of the method presented here is that it works for MII and is chemically inducible, because it is based on a system that is sensitive to the addition of ABA. Depending on when the ABA is added, they achieve a MI or MII delay. The ABA promotes dimerization of fragments of Mps1 and Spc105 that can't bind their chromosomal sites. The evidence that the MI arrest is weaker than the MII arrest is convincing and consistent with published data indicating that the SAC in MI is less robust than in MII or mitosis. The authors use this system to find evidence that the weak MI arrest is associated with PP1 binding to Spc105. This is a nice use of the system.

      The remainder of the paper uses the SynSAC system to isolate populations enriched for MI or MII stages and conduct proteomics. This shows a powerful use of the system, but more work is needed to validate these results, particularly in normal cells.

      Overall, the most significant aspect of this paper is the technical achievement, which is validated by the other experiments. They have developed a system and generated some proteomics data that may be useful to others when analyzing kinetochore composition at each division.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript submitted by Koch et al. describes a novel approach to collect budding yeast cells in metaphase I or metaphase II by synthetically activating the spindle checkpoint (SAC). The arrest is transient and reversible. This synchronization strategy will be extremely useful for studying meiosis I and meiosis II, and for comparing the two divisions. The authors characterized this so-called SynSAC approach and could confirm previous observations that the SAC arrest is less efficient in meiosis I than in meiosis II. They found that downregulation of the SAC response through PP1 phosphatase is stronger in meiosis I than in meiosis II. The authors then went on to purify kinetochore-associated proteins from metaphase I and II extracts for proteome and phosphoproteome analysis. Their data will be of significant interest to the cell cycle community (they also compared their datasets to kinetochores purified from cells arrested in prophase I and, with SynSAC, in mitosis).

      Significance:

      The technique described here will be of great interest to the cell cycle community. Furthermore, the authors provide data sets on purified kinetochores of different meiotic stages and compare them to mitosis. This paper will thus be highly cited, for the technique, and also for the application of the technique.

    3. Reviewer #3 (Public review):

      Summary:

      In their manuscript, Koch et al. describe a novel strategy to synchronize cells of the budding yeast Saccharomyces cerevisiae in metaphase I and metaphase II, thereby facilitating comparative analyses between these meiotic stages. This approach, termed SynSAC, adapts a method previously developed in fission yeast and human cells that enables the ectopic induction of a synthetic spindle assembly checkpoint (SAC) arrest by conditionally forcing the heterodimerization of two SAC components upon addition of the plant hormone abscisic acid (ABA). This is a valuable tool, which has the advantage that it induces SAC-dependent inhibition of the anaphase promoting complex without perturbing kinetochores. Furthermore, since the same strategy and yeast strain can also be used to induce a metaphase arrest during mitosis, the methodology developed by Koch et al. enables comparative analyses between mitotic and meiotic cell divisions. To validate their strategy, the authors purified kinetochores from meiotic metaphase I and metaphase II, as well as from mitotic metaphase, and compared their protein composition and phosphorylation profiles. The results are presented clearly and in an organized manner. Despite the relevance of both the methodology and the comparative analyses, several main issues should be addressed:

      (1) In contrast to the strong metaphase arrest induced by ABA addition in mitosis (Supp. Fig. 2), the SynSAC strategy only promotes a delay in metaphase I and metaphase II as cells progress through meiosis. This delay extends the duration of both meiotic stages, but does not markedly increase the percentage of metaphase I or II cells in the population at a given timepoint of the meiotic time course (Fig. 1C). Therefore, although SynSAC broadens the time window for sample collection, it does not substantially improve differential analyses between stages compared with a standard NDT80 prophase block synchronization experiment. Could a higher ABA concentration or repeated hormone addition improve the tightness of the meiotic metaphase arrest?

      (2) Unlike the standard SynSAC strategy, introducing mutations that prevent PP1 binding to the SynSAC construct considerably extended the duration of the meiotic metaphase arrests. In particular, mutating PP1 binding sites in both the RVxF (RASA) and the SILK (4A) motifs of the Spc105(1-455)-PYL construct caused a strong metaphase I arrest that persisted until the end of the meiotic time course (Fig. 3A). This stronger and more prolonged 4A-RASA SynSAC arrest would directly address the issue raised above. It is unclear why the authors did not place more emphasis on this improved system. Indeed, the 4A-RASA SynSAC approach could be presented as the optimal strategy to induce a conditional metaphase arrest in budding yeast meiosis, since it not only adapts but also improves the original methods designed for fission yeast and human cells. Along the same lines, it is surprising that the authors did not exploit the stronger arrest achieved with the 4A-RASA mutant to compare kinetochore composition at meiotic metaphase I and II.

      (3) The results shown in Supp. Fig. 4C are intriguing and merit further discussion. Mitotic growth in ABA suggests that the RASA mutation silences the SynSAC effect, yet this was not observed for the 4A or the double 4A-RASA mutants. Notably, in contrast to mitosis, the SynSAC 4A-RASA mutation leads to a more pronounced metaphase I meiotic delay (Fig. 3A). It is also noteworthy that the RVAF mutation partially restores mitotic growth in ABA. This observation supports, as previously demonstrated in human cells, that Aurora B-mediated phosphorylation of S77 within the RVSF motif is important to prevent PP1 binding to Spc105 in budding yeast as well.

      (4) To demonstrate the applicability of the SynSAC approach, the authors immunoprecipitated the kinetochore protein Dsn1 from cells arrested at different meiotic or mitotic stages, and compared kinetochore composition using data independent acquisition (DIA) mass spectrometry. Quantification and comparative analyses of total and kinetochore protein levels were conducted in parallel for cells expressing either FLAG-tagged or untagged Dsn1 (Supp. Fig. 7A-B). To better detect potential changes, protein abundances were next scaled to Dsn1 levels in each sample (Supp. Fig. 7C-D). However, it is not clear why the authors did not normalize protein abundance in the immunoprecipitations from tagged samples at each stage to the corresponding untagged control, instead of performing a separate analysis. This would be particularly relevant given the high sensitivity of DIA mass spectrometry, which enabled quantification of thousands of proteins. Furthermore, the authors compared protein abundances in tagged-samples from mitotic metaphase and meiotic prophase, metaphase I and metaphase II (Supp. Fig. 7E-F). If protein amounts in each case were not normalized to the untagged controls, as inferred from the text (lines 333 to 338), the observed differences could simply reflect global changes in protein expression at different stages rather than specific differences in protein association to kinetochores.

      (5) Despite the large amount of potentially valuable data generated, the manuscript focuses mainly on results that reinforce previously established observations (e.g., premature SAC silencing in meiosis I by PP1, changes in kinetochore composition, etc.). The discussion would benefit from a deeper analysis of novel findings that underscore the broader significance of this study.
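
      The normalization proposed in point (4) can be sketched as a per-protein log2 ratio of the tagged immunoprecipitation over the matching untagged control at each stage. The intensities and the names `protein_A` and `protein_B` below are invented for illustration, the pseudocount is an assumption, and this is not the authors' pipeline.

```python
import math

def log2_enrichment(tagged, untagged, pseudocount=1.0):
    """Per-protein log2(tagged / untagged); pseudocount guards against zeros."""
    return {
        protein: math.log2((tagged[protein] + pseudocount)
                           / (untagged.get(protein, 0.0) + pseudocount))
        for protein in tagged
    }

# Invented intensities. protein_A is a specific interactor; protein_B is a
# background binder whose overall expression doubles between the two stages.
stage1_tagged = {"protein_A": 8000.0, "protein_B": 400.0}
stage1_untagged = {"protein_A": 100.0, "protein_B": 200.0}
stage2_tagged = {"protein_A": 8000.0, "protein_B": 800.0}
stage2_untagged = {"protein_A": 100.0, "protein_B": 400.0}

stage1 = log2_enrichment(stage1_tagged, stage1_untagged)
stage2 = log2_enrichment(stage2_tagged, stage2_untagged)

for protein in stage1:
    raw_change = math.log2(stage2_tagged[protein] / stage1_tagged[protein])
    print(protein,
          "raw log2 change:", round(raw_change, 2),
          "enrichment stage 1:", round(stage1[protein], 2),
          "enrichment stage 2:", round(stage2[protein], 2))
```

      In this made-up case the raw tagged signal for `protein_B` doubles between stages, yet its enrichment over the untagged control is essentially unchanged, which is the distinction between global expression changes and specific kinetochore association raised in point (4).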

      Significance:

      Koch et al. describe a novel methodology, SynSAC, to synchronize budding yeast cells in metaphase I or metaphase II during meiosis, as well as in mitotic metaphase, thereby enabling differential analyses among these cell division stages. Their approach builds on prior strategies originally developed in fission yeast and human cell models to induce a synthetic spindle assembly checkpoint (SAC) arrest by conditionally forcing the heterodimerization of two SAC proteins upon addition of abscisic acid (ABA). The results from this manuscript are of special relevance for researchers studying meiosis and using Saccharomyces cerevisiae as a model. Moreover, the differential analysis of the composition and phosphorylation of kinetochores from meiotic metaphase I and metaphase II adds interest for the broader meiosis research community. Finally, regarding my expertise, I am a researcher specialized in the regulation of cell division.

    1. Reviewer #1 (Public review):

      Summary:

      The authors performed an elegant investigation to clarify the roles of CHD4 in chromatin accessibility and transcription regulation. In addition to the common mechanisms of action through nucleosome repositioning and opening of transcriptionally active regions, the authors considered here a new angle of CHD4 action through modulating the off rate of transcription factor binding. Their suggested scenario is that the action of CHD4 is context-dependent and is different for highly-active regions vs low-accessibility regions.

      Strengths:

      This is a very well-written paper that will be of interest to researchers working in this field. The authors performed a large body of work with different types of NGS experiments and the corresponding computational analyses. The combination of biophysical measurements of the off-rate of protein-DNA binding with NGS experiments is particularly commendable.

      Comments on revised version:

      The authors have addressed all my points.

    2. Reviewer #2 (Public review):

      This study leverages acute protein degradation of CHD4 to define its role in chromatin and gene regulation. Previous studies have relied on KO and/or RNA interference of this essential protein and as such are hampered by adaptation, cell population heterogeneity, cell proliferation and indirect effects. The authors have established an AID2-based method to rapidly deplete the dMi-2 remodeller to circumvent these problems. CHD4 is gone within an hour, well before any effects on cell cycle or cell viability can manifest. This represents an important technical advance that, for the first time, allows a comprehensive analysis of the immediate and direct effect of CHD4 loss of function on chromatin structure and gene regulation.

      Rapid CHD4 degradation is combined with ATAC-seq, CUT&RUN, (nascent) RNA-seq and single molecule microscopy to comprehensively characterise the impact on chromatin accessibility, histone modification, transcription and transcription factor (NANOG, SOX2, KLF4) binding in mouse ES cells.

      The data support the previously developed model that high levels of CHD4/NuRD maintain a degree of nucleosome density to limit TF binding at open regulatory regions (e.g. enhancers). The authors propose that CHD4 activity at these sites is an important prerequisite for enhancers to respond to novel signals that require an expanded or new set of TFs to bind.

      What I find even more exciting and entirely novel is the finding that CHD4 removes TFs from regions of limited accessibility to repress cryptic enhancers and to suppress spurious transcription. These regions are characterised by low CHD4 binding and have so far never been thoroughly analysed. The authors correctly point out that the general assumption that chromatin regulators act on regions where they seem to be concentrated (i.e. have high ChIP-seq signals) runs the risk of overlooking important functions elsewhere. This insight is highly relevant beyond the CHD4 field and will prompt other chromatin researchers to look into low level binding sites of chromatin regulators.

      The biochemical and genomic data presented in this study are of high quality (I cannot judge the single-molecule microscopy experiments due to my lack of expertise). This is an important and timely study that is of great interest to the chromatin field.

      Comments on revised version:

      All my comments below have been addressed in the revised version of the manuscript.

      The revised manuscript provides a significant advance of our understanding of how the nucleosome remodeler CHD4 exerts its function. In particular, the findings suggest an intriguing role of CHD4 in TF removal at genomic regions where only low levels of CHD4 can be detected. In the future, it will be interesting to see if this activity is shared by other ATP-dependent nucleosome remodelers.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, an inducible degron approach is taken to investigate the function of the CHD4 chromatin remodelling complex. The cell lines and approaches used are well thought out and the data appear to be of high quality. They show that loss of CHD4 results in rapid changes to chromatin accessibility at thousands of sites. At the majority of locations where changes are detected, chromatin accessibility is decreased, and these sites are strongly bound by CHD4 prior to activation of the degron and so likely represent primary sites of action. Somewhat surprisingly, while chromatin accessibility is reduced at these sites, transcription factor occupancy is little changed. Following CHD4 degradation, occupancy of the key pluripotency transcription factors NANOG and SOX2 increases at many locations genome-wide, and at many of these sites chromatin accessibility increases. These represent important new insights into the function of CHD4 complexes.

      Strengths:

      The experimental approach is well suited to providing insight into a complex regulator such as CHD4. The data generated to characterise how cells respond to loss of CHD4 is of high quality. The study reveals major changes in transcription factor occupancy following CHD4 depletion.

      Weaknesses:

      The main weakness is that the authors favour the interpretation that all rapid changes following CHD4 degradation occur as a direct effect of the loss of CHD4 activity. The possibility that rapid indirect effects arise does not appear to have been given sufficient consideration. This is especially pertinent where effects are reported at sites where CHD4 occupancy is initially very low (e.g., sites where accessibility is gained, as opposed to sites where chromatin accessibility is lost). The revised discussion acknowledges that rapid indirect effects cannot be excluded.

    1. Reviewer #1 (Public review):

      Summary:

      The article presents the details of the high-resolution light-sheet microscopy system developed by the group. In addition to presenting the technical details of the system, its resolution has been characterized and its functionality demonstrated by visualizing subcellular structures in a biological sample.

      Strengths:

      The article includes extensive supplementary material that complements the information in the main article.

      Live imaging has been incorporated, as requested, increasing the value of the paper.

      Weaknesses:

      None

    2. Reviewer #2 (Public review):

      Summary:

      The authors present Altair-LSFM (Light Sheet Fluorescence Microscope), a high-resolution, open-source light-sheet microscope that may be relatively easy to align and construct due to a custom-designed mounting plate. The authors developed this microscope to fill a perceived need: current open-source systems are either primarily designed for large specimens and lack sub-cellular resolution, or achieve high resolution but are difficult to construct and are unstable. While commercial alternatives exist that offer sub-cellular resolution, they are expensive. The authors' manuscript centers around comparisons to the highly successful lattice light-sheet microscope, including the choice of detection and excitation objectives. The authors thus claim that there remains a critical need for a high-resolution, economical, and easy-to-implement LSFM system and address this need with Altair.

      Strengths:

      The authors succeed in their goals of implementing a relatively low cost (~ USD 150K) open-source microscope that is easy to align. The ease of alignment rests on using custom-designed baseplates with dowel pins for precise positioning of optics based on computer analysis of opto-mechanical tolerances as well as the optical path design. They simplify the excitation optics over Lattice light-sheet microscopes by using a Gaussian beam for illumination while maintaining lateral and axial resolutions of 235 and 350 nm across a 260-µm field of view after deconvolution. In doing so they rest on foundational principles of optical microscopy: what matters for lateral resolution is the numerical aperture of the detection objective and proper sampling of the image field onto the detector, and the axial resolution depends on the thickness of the light-sheet when it is thinner than the depth of field of the detection objective. This concept has unfortunately not been completely clear to users of high-resolution light-sheet microscopes and is thus a valuable demonstration. The microscope is controlled by open-source software, Navigate, developed by the authors, and it is thus foreseeable that different versions of this system could be implemented depending on experimental needs while maintaining easy alignment and low cost. They demonstrate system performance successfully by characterizing their sheet, point-spread function, and visualization of sub-cellular structures in mammalian cells including microtubules, actin filaments, nuclei, and the Golgi apparatus.

      Weaknesses:

      There is still a fixation on comparison to the first-generation lattice light-sheet microscope, which has evolved significantly since then:

      (1) One of the major limitations of the first generation LLSM was the use of a 5 mm coverslip, which was a hindrance for many users. However, the Zeiss system elegantly solves this problem and so does Oblique Plane Microscopy (OPM), while the Altair-LSFM retains this feature, which may dissuade widespread adoption. This limitation and how it may be overcome in future iterations is now discussed in the manuscript but remains a limitation in the currently implemented design.

      (2) Further, on the point of sample flexibility, all generations of the LLSM, and by the nature of its design the OPM, can accommodate live-cell imaging with temperature, gas, and humidity control. In the revised manuscript the authors now implement temperature control, but ideal live cell imaging conditions that would include gas and humidity control are not implemented. While, as the authors note, other microscopes that lack full environmental control have achieved widespread adoption, in my view this still limits the use cases of this microscope. There is no discussion on how this limitation of environmental control may be overcome in future iterations.

      (3) While the microscope is well designed and completely open source, it will require experience with optics, electronics, and microscopy to implement and align properly. Experience with custom machining or soliciting a machine shop is also necessary. Thus, in my opinion it is unlikely to be implemented by a lab that has zero prior experience with custom optics and cannot hire someone who does. Altair-LSFM may not be as easily adaptable or implementable as the authors describe or perceive in any lab that is interested, even if they can afford it. Claims on how easy it may be to align the system for a "Novice" in supplementary table 5 appear to be unsubstantiated and should be removed unless a Novice was indeed able to assemble and validate the system in 2 weeks. It seems that these numbers were just arbitrarily proposed in the current version without any testing. In our experience it's hard to predict how long an alignment will take for a novice.

      (4) There is no quantification on field uniformity and the tunability of the light sheet parameters (FOV, thickness, PSF, uniformity). There is no quantification on how much improvement is offered by the resonant and how its operation may alter the light-sheet power, uniformity and the measured PSF.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript introduces a high-resolution, open-source light-sheet fluorescence microscope optimized for sub-cellular imaging.

      The system is designed for ease of assembly and use, incorporating a custom-machined baseplate and in silico optimized optical paths to ensure robust alignment and performance.

      The important feature of the microscope is the clever and elegant adaptation of simple Gaussian beams, smart beam shaping, galvo pivoting and high-NA objectives to ensure a uniform thin light-sheet of around 400 nm in thickness, over a 266-micron-wide field of view, pushing the axial resolution of the system beyond the regular diffraction-limited tradeoffs of light-sheet fluorescence microscopy.

      Compelling validation using fluorescent beads, multicolor cellular imaging, and dual-color live-cell imaging highlights the system's performance. Moreover, a very extensive and comprehensive operating manual is provided in the form of supplementary materials. This provides a DIY blueprint for researchers who want to implement such a system, along with estimated costs and a detailed description of the expertise needed.

      Strengths:

      - Strong and accessible technical innovation.

      With an elegant combination of beam shaping and optical modelling, the authors provide a high resolution light-sheet system that overcomes the classical light-sheet tradeoff limit of thin light-sheet and small field of view. In addition, the integration of in silico modelling with a custom-machined baseplate is very practical and allows for ease of alignment procedures. Combining these features with the solid and super-extensive guide provided in the supplementary information, this provides a protocol for replicating the microscope in any other lab.

      - Impeccable optical performance and ease of sample mounting.

      The system takes advantage of the same sample-holding method seen already in other implementations, but reduces the optical complexity. At the same time, the authors claim to achieve similar lateral and axial resolution to lattice light-sheet microscopy (although without a direct comparison; see below in the "weaknesses" section). The optical characterization of the system is comprehensive and well-detailed. Additionally, the authors validate the system by imaging sub-cellular structures in mammalian cells.

      - Transparency and comprehensiveness of documentation and resources.

      A very detailed protocol documents the setup, the optical modeling, and the total cost.

      Conclusion:

      Altair-LSFM represents a well-engineered and accessible light-sheet system that addresses a longstanding need for high-resolution, reproducible, and affordable sub-cellular light-sheet imaging. At this stage, I believe the manuscript makes a compelling case for Altair-LSFM as a valuable contribution to the open microscopy scientific community.

      Comments on revisions:

      I appreciate the detail and care shown by the authors in answering all my concerns, both the bigger ones (lack of live-cell imaging demonstration) and the smaller ones (about data storage, costs, expertise needed, and so on). The manuscript has been greatly improved, and I have no other comments to make.

    1. Reviewer #1 (Public review):

      Summary:

      ZMAT3 is a p53 target gene that the Lal group and others have shown is important for p53-mediated tumor suppression, and which plays a role in the control of RNA splicing. In this manuscript Lal and colleagues perform quantitative proteomics of cells with ZMAT3 knockout and show that the enzyme hexokinase HKDC1 is the most upregulated protein. Mechanistically, the authors show that ZMAT3 does not appear to directly regulate the expression of HKDC1; rather, they show that the transcription factor c-JUN was strongly enriched in ZMAT3 pull-downs in IP-mass spec experiments, and they perform IP-western to demonstrate an interaction between c-JUN and ZMAT3. Importantly, the authors demonstrate, using ChIP-qPCR, that JUN is present at the HKDC1 gene (intron 1) in ZMAT3 WT cells, and show markedly enhanced binding in ZMAT3 KO cells. The data best fit a model whereby p53 transactivates ZMAT3, leading to decreased JUN binding to the HKDC1 promoter, and altered mitochondrial respiration. The data are novel, compelling and very interesting.

      Comments on revisions:

      The authors have done a thorough job addressing my comments. This manuscript is quite strong and will be highly cited for its novelty and rigor.

    2. Reviewer #2 (Public review):

      Summary:

      The study elucidates the role of the recently discovered mediator of p53 tumor suppressive activity, ZMAT3. Specifically, the authors find that ZMAT3 negatively regulates HKDC1, a gene involved in the control of mitochondrial respiration and cell proliferation.

      Comments on revisions:

      The authors have mostly addressed the concerns raised previously by this reviewer. The lack of functional assays made the reported findings mostly mechanistic with no clear biological context.

      The present manuscript is certainly improved compared to the previous version.

    1. Reviewer #2 (Public review):

      In the original review of this manuscript, I noted that this study provides the first evidence that alteration of the Hox code in neck lateral plate mesoderm is sufficient for ectopic forelimb budding. Their finding that ectopic expression of Hoxa6 or Hoxa7 induces wing budding at neck level, a demonstration of sufficiency, is of major significance. The experiments used to test the necessity of specific Hox genes for limb budding involved overexpression of dominant negative constructs, and there were questions about whether the controls were well designed. The reviewers made several suggestions for additional experiments that would address their concerns. In their responses to those comments, the authors indicated that they would conduct those experiments, and they acknowledged the requests for further discussion of a few points.

      In the revised version of the manuscript, the authors have provided additional RNA-seq data in Table 3, which lists 221 genes that are shared between the Hoxa6-induced limb bud and normal wing bud but not the neck. This shows that the ectopic limb bud has a limb-like character. The authors also expanded the discussion of their results in the context of previous work on the mouse. These changes have improved the paper.

      The authors elected not to conduct the co-transfection experiments that were suggested to test the ability of Hoxa4/a5 to block the limb-inducing ability of Hoxa6/a7. They also chose not to conduct the additional control experiments that were suggested for the dominant negative studies. The authors' justification for not conducting these experiments is provided in the responses to reviewers.

      The paper is improved over the previous version, but the conclusions, particularly regarding the dominant negative experiments, would have been strengthened by the additional experiments that were recommended by the reviewers. Under the current publishing model for eLife, it is the authors' prerogative to decide whether to revise in accordance with the reviewers' suggestions. Therefore, it seems to me that this version of the manuscript is the definitive version that the authors want to publish, and that eLife should publish it together with the reviewers' comments and the authors' responses.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript submission by Zhao et al. entitled "Cardiac neurons expressing a glucagon-like receptor mediate cardiac arrhythmia induced by high-fat diet in Drosophila", the authors assert that cardiac arrhythmias in Drosophila on a high-fat diet are due in part to activation of adipokinetic hormone (Akh) signaling. High-fat diet induces Akh secretion from activated endocrine neurons, which activates AkhR in posterior cardiac neurons. Silencing or deletion of Akh or AkhR blocks arrhythmia in Drosophila on a high-fat diet. Elimination of one of two AkhR-expressing cardiac neurons results in arrhythmia similar to that seen on a high-fat diet.

      Strengths:

      The authors propose a novel mechanism for high fat diet induced arrhythmia utilizing the Akh signaling pathway that signals to cardiac neurons.

    2. Reviewer #3 (Public review):

      Zhao et al. provide new insights into the mechanism by which a high-fat diet (HFD) induces cardiac arrhythmia employing Drosophila as a model. HFD induces cardiac arrhythmia in both mammals and Drosophila. Both glucagon and its functional equivalent in Drosophila Akh are known to induce arrhythmia. The study demonstrates that Akh mRNA levels are increased by HFD and both Akh and its receptor are necessary for high-fat diet-induced cardiac arrhythmia, elucidating a novel link. Notably, Zhao et al. identify a pair of AKH receptor-expressing neurons located at the posterior of the heart tube. Interestingly, these neurons innervate the heart muscle and form synaptic connections, implying their roles in controlling the heart muscle. The study presented by Zhao et al. is intriguing, and the rigorous characterization of the AKH receptor-expressing neurons would significantly enhance our understanding of the molecular mechanism underlying HFD-induced cardiac arrhythmia.

      Many experiments presented in the manuscript are appropriate for supporting the conclusions while additional controls and precise quantifications should help strengthen the authors' arguments. The key results obtained by loss of Akh (or AkhR) and genetic elimination of the identified AkhR-expressing cardiac neurons do not reconcile, complicating the overall interpretation.

      The most exciting result is the identification of AkhR-expressing neurons located at the posterior part of the heart tube (ACNs). The authors attempted to determine the function of ACNs by expressing rpr with AkhR-GAL4, which would induce cell death in all AkhR-expressing cells, including ACNs. The experiments presented in Figure 6 are not straightforward to interpret. Moreover, the conclusion contradicts the main hypothesis that elevated Akh is the basis of HFD-induced arrhythmia. The results suggest the importance of AkhR-expressing cells for normal heartbeat. However, elimination of Akh or AkhR restores normal rhythm in HFD-fed animals, suggesting that Akh and AkhR are not important for maintaining normal rhythms. If Akh signaling in ACNs is key for HFD-induced arrhythmia, genetic elimination of ACNs should leave normal rhythm unaltered and rescue the HFD-induced arrhythmia. An important caveat is that the experiments do not test the specific role of ACNs. ACNs should be just a small part of the cells expressing AkhR. Specific manipulation of ACNs would significantly improve the study. Moreover, the main hypothesis suggests that HFD may alter the activity of ACNs in a manner dependent on Akh and AkhR. Testing how HFD changes calcium, possibly by CaLexA (Figure 2) and/or GCaMP, in wild-type and AkhR mutant animals could be a way to connect ACNs to HFD-induced arrhythmia. In addition, optogenetic approaches may allow for specific manipulation of ACNs.

      Interestingly, expressing rpr with AkhR-GAL4 was insufficient to eliminate both ACNs. It is not clear why it didn't eliminate both ACNs. Given the incomplete penetrance, appropriate quantifications would be helpful. Additionally, the impact on other AkhR-expressing cells should be assessed. Adding more copies of UAS-rpr, AkhR-GAL4, or both may eliminate all ACNs and other AkhR-expressing cells. The authors could also try UAS-hid instead of UAS-rpr.

    1. Reviewer #1 (Public review):

      Summary:

      This work revisits a substantial part of the published literature in the field of Drosophila innate immunity from 1959 to 2011. The strategy has been to restrict the analysis to some 400 articles and then to extract a main claim, two to four major claims and up to four minor claims from each, totaling some 2000 claims overall. The consistency of these claims with the current state-of-the-art has been evaluated and reported on a dedicated website known as ReproSci and also in the text as well as in the 28 Supplements that report experimental verification, direct or indirect, e.g., using novel null mutants unavailable at the time, of a selected set of claims made in several articles. Of note, this review is mostly limited to the manuscript and its associated supplements and does not fully cover the ReproSci website.

      Strengths:

      One major strength of this article is that it tackles the issue of reproducibility/consistency on a large scale. Indeed, while many investigators have some serious doubts about some results found in the literature, few have the courage, or the means and time, to seriously challenge studies, especially if published by leaders in the field. The Discussion adequately states the major limitations of the ReproSci approach, which should be kept in mind by the reader to form their own opinion.

      This study also allows investigators not familiar with the field to have a clearer understanding of the questions at stake and to derive a more coherent global picture that allows them to better frame their own scientific questions. Besides a thorough and up-to-date knowledge of the literature used to assess the consistency of the claims with our current knowledge, a merit of this study is the undertaking of independent experiments to address some puzzling findings and the evidence presented is often convincing, albeit one should keep in mind the inherent limitations as several parameters are difficult to control, especially in the field of infections, as underlined by the authors themselves. Importantly, some work of the lead author has also been re-evaluated (Supplements S2-S4). Thus, while utmost caution should be exerted, and often is, in challenging claims, even if the challenge eventually proves to be not grounded, it is valuable to point out potential controversial issues to the scientific community.

      While this is not a point of this review, it should be acknowledged that the possibility to post comments on the ReproSci website will allow further readjustments by the community in the appreciation of the literature and also of the ReproSci assessments themselves and of its complementary additional experiments.

      Weaknesses:

      Challenging the results from articles is, by its very nature, a highly sensitive issue, and utmost care should be taken when challenging claims. While the authors generally acknowledge the limitations of their approach in the main text and Supplements, there are a few instances where their challenges remain questionable and should be reassessed. This is certainly the case for Supplement S18, for which the ReproSci authors make a claim for a point that was not made in the publication under scrutiny. The authors of that study (Ramet et al., Immunity, 2001) never claimed that scavenger receptor SR-CI is a phagocytosis receptor, but that it is required for optimal binding of S2 cells to bacteria. Westlake et al. here have tested for a role of this scavenger receptor in phagocytosis, which had not been tested by Ramet et al. Thus, even though the ReproSci study brings additional knowledge to our understanding of the function of SR-CI by directly testing its involvement in phagocytosis by larval hemocytes, it did not address the major point of the Ramet et al. study, SR-CI binding to bacteria, and thus inappropriately concludes in Supplement S18 that "Contrary to (Ramet et al., 2001, Saleh et al., 2006), we find that SR-CI is unlikely to be a major Drosophila phagocytic receptor for bacteria in vivo." It follows that the results of Ramet et al. cannot be challenged by ReproSci as it did not address this program. Of note, Saleh et al. (2006) also mistakenly stated that SR-CI impaired phagocytosis in S2 cells and could be used as a positive control to monitor phagocytosis in S2 cells. Their assay appears to have actually not monitored phagocytosis but the association of FITC-labeled bacteria to S2 cells by FACS, as they did not mention quenching the fluorescence of bacteria associated with the surface with Trypan blue.

      The inference method to assess the consistency of results with current knowledge also has limitations that should be better acknowledged. At times, the argument is made that the gene under scrutiny may not be expressed at the right time according to large-scale data or that the gene product was not detected in the hemolymph by a mass-spectrometry approach. While being in theory strong arguments, some genes, for instance, those encoding proteases at the apex of proteolytic activation cascades, need not necessarily be strongly expressed and might be released by a few cells. In addition, we are often lacking relevant information on the expression of genes of interest upon specific immune challenges such as infections with such and such pathogens.

      As regards mass spectrometry, there is always the issue of sensitivity that limits the force of the argument. Our understanding of melanization remains currently limited, and methods are lacking to accurately measure the killing activity associated with the triggering of the proPO activation cascade. In this study, the authors monitor only the blackening reaction of the wound site based on a semi-quantitative measurement. They are not attempting to use other assays, such as monitoring the cleavage of proPOs into active POs or measuring PO enzymatic activity. These techniques are sometimes difficult to implement, and they suffer at times from variability. Thus, caution should be exerted when drawing conclusions from just monitoring the melanization of wounds.

      Likewise, the study of phagocytosis is limited by several factors. As most studies in the field focus on adults, the potential role of phagocytosis in controlling Gram-negative bacterial infections is often masked by the efficiency of the strong IMD-mediated systemic immune response mediated by AMPs (Hanson et al, eLife, 2019). This problem can be bypassed in rare instances of intestinal infections by Gram-negative bacteria such as Serratia marcescens (Nehme et al., PLoS Pathogens, 2007) or Pseudomonas aeruginosa (Limmer et al. PNAS, 2011), which escape from the digestive tract into the hemocoel without triggering, at least initially, the systemic immune response. It is technically feasible to monitor bacterial uptake in adults by injecting fluorescently labeled bacteria and subsequently quenching the signal from non-ingested bacteria. Nonetheless, many investigators prefer to resort to ex vivo assays starting from hemocytes collected from third-instar wandering larvae, as they are easier to collect and then to analyze, e.g., by FACS. However, it should be pointed out that these hemocytes have been strongly exposed to a peak of ecdysone, which may alter their properties. As for S2 cells, it is thus not clear whether third-instar larval hemocytes faithfully reproduce the situation in adults. The phagocytic assays are often performed with killed bacteria. Evidence with live microorganisms is better, especially with pathogens. Assays with live bacteria, however, require an antibody used in a differential permeabilization protocol. Furthermore, the killing method alters the surface of the microorganisms, a key property for phagocytic uptake. Bacterial surface changes are minimal when microorganisms are killed by X-ray or UV light. These limitations should be kept in mind when proceeding to inference analysis of the consistency of claims. Eater illustrates this point well. Westlake et al. state that: "[...] subsequent studies showed that a null mutation of eater does not impact phagocytosis". The authors refer here to Bretscher et al., Biology Open, 2015, in which binding to heat-killed E. coli was assessed in an ex vivo assay in third instar larvae. In contrast, Chung and Kocks (JBC, 2011) tested whether the recombinant extracellular N-terminal ligand-binding domain was able to bind to bacteria. They found that this domain binds to live Gram-positive bacteria but not to live Gram-negative bacteria. For the latter, killing bacteria with ethanol or heating, but not by formaldehyde treatment, allowed binding. More importantly, Chung and Kocks documented a complex picture in which AMPs may be needed to permeabilize the Gram-negative bacterial cell wall, which would then allow access of at least the recombinant secreted Eater extracellular domain to peptidoglycan or peptidoglycan-associated molecules. Thus, the systemic Imd-dependent immune response would be required in vivo to allow Eater-dependent uptake of Gram-negative bacteria by adult hemocytes. In ex vivo assays, any AMPs may be diluted too much to effectively attack the bacterial membrane. A prediction is then that there should be an altered phagocytosis of Gram-negative bacteria in IMD-pathway mutants, e.g., an imd null mutant but not the hypomorphic imd[1] allele. This could easily be tested by ReproSci using the adult phagocytosis assay used by Kocks et al, Cell, 2005. At the very least, the part on the role of Eater in phagocytosis should take the Chung & Kocks study into account, and the conclusions should be modulated accordingly.

      Another point is that some mutant phenotypes may be highly sensitive to the genetic background, for instance, even after isogenization in two different backgrounds. In the framework of a Reproducibility project, there might be no other option for such cases than direct reproduction of the experiment as relying solely on inference may not be reliable enough.

      With respect to the experimental part, some minor weaknesses have been noted. The authors rely on survival to infection experiments, but often do not show any control experiments with mock-challenged or noninfected mutant fly lines. In some cases, monitoring the microbial burden would have strengthened the evidence. For long survival experiments, a check on the health status of the lines (viral microbiota, Wolbachia) would have been welcome. Also, the experimental validation of reagents, RNAi lines, or KO lines is not documented in all cases.

    2. Reviewer #2 (Public review):

      Summary:

      The authors present an ambitious and large-scale reproducibility analysis of 400 articles on Drosophila immunity published before 2011. They extract major and minor claims from each article, assess their verifiability through literature comparison and, when possible, through targeted experimental re-testing, and synthesize their findings in an openly accessible online database. The goal is to provide clarity to the community regarding claims that have been contradicted, incompletely supported, or insufficiently followed up in the literature, and to foster broader community participation in evaluating historical findings. The manuscript summarizes the major insights emerging from this systematic effort.

      Strengths:

      (1) Novelty and community value: This work represents a rare example of a systematic, transparent, and community-facing reproducibility project in a specific research domain. The creation of a dedicated public platform for disseminating and discussing these assessments is particularly innovative.

      (2) Breadth and depth: The authors analyze an impressive number of publications spanning multiple decades, and they couple literature-based assessments with new experimental data where follow-up is missing.

      (3) Clarity of purpose: The manuscript carefully distinguishes between assessing evidential support for claims and judging the scientific merit of historical work. This helps frame the project as constructive rather than punitive.

      (4) Metascientific relevance: The analysis identifies methodological and contextual factors that commonly underlie irreproducible claims, providing a useful guide for future study design and interpretation.

      (5) Transparency: Supplementary datasets and the public website provide an exceptional degree of openness, which should facilitate community engagement and further refinement.

      Weaknesses:

      (1) Subjectivity in selection: Despite the authors' efforts, the choice of which papers and claims to highlight cannot be entirely objective. This is an inherent limitation of any retrospective curation effort, but it remains important to acknowledge explicitly.

      (2) Emphasis on irreproducible claims: The manuscript focuses primarily on claims that are challenged or found to be weakly supported. While understandable from the perspective of novelty, this emphasis may risk overshadowing the value of claims that are well supported and reproducible.

      (3) Framing and language: Certain passages could benefit from more neutral phrasing and avoidance of binary terms such as "correct" or "incorrect," in keeping with the open-ended and iterative nature of scientific progress.

      (4) Community interaction with the dataset: While the website is an excellent resource, the manuscript could further clarify how the community is expected to contribute, challenge, or refine the annotations, especially given the large volume of supplementary data.

      (5) Minor inconsistency: The manuscript states that papers from 1959-2011 were included, but the Methods section mentions a range beginning in 1940. This should be aligned for clarity.

      Impact and significance:

      This contribution is likely to have a meaningful impact on both the Drosophila immunity community and the broader scientific ecosystem. It highlights methodological pitfalls, encourages transparent post-publication evaluation, and offers a reusable framework that other fields could adopt. The work also has pedagogical value for early-career researchers entering the field, who often struggle to navigate contradictory or outdated claims. By centralizing and contextualizing these discussions, the manuscript should help accelerate more robust and reproducible research.

    3. Reviewer #3 (Public review):

      Summary:

      In this ambitious study, the authors set out to analyse the validity of a number of claims, both minor and major, from 400 published articles within the field of Drosophila immunity that were published before 2011. The authors were able to determine initially if claims were supported by comparing them to other published literature in the field and, if required, by experimentally testing 'unchallenged' claims that had not been followed up in subsequent published literature. Using this approach, the authors identified a number of claims that had contradictory evidence using new methods or taking into account developments within the field post-initial publication. They put their findings on a publicly available website designed to enable the research community to assess published work within the field with greater clarity.

      Strengths:

      The work presented is rigorous and methodical, the data presentation is high quality, and importantly, the data presented support the conclusions. The discussion is balanced, and the study is written considerately and respectfully, highlighting that the aim of the study is not to assign merit to individual scientists or publications but rather to improve clarity for scientists across the field. The approach carried out by the researchers focuses on testing the validity of the claims made in the original papers rather than testing whether the original experimental methods produced reproducible results. This is an important point since there are many reasons why the original interpretation of data may have understandably led to the claims made. These potential explanations for irreproducible data or conclusions are discussed in detail by the authors for each claim investigated.

      The authors have generated an accompanying website, which provides a valuable tool for the Drosophila Immunity research community that can be used to fact-check key claims and encourages community engagement. This will achieve one important goal of this study - to prevent time loss for scientists who base their research on claims that are irreproducible. The authors rightly point out that it is impossible (and indeed undesirable) to avoid publication of irreproducible results within a field since science is 'an exploratory process where progress is made by constant course correction'. This study is, however, an important piece of work that will make that course correction more efficient.

      Weaknesses:

      I have little to recommend for the improvement of this manuscript. As outlined in my comments above, I am very supportive of this manuscript and think it is a bold and ambitious body of work that is important for the Drosophila immunity field and beyond.

    4. Reviewer #4 (Public review):

      This is an important paper that can do much to set an example for thoughtful and rigorous evaluation of a discipline-wide body of literature. The compiled website of publications in Drosophila immunity is by itself a valuable contribution to the field. There is much to praise in this work, especially including the extensive and careful evaluation of the published literature. However, there are also cautions.

      One notable concern is that the validation experiments are generally done at low sample sizes and low replication rates, and often lack statistical analysis. This is slippery ground for declaring a published study to be untrue. Since the conclusions reported here are nearly all negative, it is essential that the experiments be performed with adequate power to detect the originally described effects. At a minimum, they should be performed with the same sample size and replication structure as the originally reported studies.
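
      To make the power argument concrete, here is a minimal sketch of a sample-size estimate for a two-group comparison of means, using the standard normal approximation. It is an illustration only: the function name and the example effect sizes are assumptions, and survival or melanization assays such as those in the manuscript would need an analysis matched to their own design.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means, given a standardized effect size (Cohen's d).

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes: a smaller originally reported effect
# requires a larger validation experiment to avoid a false negative.
for d in (1.0, 0.5):
    print(f"d = {d}: about {n_per_group(d)} animals per group")
```

      The required sample size grows with the inverse square of the effect size, which is why a negative result from a small validation experiment is weak evidence against a modest originally reported effect.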

      The first section of Results should be an overview of the general accuracy of the literature. Of all claims made in the 400 evaluated papers, what proportion fell into each category of "verified", "unchallenged", "challenged", "mixed", or "partially verified"? This summary overview would provide a valuable assessment of the field as a whole. A detailed dispute of individual highlighted claims could follow the summary overview.

      Section headings are phrased as declarative statements, "Gene X is not involved in process Y", which is more definitive phrasing than we typically use in scientific research. It implies proving a negative, which is difficult and rare, and the evidence provided in the present manuscript generally does not reach that threshold. A more common phrasing would be "We find no evidence that gene X contributes to process Y". A good model for this more qualified phrasing is the "We conclude that while Caspar might affect the Imd pathway in certain tissue-specific contexts, it is unlikely to act as a generic negative regulator of the Imd pathway," concluding the section on the role of Caspar. I am sure the authors feel that the softer, more qualified phrasing would undermine their article's goal of cleansing the literature of inaccuracies, but the hard declarative 'never' statements are difficult to justify unless every validation experiment is done with a high degree of rigor under a variety of experimental conditions. This caveat is acknowledged in the 3rd paragraph of the Discussion, but it is not reflected in the writing of the Results. The caveat should also appear in the Introduction.

      The article is clear that "Claims were assessed as verified, unchallenged, challenged, mixed, or partially verified," but the project is called "reproducibility project" in the 7th line of the abstract, and the website is "ReproSci". The fourth line of the abstract and the introduction call some published research "irreproducible". Most of the present manuscript does not describe reproduction or replication. It describes validation, or independent experimental tests for consistency. Published work is considered validated if subsequent studies using distinct approaches yielded consistent results. For work that the authors consider suspicious, or that has not been subsequently tested, the new experiments provided here do not necessarily recreate the published experiment. Instead, the published result is evaluated with experiments that use different tools or methods, again testing for consistency of results. This is an important form of validation, but it is not reproduction, and it should not be referred to as such. I strongly suggest that variations of the words "reproducible" or "replication" be removed from the manuscript and replaced with "validation". This will be more scientifically accurate and will have the additional benefit of reducing the emotional charge that can be associated with declaring published research to be irreproducible.

      The manuscript includes an explanatory passage in the Results section, "Our project focuses on assessing the strength of the claims themselves (inferential/indirect reproducibility) rather than testing whether the original methods produce repeatable results (results/direct reproducibility). Thus, our conclusions do not directly challenge the initial results leading to a claim, but rather the general applicability of the claim itself." Rather than first appearing in Results, this statement should appear prominently in the abstract and introduction because it is a core element of the premise of the study. This can be combined with the content of the present Disclaimer section into a single paragraph in the Introduction instead of appearing in two redundant passages. I would again encourage the authors to substitute the word validation for reproduction, which would eliminate the need for the invented distinction between indirect versus direct reproduction. It is notable that the authors have chosen to title the relevant Methods section "Experimental Validation" and not "Replication".

      Experimental data "from various laboratories" in the last paragraph of the Introduction and the first paragraph of the Results are ambiguous. Since these new experiments are part of the central core of the manuscript, the specific laboratories contributing them should be named in the two paragraphs. If experiments are being contributed by all authors on the manuscript, it would suffice to say "the authors' laboratories". The attribution to "various labs" appears to be contradicted by the Discussion paragraph 2, which states "the host laboratory has expertise in" antibacterial and antifungal defense, implying a single lab. The claim of expertise by the lead author's laboratory is unnecessary and can be deleted if the Lemaitre lab is the ultimate source of all validation experiments.

      The passage on the controversial role of Duox in the gut is balanced and scholarly, and stands out for its discussion of multiple alternative lines of evidence in the published literature and supplement. This passage may benefit from follow-up research by multiple groups on the original claims, which is not available for the other claims, but the tone of the Duox section can be a model for the other sections.

      Comments on other sections and supplements:

      I understand the desire to explain how original results may have been obtained when they are not substantiated by subsequent experiments. However, statements such as "The initial results may have been obtained due to residual impurities in preparations of recombinant GNBP1" and "Non-replicable results on the roles of Spirit, Sphinx and Spheroide in Toll pathway activation may be due to off-target effects common to first-generation RNAi tools" are speculation. No experimental data are presented to support these assertions, so these statements and others like them (currently at the end of most "insights" sections) should not appear in Results. I recognize that the authors are trying to soften their criticism of prior studies by providing explanations for how errors may have occurred innocently. If they wish to do so, the speculative hypotheses should appear in the Discussion.

      The statement in Results that "The initial claim concerning wntD may be explained by a genetic background effect independent of wntD" similarly appears to be a speculation based on the reading of the main text Results. However, the Discussion clarifies that "Here, we obtained the same results as the authors of the claim when using the same mutant lines, but the result does not stand when using an independent mutant of the same gene, indicating the result was likely due to genetic background." That additional explanation in the Discussion greatly increases reader confidence in the Result and should be explained with reference to S5 in the Results. Such complete explanations should be provided everywhere possible without requiring the reader to check the Supplement in each instance.

      In some cases, such as "The results of the initial papers are likely due to the use of ubiquitous overexpression of PGRP-LE, resulting in melanization due to overactivation of the Imd pathway and resulting tissue damage", the claim to explain the original finding would be easy to test. The authors should perform those tests where they can, if they wish to retain the statements in the manuscript. Similarly, the claim "The published data are most consistent with a scenario in which RNAi generated off-target knockdown of a protein related to retinophilin/undertaker, while Undertaker itself is unlikely to have a role in phagocytosis" would be stronger if the authors searched the Drosophila genome for a plausible homolog that might have been impacted by the RNAi construct, and then put forth an argument as to why the off-target gene is more likely to have generated the original phenotype than the nominally targeted gene. There is a brief mention in S19 that junctophilin is the authors' preferred off-target candidate, but no evidence or rationale is presented to support that assertion. If the original RNAi line is still available, it would be easy enough to test whether junctophilin is knocked down as an off-target, and ideally then to use an independent knockdown of junctophilin to recapitulate the original phenotype. Otherwise, the off-target knockdown hypothesis is idle speculation.

      A good model is the passage on extracellular DNA, which states, "experiments performed for ReproSci using the original DNAse IIlo hypomorph show that elevated Diptericin expression in the hypomorph is eliminated by outcrossing of chromosome II, and does not occur in an independent DNAse II null mutant, indicating that this effect is due to genetic background (Supplementary S11)." In this case, the authors have performed a clear experiment that explains the original finding, and inclusion of that explanation is warranted. Similar background replacement experiments in other validations are equally compelling.

      The statement "Analysis of several fly stocks expected to carry the PGRP-SDdS3 mutation used in the initial study revealed the presence of a wild-type copy PGRP-SD, suggesting that either the stock used in this study did not carry the expected mutation, or that the mutation was lost by contamination prior to sharing the stock with other labs" provides a documentable explanation of a potential error in the original two manuscripts, but the subsequent "analysis of several fly stocks" needs citations to published literature or explanation in the supplement. It is unclear from this passage how the wildtype allele in the purportedly mutant stocks could have led to the misattribution of function to PGRP-SD, so that should be explained more clearly in the manuscript.

      The originally claimed anorexia of the Gr28b mutation is explained as having been "likely obtained due to comparison to a wild-type line with unusually high feeding rates". This claim would be stronger if the wildtype line in question were named and data showing a high rate of feeding were presented in the supplement or cited from published literature. Otherwise, this appears to be speculation.

      In the section "The Toll immune pathway is not negatively regulated by wntD", FlyAtlas is cited as evidence that wntD is not expressed in adult flies. However, the FlyAtlas data is not adequately sensitive to make this claim conclusively. If the present authors wish to state that wntD is not expressed in adults, they should do a thorough test themselves and report it in the Supplement.

      Alternatively, the statement "data from FlyAtlas show that wntD is only expressed at the embryonic stage and not at the adult stage at which the experiments were performed by (Gordon et al., 2005a)" could be rephrased to something like "data from FlyAtlas show strong expression of wntD in the embryo but not the adult" and it should be followed by a direct statement that adult expression was also found to be near-undetectable by qPCR in supplement S5. That data is currently "not shown" in the supplement, but it should be shown because this is a central result that is being used to refute the original claim. This manuscript passage should also describe the expression data described in Gordon et al. (2005), for contrast, which was an experimental demonstration of expression in the embryo and a claim "RT-PCR was used to confirm expression of endogenous wntD RNA in adults (data not shown)."

      Inclusion of the section on croquemort is curious because it seems to be focused exclusively on clearance of apoptotic cells in the embryo, not on anything related to immunity. The subsection is titled "Croquemort is not a phagocytic engulfment receptor for apoptotic cells or bacteria", but the text passage contains no mention of phagocytosis of bacteria, and phagocytosis of bacteria is not tested in the S17 supplement. I would suggest deleting this passage entirely if there is not going to be any discussion of the immune-related phenotypes.

      The claim "Toll is not activated by overexpression of GNBP3 or Grass: Experiments performed for ReproSci find that contrary to previous reports, overexpression of GNBP3 (Gottar et al., 2006) or Grass (El Chamy et al., 2008) in the absence of immune challenge does not effectively activate Toll signaling (Supplementaries S6, S7)" is overly strongly stated unless the authors can directly repeat the original published studies with identical experimental conditions. In the absence of that, the claim in the present manuscript needs to be softened to "we find no evidence that..." or something similar. The definitive claim "does not" presumes that the current experiments are more accurate or correct than the published ones, but no explanation is provided as to why that should be the case. In the absence of a clear and compelling argument as to why the current experiment is more accurate, it appears that there is one study (the original) that obtained a certain result and a second study (the present one) that did not. This can be reported as an inconsistency, but the second experiment does not prove that the first was an error. The same comment applies to the refutation of the roles for Edin and IRC. Even though the current experiments are done in the context of a broader validation study, this does not automatically make them more correct. The present work should adhere to the same standards of reporting that we expect in any other piece of science.

      The statement "Furthermore, evidence from multiple papers suggests that this result, and other instances where mutations have been found to specifically eliminate Defensin expression, is likely due to segregating polymorphisms within Defensin that disrupt primer binding in some genetic backgrounds and lead to a false negative result (Supplementary S20)" should include citations to the multiple papers being referenced. This passage would benefit from a brief summary of the logic presented in S20 regarding the various means of quantifying Defensin expression.

      In S22 Results, the statement "For general characterization of the IrcMB11278 mutant, including developmental and motor defects and survival to septic injury, see additional information on the ReproSci website" is not acceptable. All necessary information associated with the paper needs to be included in the Supplement. There cannot be supporting data relegated to an independent website with no guaranteed stability or version control. The same comment applies to "Our results show that eiger flies do not have reduced feeding compared to appropriate controls (See ReproSci website)" in S25.

      Supplement S21 appears to show a difference between the wildtype and hemese mutants in parasitoid encapsulation, which would support the original finding. However, the validation experiment is performed at a small sample size and is not replicated, so there can be no statistical analysis. There is no reported quantification of lamellocytes or total hemocytes. The validation experiment does not support the conclusion that the original study should be refuted. The S21 evaluation of hemese must either be performed rigorously or removed from the Supplement and the main text.

      In S22, the second sentence of the passage "Due to the fact that IrcMB11278 flies always survived at least 24h prior to death after becoming stuck to the substrate by their wings, we do not attribute the increased mortality in Ecc15-fed IrcMB11278 flies primarily to pathogen ingestion, but rather to locomotor defects. The difference in survival between sucrose-fed and Ecc15-fed IrcMB11278 flies may be explained by the increased viscosity of the Ecc15-containing substrate compared to the sucrose-containing substrate" is quite strange. The first sentence is plausible and a reasonable interpretation of the observations. But to then conclude that the difference between the bacterial treatment versus the control is more plausibly due to substrate viscosity than direct action of the bacteria on the fly is surprising. If the authors wish to put forward that interpretation, they need to test substrate viscosity and demonstrate that fly mortality correlates with viscosity. Otherwise, they must conclude that the validation experiment is consistent with the original study.

      In S27, the visualization of eiger expression using a GFP reporter is very non-standard as a quantitative assay. The correct assay is qPCR, as is performed in other validation experiments, and which can easily be done on dissected fat body for a tissue-specific analysis. S27 Figure 1 should be replaced with a proper experiment and quantitative analysis. In S27 Figure 2, the authors should add a panel showing that eiger is successfully knocked down with each driver>construct combination. This is important because the data being reported show no effect of knockdown; it is therefore imperative to show that the knockdown is actually occurring. The same comment applies everywhere there is an RNAi to demonstrate a lack of effect.

      The Drosomycin expression data in S3 Figure 2A look extremely noisy and are presented without error bars or statistical analysis. The S4 claim that sphinx and spheroid are not regulators of the Toll pathway because quantitative expression levels of these genes do not correlate with Toll target expression levels is an extremely weak inference. The RNAi did not work in S4, so no conclusion should be inferred from those experiments. Although the original claims in dispute may be errors in both cases, the validation data used to refute the original claims must be rigorous and of an acceptable scientific standard.

      In S6 Figure 1, it is inappropriate to plot n=2 data points as a histogram with mean and standard errors. If there are fewer than four independent points, all points should be plotted as a dot plot. This comment applies to many qPCR figures throughout the supplement. In S7 Figure 1, "one representative experiment" out of two performed is shown. This strongly suggests that the two replicates are noisy, and a cynical reader might suspect that the authors are trying to hide the variance. This also applies to S5 Fig 3. Particularly in the context of a validation study, it is imperative to present all data clearly and objectively, especially when these are the specific data that are being used to refute the claim.

      Other comments:

      In S26, the authors suggest that much of the observed melanization arises from excessive tissue damage associated with abdominal injection contrasted to the lesser damage associated with thoracic injection. I believe there may be a methodological difference here. The Methods of S27 are not entirely clear, but it appears that the validation experiment was done with a pinprick, whereas the original Mabary and Schneider study was done with injection via a pulled capillary. My lab group (and I personally) have extensive experience with both techniques. In our hands, pinpricks to the abdomen do indeed cause substantial injury, and the physically less pliable thorax is more robust to pinpricks. However, capillary injections to the abdomen do virtually no tissue damage - very probably less than thoracic injections - and result in substantially higher survivals of infection even than thoracic injections. Thus, the present manuscript may infer substantial tissue damage in the original study because they are employing a different technique.

    1. Reviewer #1 (Public review):

      Summary:

      This paper describes an application of the high-resolution cryo-EM 2D template matching technique to sub-50kDa complexes. The paper describes how density for ligands can be reconstructed without having to process cryo-EM data through the conventional single particle analysis pipelines.

      Strengths:

      This paper contributes additional data (alongside other papers by the same authors) to convey the message that high-resolution 2D template matching is a powerful alternative for cryo-EM structure determination. The described application to ligand density reconstruction, without the need for extensive refinements, will be of interest to the pharmaceutical industry, where often multiple structures of the same protein in complex with different ligands are solved as part of their drug development pipelines. Improved insights into which particles contribute to the best ligand density are also highly valuable and transferable to other applications of the same technique.

      Weaknesses:

      Although the convenient visualisation of small molecules bound to protein targets of a known structure would be relevant for the pharmaceutical industry, the evidence described for the claim that this technique "significantly" improves alignment and reconstruction of small complexes is incomplete. The authors are encouraged to better evaluate the effects of model bias on the reconstructed densities in a revised paper.

    2. Reviewer #2 (Public review):

      In this manuscript, Zhang et al describe a method for cryo-EM reconstruction of small (sub-50kDa) complexes using 2D template matching. This presents an alternative, complementary path for high-resolution structure determination when there is a prior atomic model for alignment. Importantly, regions of the atomic model can be deleted to avoid bias in reconstructing the structure of these regions, serving as an important mechanism of validation.

      The manuscript focuses its analysis on a recently published dataset of the 40kDa kinase complex deposited to EMPIAR. The original processing workflow produced a medium resolution structure of the kinase (GSFSC ~4.3 Å, though features of the map indicate ~6-7 Å resolution); at this resolution, the binding pocket and ligand were not resolved in the original published map. With 2DTM, the authors produce a much higher resolution structure, showing clear density for the ATP binding pocket and the bound ATP molecule. With careful curation of the particle images using statistically derived 2DTM p-values, a high-resolution 2DTM structure was reconstructed from just 8k particles (2.6 Å non-gold standard FSC; ligand Q-score of 0.6), in contrast to the 74k particles from the original publication. This aligns with recent trends that fewer, higher-quality particles can produce a higher-quality structure. The authors perform a detailed analysis of some of the design choices of the method (e.g., p-value cutoff for particle filtering; how large a region of the template to delete).

      Overall, the workflow is a conceptually elegant alternative to the traditional bottom-up reconstruction pipeline. The authors demonstrate that the p-values from 2DTM correlations provide a principled way to filter/curate which particle images to extract, and the results are impressive. There are only a few minor recommendations that I could make for improvement.

    3. Reviewer #3 (Public review):

      Summary:

      Due to the low SNR of cryo-EM micrographs necessitated by radiation damage, determining the structure of proteins smaller than 50 kDa is exceedingly challenging, such that only a handful have been solved to date. This work aims to improve the reconstruction of small proteins in single-particle cryo-EM by using high-resolution 2D template matching, an algorithm previously used to locate and align macromolecules in situ, to align and reconstruct small proteins. This approach uses an existing macromolecular structure, either experimentally determined or predicted by AlphaFold, to simulate a noise-free 3D reference and generates whitened projections, crucially including high-spatial-frequency information, to align particles by the orientation with maximal cross-correlation. They demonstrate the success of this approach by generating a 3D reconstruction from an existing dataset of a 41.3 kDa protein kinase that had previously evaded attempts at high-resolution structure determination. To alleviate concerns that this is purely from template bias, they demonstrate clear density at two regions that were not present in the template: 6 residues in an alpha helix and an ATP in the ligand binding pocket. The latter is particularly important for its implications in determining structures of ligand-bound proteins for drug discovery. Additionally, the authors provide an update to the classic calculation in Henderson 1995 to predict the minimum molecular mass of a protein that can be solved by single-particle cryo-EM.

      Strengths:

      I am in no doubt that this technique can be used to gain valuable insights into the structures of small proteins, and this is an important advancement for the field. The ability to determine the structure of ligands in a binding site is particularly important, and this paper provides a method of doing that which outperforms traditional single-particle cryo-EM processing workflows.

      The claim that using high-spatial frequency information is essential for aligning small proteins is a valuable insight. A recent pre-print published at a similar time to this manuscript used high-resolution information in standard ab-initio reconstruction to generate a high-resolution reconstruction from the same dataset, supporting the claims made in the manuscript.

      The theoretical section in the appendix is also sound. It uses the same logic as Henderson, but applies more up-to-date knowledge, such as incorporating dose-weighting and altering the cross-correlation-based noise estimation. This update is valuable for understanding factors preventing us from reaching the theoretical limit.

      Weaknesses:

      Given that this technique creates template bias, only parts of the reconstruction not in the template can be trusted, unlike standard single-particle processing, where the independent half-maps from separate, ab initio templates are used to generate a 3D reconstruction. Although, in principle, one could perform the search many times such that every residue has been omitted in at least one search, this will be extremely computationally intensive and was not demonstrated in this manuscript. It is therefore currently only realistically applicable when only a small portion of the sub-50 kDa protein is of interest.

      The applicability of this technique to more than a single target was also not demonstrated, and there are concerns that it may not work effectively in many cases. The authors note in the results that "the ATP density was consistently recovered more robustly than nearby residues" and speculate that this may be because misalignments disproportionately blur peripheral residues. Since the region of interest in a structure is not necessarily in the center, this may need further investigation. The implications of this statement may also be unclear to the reader. For example, can this issue be minimized by having the region of interest centered in the simulated volume?

      In Figure 3, the authors demonstrate that it is not solely improved particle filtering and a noise-free reference that improves alignment, but that the high spatial frequency information is important. This information is very valuable since it can be applied to other, more standard methods. However, this key figure is not as clear or convincing as it could be. The FSC curves are possibly misleading, since the reduced resolution could be explained by reduced template bias when auto-refining with a map initially low-pass filtered to 10 Å. Moreover, although the helix reconstruction does look slightly better using the 2DTM angles, the improvement in density for ATP in the binding pocket is not clear. A qualitative argument only clear in one out of two cases is not as convincing as a quantitative metric across more examples.

    1. Reviewer #1 (Public review):

      Summary:

      Fungal survival and pathogenicity rely on the ability to undergo reversible morphological transitions, which is often linked to nutrient availability. In this study, the authors uncover a conserved connection between glycolytic activity and sulfur amino acid biosynthesis that drives morphogenesis in two fungal model systems. By disentangling this process from canonical cAMP signaling, the authors identify a new metabolic axis that integrates central carbon metabolism with developmental plasticity and virulence.

      Strengths:

      The study integrates different experimental approaches, including genetic, biochemical, transcriptomic and morphological analyses, and convincingly demonstrates that perturbations in glycolysis alter sulfur metabolic pathways and thus impact pseudohyphal and hyphal differentiation. Overall, this work offers new and important insights into how metabolic fluxes are intertwined with fungal developmental programs and therefore opens new perspectives to investigate morphological transitioning in fungi.

      Importantly, in the revised version the authors now substantiate the transcriptomic findings by RT-qPCR analyses in the pfk1ΔΔ and adh1ΔΔ strains, demonstrating that genetic disruption of glycolytic flux generally mirrors the effects of 2-deoxyglucose treatment. The manuscript's discussion has also been strengthened by explicitly addressing why cysteine and methionine differ in their ability to rescue filamentation in S. cerevisiae versus C. albicans, highlighting species-specific differences in sulfur uptake and transsulfuration pathways.

      Overall, this revised manuscript provides compelling evidence for a previously unrecognized coupling between glycolysis and sulfur metabolism that shapes fungal morphogenesis and virulence. It opens new perspectives on metabolic control of fungal development and raises interesting mechanistic questions for future work.

      Comments on revisions:

      The authors have incorporated all of my suggested changes and addressed all raised concerns.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript investigates the interplay between glycolysis and sulfur metabolism in regulating fungal morphogenesis and virulence. Using both Saccharomyces cerevisiae and Candida albicans, the authors demonstrate that glycolytic flux is essential for morphogenesis under nitrogen-limiting conditions, acting independently of the established cAMP-PKA pathway. Transcriptomic and genetic analyses reveal that glycolysis influences the de novo biosynthesis of sulfur-containing amino acids, specifically cysteine and methionine. Notably, supplementation with sulfur sources restores morphogenetic and virulence defects in glycolysis-deficient mutants, thereby linking core carbon metabolism with sulfur assimilation and fungal pathogenicity.

      Strengths:

      The work identifies a previously uncharacterized link between glycolysis and sulfur metabolism in fungi, bridging metabolic and morphogenetic regulation, which is an important conceptual advance for understanding fungal pathogenicity. Demonstrating that cysteine supplementation rescues virulence defects in an animal model connects basic metabolism to infection outcomes, which adds biomedical importance.

      Comments on revisions:

      The authors have sufficiently addressed my concern and provided a clear justification for their proposed model, including the limitations of performing the mechanistic assays at this stage. I am satisfied with the response and have no further comments.

    3. Reviewer #3 (Public review):

      This study investigates the connection between glycolysis and the biosynthesis of sulfur-containing amino acids in controlling fungal morphogenesis, using Saccharomyces cerevisiae and Candida albicans as model organisms. The authors identify a conserved metabolic axis that integrates glycolysis with cysteine/methionine biosynthetic pathways to influence morphological transitions. This work broadens the current understanding of fungal morphogenesis, which has largely focused on gene regulatory networks and cAMP-dependent signaling pathways, by emphasizing the contribution of metabolic control mechanisms.

      Strengths:

      The delineation of how glycolytic flux regulates fungal morphogenesis through a cAMP-independent mechanism is an advancement. The coupling of glycolysis with the de novo biosynthesis of sulfur-containing amino acids, a requirement for morphogenesis, introduces a novel and unexpected layer of regulation.

      Demonstrating this mechanism in both S. cerevisiae and C. albicans strengthens the argument for its evolutionary conservation and biological importance.

      The ability to rescue the morphogenesis defect through supplementation of sulfur-containing amino acids provides a functional validation.

      Weaknesses:

      cAMP addition rescued the pseudohyphal differentiation defect exhibited by the ΔΔgpa2 strain. More clarity is needed on how this mechanism is distinct from the metabolic control: whether cAMP acts in parallel with or downstream of sulfur-containing amino acid biosynthesis has to be characterized. Supplementation of cysteine and methionine bypasses glycolytic regulation; the link between these amino acids and their role in fungal morphogenesis is not completely characterized.

      The demonstrated link between glycolysis and sulfur amino acid biosynthesis, along with its implications for virulence in C. albicans, is important for understanding fungal adaptation, as mentioned in the article; however, the downstream effects of Met4 activation were not fully characterized. How do cysteine and methionine rescue morphogenesis? The authors' response Figure 1 shows that there are no significant transcriptional changes in the expression of cAMP-PKA pathway-associated genes, which alone could not completely explain the role of gpa2 in morphogenesis, because exogenous cAMP can restore pseudohyphal differentiation in the ΔΔgpa2 background (Revised Fig. 1L). This implies that gpa2's function in morphogenesis involves an additional layer of regulation, possibly metabolic or post-transcriptional, and its connection to sulfur-containing amino acids remains to be elucidated.

    1. Joint Public Review:

      In this work, the authors present DeepTX, a computational tool for studying transcriptional bursting using single-cell RNA sequencing (scRNA-seq) data and deep learning. The method aims to infer transcriptional burst dynamics, including key model parameters and the associated steady-state distributions, directly from noisy single-cell data. The authors apply DeepTX to datasets from DNA damage experiments, revealing distinct regulatory patterns: IdU treatment in mouse stem cells increases burst size, promoting differentiation, while 5FU alters burst frequency in human cancer cells, driving apoptosis or survival depending on dose. These findings underscore the role of burst regulation in mediating cell fate responses to DNA damage.

      The main strength of this study lies in its methodological contribution. DeepTX integrates a non-Markovian mechanistic model with deep learning to approximate steady-state mRNA distributions as mixtures of negative binomial distributions, enabling genome-scale parameter inference with reduced computational cost. The authors provide a clear discussion of the framework's assumptions, including reliance on steady-state data and the inherent unidentifiability of parameter sets, and they outline how the model could be extended to other regulatory processes.

      The revised manuscript addresses the original concerns raised by the reviewers, particularly those related to sample size requirements, distributional assumptions, and the biological interpretation of the inferred parameters. The authors have also included an extensive discussion of the limitations of the methodological framework, including the constraints associated with relying on snapshot data, as well as a broader contextualisation of DeepTX within the landscape of existing tools that link mechanistic modelling and single-cell transcriptomics.

      Overall, this work represents a valuable contribution to the integration of mechanistic models with high-dimensional single-cell data. It will be of interest to researchers in systems biology, bioinformatics, and computational modelling.

      Comments on revisions:

      We thank the authors for their thorough revision and for carefully addressing the points raised in the previous review. At this stage, the reviewers have no further concerns.

    1. Reviewer #1 (Public review):

      Summary:

      The paper by Graff et al. investigates the function of foxf2 in zebrafish to understand the progression of cerebral small vessel disease. The authors use a partial loss of foxf2 (zebrafish possess two foxf2 genes, foxf2a and foxf2b, and the authors mainly analyze homozygous mutants in foxf2a) to investigate the role of foxf2 signaling in regulating pericyte biology. They find that the number of pericytes is reduced in foxf2a mutants and that the remaining pericytes display alterations in their morphologies. The authors further find that mutant animals can develop to adulthood but that in adult animals, both endothelial and pericyte morphologies are affected. They also show that mutant pericytes can partially repopulate the brain after genetic ablation.

      Strengths:

      The paper is well written and easy to follow. The authors now include pericyte marker gene analysis and solid quantifications of the observed phenotypes.

      Weaknesses:

      None left.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates the developmental and lifelong consequences of reduced foxf2 dosage in zebrafish, a gene associated with human stroke risk and cerebral small vessel disease (CSVD). The authors show that a ~50% reduction in foxf2 function through homozygous loss of foxf2a leads to a significant decrease in brain pericyte number, along with striking abnormalities in pericyte morphology, including enlarged soma and extended processes, during larval stages. These defects are not corrected over time but instead persist and worsen with age, ultimately affecting the surrounding endothelium. The study also makes an important contribution by characterizing pericyte behavior in wild-type zebrafish using a clever pericyte-specific Brainbow approach, revealing novel interactions such as pericyte process overlap not previously reported in mammals.

      Strengths:

      This work provides mechanistic insight into how subtle, developmental changes in mural cell biology and coverage of the vasculature can drive long-term vascular pathology. The authors make strong use of zebrafish imaging tools, including longitudinal analysis in transgenic lines to follow pericyte number and morphology over larval development, and then apply tissue clearing and whole brain imaging at 3 and 11 months to further dissect the longitudinal effects of foxf2a loss. The ability to track individual pericytes in vivo reveals cell-intrinsic defects and process degeneration with high spatiotemporal resolution. Their use of a pericyte-specific Zebrabow line also allows, for the first time, detailed visualization of pericyte-pericyte interactions in the developing brain, highlighting structural features and behaviors that challenge existing models based on mouse studies. Together, these findings make the zebrafish a valuable model for studying the cellular dynamics of CSVD.

      Weaknesses:

      I originally suggested quantifying pericyte coverage across brain regions to address potential lineage-specific effects due to the distinct developmental origins of forebrain (neural crest-derived) and hindbrain (mesoderm-derived) pericytes. However, I appreciate the authors' response referencing recent work from their lab (Ahuja, 2024), which demonstrates that both neural crest and mesoderm contribute to pericyte lineages in the midbrain and hindbrain. The convergence of these lineages into a shared transcriptional state by 30 hpf, as shown by their single-cell RNA-seq data, makes it unlikely that regional quantification would provide meaningful lineage-specific insight. I agree with the authors that lineage tracing experiments often suffer from low sample sizes, and their updated findings challenge earlier compartmental models of pericyte origin. I therefore appreciate their rationale for not pursuing regional quantification and consider this concern addressed. Furthermore, my other two points regarding quantification of foxf2 levels and overall vascular changes have been thoroughly addressed in the revised manuscript. These additions significantly strengthen the paper's conclusions and improve the overall rigor of the study.

    3. Reviewer #3 (Public review):

      Summary:

      The goal of the work by Graff et al. is to model CSVD in the zebrafish using foxf2a mutants. The mutants show loss of cerebral pericyte coverage that persists through adulthood, but it seems foxf2a does not regulate the regenerative capacity of these cells. The findings are interesting and build on previous work from the group. Limitations of the work include little mechanistic insight into how foxf2a alters pericyte recruitment/differentiation/survival/proliferation in this context, and the overlap of these studies with previous work in foxf2a/b double mutants. However, the data analysis is clean and compelling and the findings will contribute to the field.

      Comments on revisions:

      The authors have addressed all of my original concerns.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript presents a robust set of experiments that provide new insights into the role of STN neurons during active and passive avoidance tasks. These forms of avoidance have received comparatively less attention in the literature than the more extensively studied escape or freezing responses, despite being extremely relevant to human behaviour and more strongly influenced by cognitive control.

      Strengths:

      Understanding the neural infrastructure supporting avoidance behaviour would be a fundamental milestone in neuroscience. The authors employ sophisticated methods to delineate the role of STN neurons during avoidance behaviours. The work is thorough and the evidence presented is compelling. Experiments are carefully constructed, well-controlled, and the statistical analyses are appropriate.

    2. Reviewer #2 (Public review):

      Summary:

      Zhou, Sajid et al. present a study investigating STN involvement in signaled movement. They use fiber photometry, implantable lenses, and optogenetics during active avoidance experiments to evaluate this. The data are useful for the scientific community and the overall evidence for their claims is solid, but many aspects of the findings are confusing. The authors present a huge collection of data, and it is somewhat difficult to extract the key information and the meaningful implications resulting from these data.

      Strengths:

      The study is comprehensive in using many techniques and many stimulation powers and frequencies and configurations.

    3. Reviewer #3 (Public review):

      Summary:

      The authors use calcium recordings from STN to measure STN activity during spontaneous movement and in a multi-stage avoidance paradigm. They also use optogenetic inhibition and lesion approaches to test the role of STN during the avoidance paradigm. The paper reports a large amount of data and makes many claims; some seem well supported to this Reviewer, others less so.

      Strengths:

      Well-supported claims include bulk photometry data showing that STN calcium activity is increased during spontaneous movements, especially contraversive ones. Single-cell measures back this claim but also show that it is only a minority of STN cells that respond strongly, with most showing no response during movement, and a similar number showing smaller inhibitions during movement.

      Photometry data during cued active avoidance procedures show that STN calcium activity sharply increases in response to auditory cues, and during cued movements to avoid a footshock. Optogenetic and lesion experiments are consistent with an important role for STN in generating cue-evoked avoidance. And a strength of these results is that multiple approaches were used.

      [Editors' note: The authors provided a good explanation regarding the difference between interpreting 'caution' in the healthy vs impaired situation, and this addressed one of the remaining major concerns from the last round of review.]

    1. Reviewer #1 (Public review):

      This manuscript used deep learning to highlight the role of inhibition in shaping selectivity in primary and higher visual cortex. The findings hint at hitherto unknown axes of structured inhibition operating in cortical networks with a potentially key role in object recognition.

      The multi-species approach of testing the model in macaque and mouse is excellent, as it improves the chances that the observed findings are a general property of mammalian visual cortex. However, it would be useful to delineate any notable differences between these species, which are to be expected given their lifestyle.

      The overall performance of the model appears to be excellent in V1, with over 80% performance, but it falls substantially in V4. It would be important to consider the implications of this finding; for example, in the context of studying temporal lobe structures that are central to recognizing objects. Would one expect that model performance decreases further here, and what measures could be taken to avoid this? Or is this type of model better restricted to V1 or even LGN?

      While the manuscript delineates novel axes of inhibitory interactions, it remains unclear what exactly these axes are and how they arise. What are the steps that need to be taken to make progress along these lines?

    2. Reviewer #2 (Public review):

      The classic view of sensory coding states that (excitatory) neurons are active to some preferred stimuli and otherwise silent. In contrast, inhibitory neurons are considered broadly tuned. Due to the gigantic potential image space, it is hard to comprehensively map the tuning of individual neurons. In this tour de force study, Franke et al. combine electrophysiological recordings in macaque (V1, V4) and mouse (V1, LM, LI) visual cortex with large-scale screens based on digital twin models, as well as beautiful systems identification (most/least activating stimuli). Based on these digital twins, they discover dual-feature selectivity (which they validate both in macaques and mice). Dual-feature selectivity involves a bidirectional modulation of firing rates around an elevated baseline. Neurons are excited by specific preferred features and systematically suppressed by distinct, non-preferred features. This tuning was identified by excellently combining advances in AI & high-throughput ephys.

      The study is comprehensive and convincing. Overall, this work showcases how in silico experiments can generate concrete hypotheses about neuronal coding that are difficult to discover experimentally, but that can be experimentally validated! I think this work is of substantial interest to the neuroscience community. I'm sure it will motivate many future experimental and computational studies. In particular, it will be of great interest to understand when and how the brain leverages dual-feature selectivity. The discussion of the article is already an interesting starting point for these considerations.

      Strengths:

      (1) Using computational models to predict neuronal responses allowed them to go through millions of images, which may not be possible in vivo.

      (2) The cross-species and cross-area consistency of the results is another major strength, suggesting that the results may reflect a fundamental strategy of mammalian cortical processing.

      (3) They show that the feature causing peak excitation in one neuron often drives suppression in another. This may be an efficient coding scheme where the population covers the visual manifold. I'd like to understand better why the authors believe that this shows that there are low-dimensional subspaces based on preferred and non-preferred stimulus features (vs. many more, but some axes are stronger).

    1. Reviewer #1 (Public review):

      Wang, Zhou et al. investigated coordination between the prefrontal cortex (PFC) and the hippocampus (Hp) during reward delivery by analyzing beta oscillations. Beta oscillations are associated with various cognitive functions, but their role in coordinating brain networks during learning is still not thoroughly understood. The authors focused on the changes in power, peak frequencies, and coherence of beta oscillations in the two regions as rats learned a spatial task over days. Contrary to the authors' hypothesis, beta oscillations in those two regions during reward delivery were not coupled in spectral or temporal aspects. They were, however, able to show opposite changes in beta oscillations in PFC and Hp as the animals' performance improved. The authors were also able to show a small subset of cell populations in PFC that are modulated by both beta oscillations in PFC and sharp wave ripples in Hp. A similarly modulated cell population was not observed in Hp. These results are valuable in pointing out distinct periods during a spatial task when two regions modulate their activity independently from each other.

      The authors included a detailed analysis of the data to support their conclusions. However, some clarifications would help their presentation, as well as help readers to have a clear understanding.

      (1) The crucial time point of the analysis is the goal entry. However, it needs a better explanation in the methods or in figures of what a goal entry in their behavioral task means.

      (2) Regarding Figure 2, the authors have mentioned in the methods that PFC tetrodes have targeted both hemispheres. It might be trivial, but a supplementary graph or a paragraph about differences or similarities between contralateral and ipsilateral tetrodes to Hp might help readers.

      (3) The authors have looked at changes in burst properties over days of training. For the coincidence of beta bursts between PFC and Hp, is there a change in the coincidence of bursts depending on the day or performance of the animal?

      (4) Regarding the changes in performance through days as well as the variance of the beta burst frequency (Figures 3C and 4C): was there a change in the number of beta bursts as animals learned the task, which might affect variance indirectly?

      (5) In the behavioral task, within a session, animals needed to alternate between two wells, but the central arm (1) was in the same location. Did the authors alternate the location of well number 1 between days to different arms? It is possible that having well number 1 in the same location through days might have an effect on beta bursts, as they would get more rewards in well number 1?

      (6) The animals did not increase their performance in the F maze as much as they increased it in the Y maze. It would be more helpful to see a comparison between mazes in Figure 5 in terms of beta burst timing. It seems that unrewarded trials have earlier beta bursts in the Y maze than in the F maze. Also, is there a difference in beta burst frequencies between rewarded and unrewarded trials?

      (7) For individual cell analysis, the authors recorded from Hp and the behavioral task involved spatial learning. It would be helpful to readers if the authors described the place field properties of the cells they recorded from. It is known that reward cells firing near reward locations are more likely to participate in a sharp wave ripple. Factoring the place field properties of the cells into the analysis might give a clearer picture of the lack of modulation of Hp cells by beta and sharp wave ripples.

    2. Reviewer #2 (Public review):

      (1) When presenting the power spectra for the representative example (Figure 1), it would be appropriate to display a broader frequency band, including delta, theta, and gamma (up to ~100 Hz), rather than only the beta band. What was the rat's locomotor state (e.g., running speed) after entering the reward location, during which the LFPs were recorded? If the rats stopped at the goal but still consumed the reward (i.e., exhibited very low running speed), theta rhythms might still occasionally occur, and sharp-wave ripples (SWRs) could be observed during rest. Do beta bursts also occur during navigation prior to goal entry? It would be beneficial to display these rhythmic activities continuously across both the navigation and goal entry phases. Additionally, given that the hippocampal theta rhythm is typically around 7-8 Hz, while a peak at approximately 15-16 Hz is visible in the power spectra in Figure 1C, the authors should clarify whether the 22 Hz beta activity represents a genuine oscillation rather than a harmonic of the theta rhythm.

      (2) The authors claim that beta activity is independent between CA1 and PFC, based on the low coherence between these regions. However, it is challenging to discern beta-specific coherence in CA1; instead, coherence appears elevated across a broader frequency band (Figure 2 and Figure 2-1D). An alternative explanation could be that the uncoupled beta between CA1 and PFC results from low local beta coherence within CA1 itself.

      (3) In Figure 2-1E-F, visual inspection of the box plots reveals minimal differences between PFC-Ind and PFC-Coin/CA1-Coin conditions, despite reported statistical significance. It may be necessary to verify whether the significance arises from a large sample size.

      (4) In Figure 3 and Figure 4, although differences in power and frequency appear to change significantly across days, these changes are not easily discernible by visual inspection. It is worth considering whether these variations are related to increased task familiarity over days, potentially accompanied by higher running speeds.

      (5) The stronger spiking modulation by local beta oscillations shown in Figure 6 could also be interpreted in the context of uncoupled beta between CA1 and PFC. In this analysis, only spikes occurring during beta bursts should be included, rather than all spikes within a trial. The authors should verify the dataset used and consider including a representative example illustrating beta modulation of single-unit spiking.

      (6) As observed in Figure 7D, CA1 beta bursts continue to occur even after 2.5 seconds following goal entry, when SWRs begin to emerge. Do these oscillations alternate over time, or do they coexist with some form of cross-frequency coupling?
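
      The concern in point (3), that significance may arise from sample size rather than from a meaningful difference, could be checked by reporting an effect size alongside the test statistic. The following is a minimal sketch on synthetic data, not the manuscript's data: the group labels, the 0.05 SD shift, and the sample size of 50,000 per group are all assumptions chosen for illustration, and Cohen's d is one of several possible effect-size measures.

```python
import math
import random
import statistics


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def welch_statistic(a, b):
    """Difference in means divided by its standard error (Welch form)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical stand-ins for two burst categories; values are synthetic.
rng = random.Random(0)
n = 50_000
group_a = [rng.gauss(0.05, 1.0) for _ in range(n)]  # tiny true shift of 0.05 SD
group_b = [rng.gauss(0.00, 1.0) for _ in range(n)]

d = cohens_d(group_a, group_b)
z = welch_statistic(group_a, group_b)
print(f"Cohen's d = {d:.3f}, test statistic = {z:.1f}")
```

      With samples this large, the test statistic is far beyond conventional significance thresholds even though the standardized difference is negligible, which is the pattern the box plots would be consistent with if sample size is driving the result.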

    3. Reviewer #3 (Public review):

      Summary:

      This paper explored the role of beta rhythms in the context of spatial learning and mPFC-hippocampal dynamics. The authors characterized mPFC and hippocampal beta oscillations, examining how their coordination and their spectral profiles related to learning and prefrontal neuronal firing. Rats performed two tasks, a Y-maze and an F-maze, with the F-maze task being more cognitively demanding. Across learning, prefrontal beta oscillation power increased while beta frequency decreased. In contrast, hippocampal beta power and beta frequency decreased. This was particularly the case for the well-performed and well-learned Y-maze paradigm. The authors identified the timing of beta oscillations, revealing an interesting shift in beta burst timing relative to reward entry as learning progressed. They also discovered an interesting population of prefrontal neurons that were tuned to both prefrontal beta and hippocampal sharp-wave ripple events, revealing a spectrum of SWR-excited and SWR-inhibited neurons that were differentially phase locked to prefrontal beta rhythms.

      In sum, the authors set out to examine how beta rhythms and their coordination were related to learning and goal occupancy. The authors identified a set of learning and goal-related correlates at the level of LFP and spike-LFP interactions, but did not report on spike-behavioral correlates.

      Strengths:

      Pairing dual recordings of medial prefrontal cortex (mPFC) and CA1 with learning of spatial memory tasks is a strength of this paper. The authors also discovered an interesting population of prefrontal neurons modulated by both beta and CA1 sharp-wave ripple (SWR) events, showing a relationship between SWR-excited and SWR-inhibited neurons and beta oscillation phase.

      Weaknesses:

      The authors report on a task where rats were performing sub-optimally (F-maze), weakening claims. Likewise, it is questionable as to whether mPFC and hippocampus are dually required to perform a no-delay Y-maze task at day 5, where rats are performing near 100%. There would be little reason to suspect strong oscillatory coupling when task performance is poor and/or independent of mPFC-HPC communication (Jones and Wilson, 2005), potentially weakening conclusions about independent beta rhythms. Moreover, there is little detail provided about sample sizes and how data sampling is being performed (e.g., rats, sessions, or trials), raising generalizability concerns.

    1. Reviewer #1 (Public review):

      Summary:

      The authors set out to understand how an animal without eyes responds to visible light. To do so they used the C. elegans model, which lacks eyes but nonetheless exhibits robust responses to visible light at several wavelengths. Here, the authors report a promoter that is activated by visible light and is independent of known pathways of light responses.

      Strengths:

      The authors convincingly demonstrate that visible light activates cyp-14A5 promoter-driven gene expression in a variety of contexts and report the finding that this pathway is activated via the ZIP-2 transcriptionally regulated signaling pathway.

      Weaknesses:

      Because the ZIP-2 pathway has been reported to be activated predominantly by changes in the bacterial food source of C. elegans, or by exposure of animals to pathogens, it remains unclear whether visible light activates a pathway in C. elegans (animals) or whether visible light is potentially sensed by the bacteria on the plate, which also lack eyes. Specifically, it is possible that the plates are seeded with excess E. coli, that E. coli is altered by light in some way, and that in this context it alters its behavior in a way that activates a known bacterially responsive pathway in the animals. Consistent with this possibility, the authors found that heat-killed bacteria prevented the reporter activation in animals. This weakness would not affect the ability to use this novel discovery as a tool, which would still be useful to the field.

    2. Reviewer #2 (Public review):

      Summary:

      Ji, Ma and colleagues report the discovery of a mechanism in C. elegans that mediates transcriptional responses to low intensity light stimuli. They find that light-induced transcription requires a pair of bZIP transcription factors and induces expression of a cytochrome P450 effector. This unexpected light-sensing mechanism is required for physiologically relevant gene expression that controls behavioral plasticity. The authors further show that this mechanism can be co-opted to create light-inducible transgenes.

      Strengths:

      The authors rigorously demonstrate that ambient light stimuli regulate gene expression via a mechanism that requires the bZIP factors ZIP-2 and CEBP-2. Transcriptional responses to light stimuli are measured using transgenes and using measurements of endogenous transcripts. The study shows proper genetic controls for these effects. The study shows that this light response does not require known photoreceptors, is tuned to specific wavelengths, and is highly unlikely to be an artifact of temperature-sensing. The study further shows that the function of ZIP-2 and CEBP-2 in light-sensing can be distinguished from their previously reported role in mediating transcriptional responses to pathogenic bacteria. The study includes experiments that demonstrate that regulatory motifs from a known light-response gene can be used to confer light-regulated gene expression, demonstrating sufficiency and suggesting an application of these discoveries in engineering inducible transgenes. Finally, the study shows that ambient light and the transcription factors that transduce it into gene expression changes are required to stabilize a learned olfactory behavior, suggesting a physiological function for this mechanism.

      Weaknesses:

      The study implies but does not show that the effects of ambient light on stabilizing a learned olfactory behavior are through the described pathway. To show this clearly, the authors should determine whether ambient light has any further effects on learning in mutants lacking CYP-14A5, ZIP-2, or CEBP-2.

    3. Reviewer #3 (Public review):

      Ji et al. report a novel and interesting light-induced transcriptional response pathway in the eyeless roundworm Caenorhabditis elegans that involves a cytochrome P450 family protein (CYP-14A5) and functions independently from previously established photosensory mechanisms. The authors also demonstrate the potential for this pathway to enable robust light-induced control of gene expression and behavior, albeit with some restrictions. Despite the limitations of this tool, including those presented by the authors, it could prove useful for the community. Overall, the evidence supporting the claims of the authors is convincing, and the authors' work suggests numerous interesting lines of future inquiry.

      (1) Although the exact mechanisms underlying photoactivation of this pathway remain unclear, light-dependent induction of CYP-14A5 requires bZIP transcription factors ZIP-2 and CEBP-2 that have been previously implicated in worm responses to pathogens. Notably, this light response requires live food bacteria, suggesting a microbial contribution to this phenomenon. The nature of the microbial contribution to the light response is unknown but very interesting.

      (2) The authors suggest that light-induced CYP-14A5 activity in the C. elegans hypoderm can unexpectedly and cell-non-autonomously contribute to retention of an olfactory memory. How retention of the olfactory memory is enhanced by light generally remains unclear. Additional experiments, including verification of light-dependent changes in CYP-14A5 levels in the olfactory memory behavioral setup, would help further interpret these otherwise interesting results.

    1. Reviewer #1 (Public review):

      Aw et al. have proposed that stability analysis can be useful for fine-mapping across populations. In addition, the authors have performed extensive analyses, using functional data, to understand the cases where the top eQTL and stable eQTL are the same or different.

      Comments on revisions:

      The authors have answered all my concerns.

    2. Reviewer #2 (Public review):

      Aw et al. present a new stability-guided fine-mapping method by extending the previously proposed PICS method. They applied their stability-based method to fine-map cis-eQTLs in the GEUVADIS dataset and compared it against residualization-based approaches. They evaluated the performance of the proposed method using publicly available functional annotations and demonstrated that the variants identified by their stability-based method show enrichment for these functional annotations.

      The authors have substantially strengthened the manuscript by addressing the major concerns raised in the initial review. I acknowledge that they have conducted comprehensive simulation studies to show the performance of their proposed approach and that they have extended their approach to SuSiE ("Stable SuSiE") to demonstrate the broader applicability of the stability-guided principle beyond PICS.

      One remaining question is the interpretation of matching variants with very low stable posterior probabilities (~0), which the authors have analyzed in detail but without fully conclusive findings. I agree with the authors that this event is relatively rare and the current sample size is limited but this might be something to keep in mind for future studies.

    1. Reviewer #1 (Public review):

      Summary:

      This is a careful and comprehensive study demonstrating that effector-dependent conformational switching of the MT lattice from compacted to expanded deploys the alpha tubulin C-terminal tails so as to enhance their ability to bind interactors.

      Strengths:

      The authors use 3 different sensors for the exposure of the alpha CTTs. They show that all 3 sensors report exposure of the alpha CTTs when the lattice is expanded by GMPCPP, or KIF1C, or a hydrolysis-deficient tubulin. They demonstrate that expansion-dependent exposure of the alpha CTTs works in tissue culture cells as well as in vitro.

      Appraisal:

      The authors have gone to considerable lengths to test their hypothesis that microtubule expansion favours deployment of the alpha tubulin C-terminal tail, allowing its interactors, including detyrosinase enzymes, to bind. There is a real prospect that this will change thinking in the field. One very interesting possibility, touched on by the authors, is that the requirement for MAP7 to engage kinesin with the MT might include a direct effect of MAP7 on lattice expansion.

      Impact:

      The possibility that the interactions of MAPs and motors with a particular MT or region feed forward to determine its future interaction patterns is made much more real. Genuinely exciting.

    2. Reviewer #2 (Public review):

      The unstructured α- and β-tubulin C-terminal tails (CTTs), which differ between tubulin isoforms, extend from the surface of the microtubule, are post-translationally modified, and help regulate the function of MAPs and motors. Their dynamics and extent of interactions with the microtubule lattice are not well understood. Hotta et al. explore this using a set of three distinct probes that bind to the CTTs of tyrosinated (native) α-tubulin. Under normal cellular conditions, these probes associate with microtubules only to a limited extent, but this binding can be enhanced by various manipulations thought to alter the tubulin lattice conformation (expanded or compact). These include small-molecule treatment (Taxol), changes in nucleotide state, and the binding of microtubule-associated proteins and motors. Overall, the authors conclude that microtubule lattice "expanders" promote probe binding, suggesting that the CTT is generally more accessible under these conditions. Consistent with this, detyrosination is enhanced. Mechanistically, molecular dynamics simulations indicate that the CTT may interact with the microtubule lattice at several sites, and that these interactions are affected by the tubulin nucleotide state.

      Strengths and weaknesses:

      Key strengths of the work include the use of three distinct probes that yield broadly consistent findings, and a wide variety of experimental manipulations (drugs, motors, MAPs) that collectively support the authors' conclusions, alongside a careful quantitative approach.

      The challenges of studying the dynamics of a short, intrinsically disordered protein region within the complex environment of the cellular microtubule lattice, amid numerous other binders and regulators, should not be understated. While it is very plausible that the probes report on CTT accessibility as proposed, the possibility of confounding factors (e.g., effects on MAP or motor binding) cannot be ruled out. Sensitivity to the expression level clearly introduces additional complications. Likewise, for each individual "expander" or "compactor" manipulation, one must consider indirect consequences (e.g., masking of binding sites) in addition to direct effects on the lattice; however, this risk is mitigated by the collective observations all pointing in the same direction.

      The discussion does a good job of placing the findings in context and acknowledging relevant caveats and limitations. Overall, this study introduces an interesting and provocative concept, well supported by experimental data, and provides a strong foundation for future work. This will be a valuable contribution to the field.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors investigate how the structural state of the microtubule lattice influences the accessibility of the α-tubulin C-terminal tail (CTT). By developing and applying new biosensors, they reveal that the tyrosinated CTT is largely inaccessible under normal conditions but becomes more accessible upon changes to the tubulin conformational state induced by Taxol treatment, MAP expression, or GTP-hydrolysis-deficient tubulin. The combination of live imaging, biochemical assays, and simulations suggests that the lattice conformation regulates the exposure of the CTT, providing a potential mechanism for modulating interactions with microtubule-associated proteins. The work addresses a highly topical question in the microtubule field and proposes a new conceptual link between lattice spacing and tail accessibility for tubulin post-translational modification. Future work is required to determine whether CTT exposure in the microtubule lattice is sensitive to additional factors present in vivo but not in vitro.

      Strengths:

      (1) The study targets a highly relevant and emerging topic: the structural plasticity of the microtubule lattice and its regulatory implications.

      (2) The biosensor design represents a methodological advance, enabling direct visualization of CTT accessibility in living cells.

      (3) Integration of imaging, biochemical assays, and simulations provides a multi-scale perspective on lattice regulation.

      (4) The proposed conceptual framework, in which lattice conformation is a determinant of post-translational modification accessibility, is novel and potentially impactful for understanding microtubule regulation.

      [Editors' note: the authors have responded to the reviewers and this version was assessed by the editors.]

    1. Reviewer #1 (Public review):

      While the structure of the melibiose permease in both outward and inward-facing forms has been solved previously, there remain unanswered questions regarding its mechanism. Hariharan et al. set out to address this with further crystallographic studies complemented with ITC and hydrogen deuterium exchange (HDX) mass spectrometry. They first report 4 different crystal structures of galactose derivatives to explore molecular recognition, showing that the galactose moiety itself is the main source of specificity. Interestingly, they observe a water-mediated hydrogen bonding interaction with the protein and suggest that this water molecule may be important in binding.

      The results from the crystallography appear sensible, though the resolution of the data is low, with only the structure with NPG better than 3 Å. Support for the conclusion of the water molecule in the binding site, as interpreted from the density, is given by MD studies.

      The HDX also appears to be well done and is explained reasonably well in the revision.

    2. Reviewer #3 (Public review):

      Summary:

      The melibiose permease from Salmonella enterica serovar Typhimurium (MelBSt) is a member of the Major Facilitator Superfamily (MFS). It catalyzes the symport of a galactopyranoside with Na⁺, H⁺, or Li⁺, and serves as a prototype model system for investigating cation-coupled transport mechanisms. In cation-coupled symporters, a coupling cation typically moves down its electrochemical gradient to drive the uphill transport of a primary substrate; however, the precise role and molecular contribution of the cation in substrate binding and translocation remain unclear. In a prior study, the authors showed that the binding affinity for melibiose is increased in the presence of Na⁺ by about 8-fold, but the molecular basis for the cooperative mechanism remains unclear. The objective of this study was to better understand the allosteric coupling between the Na⁺ and melibiose binding sites. To verify the determinants of sugar-recognition specificity, the authors solved the outward-facing crystal structures of a uniport mutant D59C with four sugar ligands containing different numbers of monosaccharide units (α-NPG, melibiose, raffinose, or α-MG). The structure with α-NPG bound has improved resolution (2.7 Å) compared to a previously published structure and to those with other sugars. These structures show that the specificity is clearly directed toward the galactosyl moiety. However, the increased affinity for α-NPG involves its hydrophobic phenyl group, which is positioned 4 Å from the phenyl group of Tyr26 and forms a strong stacking interaction with it. Moreover, a water molecule bound to OH-4 in the structure with α-NPG was proposed to contribute to the sugar recognition and appears on the pathway between the two specificity-determining pockets. Next, the authors used hydrogen-to-deuterium exchange coupled to mass spectrometry (HDX-MS) to analyze the changes in structural dynamics of the transporter induced by melibiose, Na⁺, or both. The data support the conclusion that the binding of the coupling cation at a remote location stabilizes the sugar-binding residues to switch to a higher-affinity state. Therefore, the coupling cation in this symporter was proposed to be an allosteric activator.

      Strengths:

      (1) The manuscript is generally well written.

      (2) This study builds on the authors' accumulated knowledge of the melibiose permease and integrates structural and HDX-MS analyses to better understand the communication between the sodium ion and sugar binding sites. The sequence coverage obtained for the HDX-MS data (86-87%) is high for a membrane protein.

      The revised manuscript shows clear improvement, and the authors have addressed my concerns in a satisfactory manner. Of note, I noticed two mistakes that should be corrected:

      - page 11. Unless I am mistaken, in the sentence "In contrast, Na+ alone or with melibiose primarily caused deprotections", "deprotections" should be corrected to "protections". The authors may wish to verify this sentence and also the previous one in the main text.

      - Figure 8 displays two cytoplasmic gates (one of them should be periplasmic)

    1. Reviewer #1 (Public review):

      Summary:

      The authors analyze transcription in single cells before and after 4000 rads of ionizing radiation. They use Seurat v5 for their analyses, which allows them to show that most of the genes cluster along the proximal-distal axis. Due to the high heterogeneity in the transcripts, they use the Herfindahl-Hirschman index (HHI) from economics, which measures market concentration. Using the HHI, they find that genes involved in several processes (like cell death, response to ROS, DNA damage response (DDR)) are relatively similar across clusters. However, ligands activating the JAK/STAT, Pvr, and JNK pathways and transcription factors Ets21C and dysf are upregulated regionally. The JAK/STAT ligands Upd1, Upd2, and Upd3 require p53 for their upregulation after irradiation, but the normal expression of Upd1 in unirradiated discs is p53-independent. This analysis also identified a cluster of cells that express tribbles, which encodes a factor that downregulates the mitosis-promoting String and Twine. These cells appear to be G2/M arrested and express numerous genes involved in apoptosis and the DDR, as well as the aforementioned ligands and TFs. As such, the tribbles-high cluster contains much of the heterogeneity.
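      For readers unfamiliar with the metric, the HHI is the sum of squared shares. The following is a minimal sketch, assuming each share is one cluster's fraction of a gene's total expression; how the authors actually defined the shares is not stated in this review, and the function names and the example values are hypothetical.

```python
def hhi(values):
    """Herfindahl-Hirschman index: sum of squared shares of non-negative values."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return sum((v / total) ** 2 for v in values)


def normalized_hhi(values):
    """Rescale HHI to [0, 1]: 0 for perfectly even, 1 for a single non-zero entry."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two clusters")
    return (hhi(values) - 1.0 / n) / (1.0 - 1.0 / n)


# Hypothetical per-cluster expression of two genes across four clusters
# (illustrative numbers only, not taken from the manuscript).
evenly_expressed = [10.0, 10.0, 10.0, 10.0]
regionally_expressed = [37.0, 1.0, 1.0, 1.0]

print(hhi(evenly_expressed))       # lowest possible value for 4 clusters (1/4)
print(hhi(regionally_expressed))   # close to 1: expression concentrated in one cluster
```

      Under this assumption, a gene expressed evenly across n clusters scores 1/n, and a gene confined to one cluster scores 1, which is why a high HHI flags regionally induced genes.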

      Strengths:

      (1) The authors have used robust methods for rearing Drosophila larvae, irradiating wing discs and analyzing the data with Seurat v5 and HHI.

      (2) These data will be informative for the field.

      (3) Most of the data is well-presented.

      (4) The literature is appropriately cited.

      Weaknesses:

      The authors have addressed my concerns in the revised article.

    2. Reviewer #2 (Public review):

      This manuscript investigates the question of cellular heterogeneity using the response of Drosophila wing imaginal discs to ionizing radiation as a model system. A key advance here is the focus on quantitatively expressing various measures of heterogeneity, leveraging single-cell RNAseq approaches. To achieve this goal, the manuscript creatively uses a metric from the social sciences called the HHI to quantify the spatial heterogeneity of expression of individual genes across the identified cell clusters. Inter- and intra-regional levels of heterogeneity are revealed. Some highlights include identification of spatial heterogeneity in expression of ligands and transcription factors after IR. Expression of some of these genes shows dependence on p53. An intriguing finding, made possible by using an alternative clustering method focusing on cell cycle progression, was the identification of a high-trbl subset of cells characterized by concordant expression of multiple apoptosis, DNA damage repair, ROS related genes, certain ligands and transcription factors, collectively representing HIX genes. This high-trbl set of cells may correspond to an IR-induced G2/M arrested cell state.

      Overall, the data presented in the manuscript are of high quality but are largely descriptive. This study is therefore perceived as a resource that can serve as an inspiration for the field to carry out follow-up experiments.

      The authors responded well to my suggestions for improvement, which were incorporated in the revised version of the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      Cruz and colleagues report a single cell RNA sequencing analysis of irradiated Drosophila larval wing discs. This is a pioneering study because prior analyses used bulk RNAseq analysis so differences at single cell resolution were not discernable. To quantify heterogeneity in gene expression, the authors make clever use of a metric used to study market concentration, the Herfindahl-Hirschman Index. They make several important observations including region-specific gene expression coupled with heterogeneity within each region and the identification of a cell population (high Trbl) that seems disproportionately responsible for radiation-induced gene expression.

      Strengths:

      Overall, the manuscript makes a compelling case for heterogeneity in the gene expression changes that occur in response to uniform induction of damage by X-rays in a single-layer epithelium. This is an important finding that would be of interest to researchers in the fields of DNA damage responses, regeneration and development.

      Weaknesses:

      The authors have addressed my concerns adequately with changes made in the revised version.

    1. Reviewer #1 (Public review):

      Summary:

      Negreira, G. et al clearly presented the challenges of conducting genomic studies in unicellular pathogens and of addressing questions related to the balance between genome integrity and instability, pivotal for survival under the stressful conditions these organisms face and for their evolutionary success. This underlies the need for powerful approaches to perform single-cell DNA analyses suited to the small and plastic Leishmania genome. Accordingly, their goal was to develop such a novel method and demonstrate its robustness.

      In this study, the authors combined semi-permeable capsules (SPCs) with primary template-directed amplification (PTA) and adapted the system to the Leishmania genome, which is about 100 times smaller than the human genome and exhibits remarkable plasticity and mosaic aneuploidy. Given the size and organization of the Leishmania genome, the challenges were substantial; nevertheless, the authors successfully demonstrated that PTA not only works for Leishmania but also represents a significantly improved whole-genome amplification (WGA) method compared with standard approaches. They showed that SPCs provide a superior alternative for cell encapsulation, increasing throughput. The methodology enabled high-resolution karyotyping and the detection of fine-scale copy number variations (CNVs) at the single-cell level. Furthermore, it allowed discrimination between genotypically distinct cells within mixed populations.

      Strengths:

      This is a high-impact study that will likely contribute to our understanding of DNA replication and the genetic plasticity of Leishmania, including its well-documented aneuploidy, somy variations, CNVs, and SNPs - all key elements for elucidating various aspects of the parasite's biology, such as genome evolution, genetic exchange, and mechanisms of drug resistance.

      Overall, the authors clearly achieved their objectives, providing a solid rationale for the study and demonstrating how this approach can advance the investigation of Leishmania's small, plastic genome and its frequent natural strain mixtures within hosts. This methodology may also prove valuable for genomic studies of other single-celled organisms.

      Weaknesses:

      The discussion section could be enriched to help readers understand the significance of the work, for instance by pointing out more clearly the obstacles to a better understanding of DNA replication in Leishmania. Alternatively, where the authors discuss the nucleotide-level results and the value of being able to compare the two strains, they could describe in more detail what this level of precision implies for those studying clonal strains or field isolates, drug resistance, or virulence.

    2. Reviewer #2 (Public review):

      Summary:

      Negreira et al. present an application of a novel single-cell genomics approach to investigate the genetic heterogeneity of Leishmania parasites. Leishmania, while also representing a major global disease with hundreds of thousands of cases annually, serves as a model to test the rigor of the sequencing strategy. Its complex karyotypic nature necessitates a method that is capable of resolving natural variation to better understand genome dynamics. Importantly, an earlier single-cell genomics platform (10x Chromium) is no longer available, and new methods need to be evaluated to fill in this gap.

      The study was designed to evaluate whether a capsule-based cell capture method combined with primary template-directed amplification (PTA) could maintain levels of genomic heterogeneity represented in an equal mixture of two Leishmania strains. This was a high bar, given the relatively small protozoan genome and prior studies that showed limitations of single-cell genomics, especially for gene-level copy number changes. Overall, the study found that semi-permeable capsules (SPC) are an effective way to isolate high-quality single cells. Additionally, short reads from amplified genomes effectively maintained the relative levels of variation in the two strains on the chromosome, gene copy, and individual base level. Thus, this method will be useful to evaluate adaptive strategies of Leishmania. Many researchers will also refer to these studies to set up SPC collection and PTA methods for their organism of choice.

      Strengths:

      (1) The use of SPC and PTA in a non-bacterial organism is novel. The study displays the utility of these methods to isolate and amplify single genomes to a level that can be sequenced, despite being a motile organism with a GC-rich genome.

      (2) The authors clearly outlined their optimization strategy and provided numerous quality-control metrics that inspire confidence in the success of achieving even chromosomal coverage relative to ploidy.

      (3) The use of two distinct Leishmania strains with known clonal status provided strong evidence that PTA-based amplification could reflect genome differences and displayed the utility of the method for studies of rare genotypes.

      (4) Evaluating the SPCs pre- and post-amplification with microscopy is a practical and robust way of determining the success of SPC formation and PTA.

      (5) The authors show that the PTA-based approach easily resolved major genotypic ploidy in agreement with a prior 10x Chromium-based study. The new method had improved resolution of drug resistance genotypes in the form of both copy-number variations and single-nucleotide polymorphisms.

      (6) In general, the authors are very thorough in describing the methods, including those used to optimize PTA lysis and amplification steps (fresh vs frozen cells, naked DNA vs sorted cells, etc). This demonstrates a depth of knowledge about the procedure and leaves few unanswered questions.

      (7) The custom, multifaceted, computational assessment of coverage evenness is a major strength of the study and demonstrates that the authors acknowledge potential computational factors that could impact the analysis.

      Weaknesses:

      (1) The rationale behind some experimental/analysis choices is not well-described. For example, the rationale behind methanol fixation and heat-lysis is unclear. Additionally, the choice of various methods to assess "evenness" is not justified (e.g. why are multiple methods needed? What is the strength of each method?). Also, there is no justification for using 100k reads for subsampling. Finally, what exactly constitutes a "confidently-called SNP"?

      (2) In the methods, the STD protocol lists a 15-minute amplification at 45C whereas the PTA protocol involves 10h at 37C. This is a dramatic difference in incubation time and should be addressed when comparing results from the two methods. It is not really a fair comparison when you look at coverage levels; of course, a 10-hour incubation is going to yield more reads than a 15-minute incubation.

      (3) There is a lack of quantitative evaluations of the SPCs. e.g. How many capsules were evaluated to assess doublets? How many capsules were detected as Syto5 positive in a successful vs an unsuccessful experiment?

      (4) The authors do not address some of the amplification results obtained under various conditions. For example, why did temperature-based lysis of STD4 lead to amplification failure? Also, what is the reason for fewer "true" cells (higher background) in the PTA samples compared to the STD samples? Is this related to issues with barcoding or, alternatively, substandard amplification as indicated by lower read amounts in some capsules (knee plots in Figure 1C)?

      (5) The paper presents limited biological relevance. Without this, the paper describes an improvement in genome amplification methods and some proof-of-concept analyses. Using a 1:1 mixture of parasites with different genotypes, the authors display the utility of the method to resolve genetic diversity, but they don't seek to understand the limits of detecting this diversity. For somy, the authors do not comment on the mixed karyotypes from the HU3 cells (Figure 3F) other than to state that this line was not clonal. For CNVs, the two loci evaluated were detected at relatively high copy number (according to Figure 4C, they are between 4 and 20 copies). Thus, the sensitivity of CNV detection from this data remains unclear; can this approach detect lower-level CNVs like duplications, or minor CNVs that do not show up in every cell?

      (6) The authors state that Leishmania can carry extrachromosomal copies of important genes. There is no discussion about how the presence of these molecules would affect the amplification steps and CNV detection. For example, the phi29 enzyme is very processive with circular molecules; does its presence lead to overamplification and overrepresentation in the data? Is this evident in the current study? This information would be useful for organisms that carry this type of genetic element.

      (7) The manuscript is missing a comparison with other similar studies in the field. For example, how does this coverage level compare to those achieved for other genomes? Can this method achieve amplification levels needed to assess larger genomes? Has there been any evaluation of base composition effects since Leishmania is a GC-rich genome?

      (8) Cost is mentioned as a benefit of the SPC platform, and savings are achieved when working in a plate format, but no details are included on how this was evaluated.

      (9) The Zenodo link for custom scripts does not exist, and code cannot be evaluated.

    3. Reviewer #3 (Public review):

      In this manuscript, Negreira et al. propose a new scDNAseq method, using semi-permeable capsules (SPCs) and primary template-directed amplification (PTA). The authors optimize several metrics to improve their predictions, such as determining GC bias, intra-chromosomal fluctuation (ICF, a metric to differentiate replicative and non-replicative cells), and intra-chromosomal coefficient of variation (ICCV, a measure of the chromosome read distribution). The coverage evenness was evaluated using the Gini index and the median absolute pairwise difference between the counts of two consecutive bins. They validate the proposed method using two Leishmania donovani strains isolated from different countries, BPK081 (low genomic variability) and HU3 (high genomic variability). Then, they showed that the method outperforms WGA and has similar accuracy to the discontinued 10X-scDNA (10X Genomics), further improving on short CNV identification. The authors also show that the method can identify somy variations, insertions/deletions and SNP variations across cells. This is a timely and very relevant work that has a wide applicability in copy number variation assessment using single-cell data.
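      The two evenness measures named above can be sketched as follows. This is a minimal illustration, assuming raw per-bin read counts for one cell as input; whether the authors normalize or log-transform the counts first is not stated in this review, and the function names and example counts are hypothetical.

```python
from statistics import median


def gini(counts):
    """Gini index of non-negative bin counts: 0 = perfectly even coverage."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("counts must be non-empty with a positive sum")
    # Rank-weighted formula over ascending values, with ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def consecutive_bin_mapd(counts):
    """Median absolute difference between the counts of consecutive bins."""
    if len(counts) < 2:
        raise ValueError("need at least two bins")
    diffs = [abs(b - a) for a, b in zip(counts, counts[1:])]
    return median(diffs)


# Hypothetical bin counts along one chromosome (illustrative only).
even_cell = [10, 12, 9, 11]
noisy_cell = [0, 30, 2, 10]

print(gini(even_cell), consecutive_bin_mapd(even_cell))
print(gini(noisy_cell), consecutive_bin_mapd(noisy_cell))
```

      The two measures are complementary: the Gini index ignores bin order and captures overall inequality of coverage, whereas the consecutive-bin difference captures local bin-to-bin noise.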

      I really appreciate this work. My congratulations to the authors. All my comments below only aim to improve an already solid manuscript.

      (1) Data availability: Although the authors provide a Zenodo link, the data is restricted. I also could not access the GitHub link in the Zenodo website: https://github.com/gabrielnegreira/2025_scDNA_paper. The authors should make these files available.

      (2) SPC-PTA and SPC-STD cell count comparison: The authors have consistently proven that the SPC-PTA method was superior to SPC-STD. However, there are a few points that should be clarified regarding the SPC-PTA results. Is there an explanation for the lower proportion of SPC to true cells success in SPC-STD, which reflects the bimodal distribution for the reads per cell in SPC-PTA2 and a three-to-multimodal distribution in SPC-PTA1 in Figure 1B? Also, in Table 1, does the number of reads reflect the number of reads in all sequenced SPCs or only in the true cells? If it is in the SPCs, I suggest that the authors add a new column in the table with the "Number of reads in true cells" to account for this discrepancy.

      (3) The authors should evaluate the results with a higher coverage for SPC-PTA. I understand that the authors subsampled the total reads to 100,000 to allow cross-sample comparisons, especially between SPC-STD and SPC-PTA. However, as they concluded that SPC-PTA was far superior, and the samples SPC-PTA1 and SPC-PTA2 had an "elbow" of 650,493 and 448,041, respectively, it might be interesting to revisit some of the estimations using only SPC-PTA samples and a higher coverage cutoff, such as 400,000.

      (4) Doublet detection: I suggest that the authors be a little more careful with their definition of doublets. The doublet detection was based on diagnostic SNPs from the two strains, BPK081 and HU3, which identify doublets between two very different and well-characterised strains. However, this method will probably not identify strain-specific doublets. This is of minor importance for cloned and stable strains with few passages, such as BPK081, but might be more relevant in more heterogeneous strains, such as HU3. Strain-specific doublets might also be relevant in other scenarios, such as multiclonal infections with different populations from the same strain in the same geographic area. One positive point is that the "between strain doublet count" was low, so the within-strain doublet count should probably be low too. The manuscript would benefit from a discussion in this regard.

      (5) Nucleotide sequence variants and phylogeny: I believe that a more careful description of the phylogenetic analysis and some limitations of the sequence variant identification would benefit the manuscript.

      (5.1) As described in the methods, the authors intentionally selected two fairly different Leishmania donovani strains, HU3 and BPK081, and confirmed that the sequence variant methodology can separate cells from each strain. It is a solid proof of concept. However, most of the multiclonal infections in natural scenarios would be caused by parasite populations that diverge by fewer SNPs, and will be significantly harder to detect. Hence, I suggest adding a short discussion about this.

      (5.2) The authors should expand on the description of the phylogenetic tree. In the HU3 tree in the left panel of Figure 5F, most of the variation is observed in ~8 cells, and extends from position 0 to position ~28,000.

      Most of the other cells are on very short branches, from ~29,000 to 30,400 (5F right panel). Assuming that this representation is a phylogram, as the branches are short, these cells diverge by approximately 100-2000 SNPs. It is unexpected (but not impossible) that such ~8 divergent cells be maintained uniquely (or in very low counts) in the culture, unless this is a multiclonal infection. I would carefully investigate these cells. They might be doublets or have more missing data than other cells. I would also suggest that a quick discussion about this should be added to the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study is an evaluation of patient variants in the kidney isoform of AE1 linked to distal renal tubular acidosis. Drawing on observations in the mouse kidney, this study extends findings to autophagy pathways in a kidney epithelial cell line.

      Strengths:

      Experimental data are convincing and nicely done.

      The revised manuscript incorporates most of the reviewer recommendations and presents a more cohesive story that is easier to read and assess. The data are convincing, of suitable quality and nicely presented. Statistical evaluation is rigorous. The link between kAE1 mutants and cell metabolism and autophagy is novel and provides insights on pathological observations in dRTA.

    2. Reviewer #2 (Public review):

      Context and significance:

      Distal renal tubular acidosis (dRTA) can be caused by mutations in a Cl-/HCO3- exchanger (kAE1) encoded by the SLC4A1 gene. The precise mechanisms underlying the pathogenesis of the disease due to these mutations are unclear, but it is thought that loss of the renal intercalated cells (ICs) that express kAE1 and/or aberrant autophagy pathway function in the remaining ICs may contribute to the disease. Understanding how mutations in SLC4A1 affect cell physiology and cells within the kidney, a major goal of this study, is an important first step to unraveling the pathophysiology of this complex heritable kidney disease.

      Summary:

      The authors identify a number of new mutations in the SLC4A1 gene in patients with diagnosed dRTA that they use for heterologous experiments in vitro. They also use a dRTA mouse model with a different SLC4A1 mutation for experiments in mouse kidneys. Contrary to previous work that speculated dRTA was caused mainly by trafficking defects of kAE1, the authors observe that their new mutants (with the exception of Y413H) traffic and localize at least partly to the basolateral membrane of polarized heterologous mIMCD3 cells, an immortalized murine collecting duct cell line. They go on to show that the remaining mutants induce abnormalities in the expression of autophagy markers and increased numbers of autophagosomes, along with an alkalinized intracellular pH. They also reported that cells expressing the mutated kAE1 had increased mitochondrial content coupled with lower rates of ATP synthesis. The authors also observed a partial rescue of the effects of kAE1 variants through artificially acidifying the intracellular pH. Taken together, this suggests a mechanism for dRTA independent of impaired kAE1 trafficking and dependent on intracellular pH changes that future studies should explore.

      Strengths:

      The authors corroborate their findings in cell culture with a well characterized dRTA KI mouse and provide convincing quantification of their images from the in vitro and mouse experiments. The data largely support the claims as stated. Some of the mutants induce different strengths of effects on autophagy and the various assays than others, and it is not clear why this is from the data in the manuscript. The authors provide discussion of potential reasons for these differences that future studies could explore.

      Weaknesses:

      The pH effects of their mutants are only explored in vitro, and the in vitro system has a number of differences from a living mouse kidney or ex vivo kidney slice.

    3. Reviewer #3 (Public review):

      Summary:

      The authors have identified novel dRTA causing SLC4A1 mutations and studied the resulting kAE1 proteins to determine how they cause dRTA. Based on a previous study on mice expressing the dRTA kAE1 R607H variant, the authors hypothesize that kAE1 variants cause an increase in intracellular pH which disrupts autophagic and degradative flux pathways. The authors clone these new kAE1 variants and study their transport function and subcellular localization in mIMCD cells. The authors show increased abundance of LC3B II in mIMCD cells expressing some of the kAE1 variants, as well as reduced autophagic flux using eGFP-RFP-LC3. These data, as well as the abundance of autophagosomes, serve as the key evidence that these kAE1 mutants disrupt autophagy. Furthermore, the authors demonstrate that decreasing the intracellular pH abrogates the expression of LC3B II in mIMCD cells expressing mutant SLC4A1. Lastly, the authors argue that mitochondrial function, and specifically ATP synthesis, is suppressed in mIMCD cells expressing dRTA variants and that mitochondria are less abundant in AICs from the kidney of R607H kAE1 mice. Overall, the authors provide evidence about how new kAE1 mutants may cause dRTA.

      Strengths:

      The authors cloned novel dRTA causing kAE1 mutants into expression vectors to study the subcellular localization and transport properties of the variants. The immunofluorescence images are generally of high quality and the authors do well to include multiple samples for all of their western blots.

    1. Reviewer #1 (Public review):

      Lahtinen et al. evaluated the association between polygenic scores and mortality. This question has been intensely studied (Sakaue 2020 Nature Medicine, Jukarainen 2022 Nature Medicine, Argentieri 2025 Nature Medicine), where most studies use PRS as an instrument to attribute death to different causes. The presented study focuses on polygenic scores of non-fatal outcomes and separates the cause of death into "external" and "internal". The majority of the results are descriptive, and the data doesn't have the power to distinguish effect sizes of the interesting comparisons: (1) differences between external vs. internal (2) differences between PGI effect and measured phenotype.

      Comments on revised version:

      The authors answered my concerns well. I don't have any further comments.

    2. Reviewer #2 (Public review):

      Summary:

      This study provides a comprehensive evaluation of the association between polygenic indices (PGIs) for 35 lifestyle and behavioral traits and all-cause mortality, using data from Finnish population- and family-based cohorts. The analysis was stratified by sex, cause of death (natural vs. external), age at death, and participants' educational attainment. Additional analyses focused on the six most predictive PGIs, examining their independent associations after mutual adjustment and adjustment for corresponding directly measured baseline risk factors.

      Strengths:

      Large sample size with long-term follow-up.

      Use of both population- and family-based analytical approaches to evaluate associations.

      Comments on revised version:

      I am happy with the revision. No further comments.

    1. Reviewer #1 (Public review):

      Summary:

      Carloni et al. comprehensively analyze which proteins bind repetitive genomic elements in Trypanosoma brucei. For this, they perform mass spectrometry on custom-designed, tagged programmable DNA-binding proteins. After extensively verifying their programmable DNA-binding proteins (using bioinformatic analysis to infer target sites, microscopy to measure localization, ChIP-seq to identify binding sites), they present, among others, two major findings: 1) 14 of the 25 known T. brucei kinetochore proteins are enriched at 177bp repeats. As T. brucei's 177bp repeat-containing intermediate-sized and mini-chromosomes lack centromere repeats but are stable over mitosis, Carloni et al. use their data to hypothesize that a 'rudimentary' kinetochore assembles at the 177bp repeats of these chromosomes to segregate them. 2) 70bp repeats are enriched with the Replication Protein A complex, which, notably, is required for homologous recombination. Homologous recombination is the pathway used for recombination-based antigenic variation of the 70bp-repeat-adjacent variant surface glycoproteins.

      Strengths and Weaknesses:

      The manuscript was previously reviewed through Review Commons. As noted there, the experiments are well controlled, the claims are well supported, and the methods are clearly described. The conclusions are convincing. All concerns I raised have been addressed except one (minor point #8):

      "The way the authors mapped the ChIP-seq data is potentially problematic when analyzing the same repeat type in different genomic regions. Reads with multiple equally good mapping positions were assigned randomly. This is fine when analyzing repeats by type, independent of genomic position, which is what the authors do to reach their main conclusions. However, several figures (Fig. 3B, Fig. 4B, Fig. 5B, Fig. 7) show the same repeat type at specific genomic locations." Due to the random assignment, all of these regions merely show the average signal for the given repeat. I find it misleading that this average is plotted out at "specific" genomic regions.<br /> Initially, I suggested a workaround, but the authors clarified why the workaround was not feasible, and their explanation is reasonable to me. That said, the figures still show a signal at positions where they can't be sure it actually exists. If this cannot be corrected analytically, it should at least be noted in the figure legends, Results, or Discussion.

      Importantly, the authors' conclusions do not hinge on this point; they are appropriately cautious, and their interpretations remain valid regardless.

      Significance:

      This work is of high significance for chromosome/centromere biology, parasitology, and the study of antigenic variation. For chromosome/centromere biology, the conceptual advancement of different types of kinetochores for different chromosomes is a novelty, as far as I know. It would certainly be interesting to apply this study as a technical blueprint for other organisms with mini-chromosomes or chromosomes without known centromeric repeats. I can imagine a broad range of labs studying other organisms with comparable chromosomes to take note of and build on this study. For parasitology and the study of antigenic variation, it is crucial to know how intermediate- and mini-chromosomes are stable through cell division, as these chromosomes harbor a large portion of the antigenic repertoire. Moreover, this study also found a novel link between the homologous repair pathway and variant surface glycoproteins, via the 70bp repeats. How and at which stages during the process, 70bp repeats are involved in antigenic variation is an unresolved, and very actively studied, question in the field. Of course, apart from the basic biological research audience, insights into antigenic variation always have the potential for clinical implications, as T. brucei causes sleeping sickness in humans and nagana in cattle. Due to antigenic variation, T. brucei infections can be chronic.

      Comments on revised version:

      All my recommendations have been addressed.

    2. Reviewer #2 (Public review):

      The Trypanosoma brucei genome, like that of other eukaryotes, contains diverse repetitive elements. Yet, the chromatin-associated proteome of these regions remains largely unexplored. This study represents a very important conceptual and technical advancement by employing synthetic TALE DNA-binding proteins fused to YFP to selectively capture proteins associated with specific repetitive sequences in T. brucei chromatin. The data presented here are convincing, supported by appropriate controls and a well-validated methodology, aligned with current state-of-the-art approaches.

      The authors used synthetic TALE DNA-binding proteins, tagged with YFP, which were designed to target five specific repeat elements in the T. brucei genome, including centromere- and telomere-associated repeats and those of a transposon element, in order to identify specific proteins that bind to these repetitive sequences in T. brucei chromatin. Validation of the approach was done using a TALE protein designed to target the telomere repeat (TelR-TALE), which detected many of the proteins previously implicated in telomeric functions. A TALE protein designed to target the 70 bp repeats that reside adjacent to the VSG genes (70R-TALE) detected proteins that function in DNA repair, and a protein designed to target the 177 bp repeat arrays (177R-TALE) identified kinetochore proteins associated with T. brucei megabase chromosomes, as well as with intermediate and mini-chromosomes, which implies that kinetochore assembly and segregation mechanisms are similar in all T. brucei chromosomes.

      This study represents a significant conceptual and technical advancement. To the best of our knowledge, it is the first report of employing TALE-YFP for affinity-based detection of protein complexes bound to repetitive genomic sequences in T. brucei. This approach enhances our understanding of the organization of these important regions of the trypanosomal chromatin and provides the foundation for investigating the functional roles of associated proteins in parasite biology. These findings will be of particular interest to researchers studying the molecular biology of kinetoplastid parasites and other unicellular organisms, as well as to scientists investigating the roles of repetitive genomic elements in chromatin structure and their functional role in higher eukaryotes.

      Importantly, any essential or unique interacting partners identified using the approach employed here could serve as potential targets for therapeutic intervention in severe tropical diseases caused by kinetoplastids.

    1. Reviewer #1 (Public review):

      Summary:

      The authors assess the role of map3k1 in adult Planaria through whole body RNAi for various periods of time. The authors' prior work has shown that neoblasts (stem cells that can regenerate the entire body) for various tissues are intermingled in the body. Neoblasts divide to produce progenitors that migrate within a "target zone" to the "differentiated target tissues" where they differentiate into a specific cell type. Here the authors show that map3k1-i animals have ectopic eyes that form along the "normal" migration path of eye progenitors, ectopic neurons and glands along the AP axis, and pharynx in ectopic anterior positions. The rest of the study shows that positional information is largely unaffected by loss of map3k1. However, loss of map3k1 leads to premature differentiation of progenitors along their normal migratory route. They also show that "long-term" whole body depletion of map3k1 results in mis-specified organs and teratomas. In short, this study convincingly demonstrates that in planaria, map3k1 maintains progenitor cells in an undifferentiated state, preventing premature fate commitment until they encounter the appropriate signals, either positional cues within a designated region or contact-dependent inputs from surrounding tissues.

      Strengths:

      (1) The study has appropriate controls, sample sizes and statistics.

      (2) The work is high-quality.

      (3) The conclusions are supported by the data.

      (4) Planaria is a good system to analyze the function of map3k1, which exists in mammals but not other invertebrates.

      Weaknesses:

      None noted.

    2. Reviewer #2 (Public review):

      Summary:

      The flatworm planarian Schmidtea mediterranea is an excellent model for understanding cell fate specification during tissue regeneration and adult tissue maintenance. Planarian stem cells, known as neoblasts, are continuously deployed to support cellular turnover and repair tissues damaged or lost due to injury. This reparative process requires great precision to recognize the location, timing, and cellular fate of a defined number of neoblast progeny. Understanding the molecular mechanisms driving this process could have important implications for regenerative medicine and enhance our understanding of how form and function are maintained in long-lived organisms such as humans. Unfortunately, the molecular basis guiding cell fate and differentiation remains poorly understood.

      In this manuscript, Canales et al. identified the role of the map3k1 gene in mediating the differentiation of progenitor cells at the proper target tissue. The map3k1 function in planarians appears evolutionarily conserved as it has been implicated in regulating cell proliferation, differentiation, and cell death in mammals. The results show that the downregulation of map3k1 with RNAi leads to spatial patterning defects in different tissue types, including the eye, pharynx, and the nervous system. Intriguingly, long-term map3k1-RNAi resulted in ectopic outgrowths consistent with teratomas in planarians. The findings suggest that map3k1 mediates signaling, regulating the timing and location of cellular progenitors to maintain correct patterning during adult tissue maintenance.

      Strengths:

      The authors provide an entry point to understanding molecular mechanisms regulating progenitor cell differentiation and patterning during adult tissue maintenance.

      The diverse set of approaches and methods applied to characterize map3k1 function strengthens the case for conserved evolutionary mechanisms in a selected number of tissue types. The creativity using transplantation experiments is commendable, and the findings with the teratoma phenotype are intriguing and worth characterizing.

      Weaknesses:

      The authors have satisfactorily addressed our previous concerns.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Joshi and colleagues demonstrates that the precise theta-phase timing of spikes is causal for CA1 hippocampal theta sequences during locomotion on a linear track and is necessary for learning the cognitively demanding outbound component of a hippocampus-dependent alternation task (W-maze), independently of replay during immobility. To reach these conclusions, the authors developed a theta-phase-specific, closed-loop manipulation that used optogenetic activation of medial septal parvalbumin (PV) interneurons at the ascending phase of theta during locomotion. This protocol preserved immobility periods, allowing a clean and elegant dissociation from SWR-associated replay.

      The manuscript is well written and was a pleasure to read. The work described is of high quality and introduces several notable advances to the field:

      (a) It extends prior studies that manipulated theta oscillations by examining precise temporal structure (specifically theta sequences) rather than only LFP features.

      (b) The closed-loop manipulation enabled dissociation between deficits in theta sequences during a behavioural task and SWR-associated replay activity.

      (c) As controls, the authors included rats with suboptimal viral transduction or optic-fibre placement, and, within subjects, both stimulation-on (stim-on) and stimulation-off (stim-off) trials. Notably, sequence disruption persisted into stim-off periods within the same session.

      Overall, this is a strong manuscript that will provide valuable insights to the field. I have only minor comments:

      (1) As the authors note, it is striking that both behavioural performance and spike patterns are altered during stim-off trials. They propose that "disruption of theta sequences during the initial experience in an environment is sufficient to have lasting effects," implying that rapid, experience-dependent plasticity is driven by sequential firing. Does this imply that if rats were previously trained on the task, subsequent stim-on and stim-off trials would yield different outcomes, with stim-off trials showing improved performance and intact theta sequences? For example, if the sequence of one-third stim-on, one-third stim-off, one-third stim-on were inverted to off-on-off, would theta sequences be expected to emerge, disappear, and potentially re-emerge? While I am not asking for additional experiments, I think the discussion could be extended in this aspect.

      Alternatively, could the number of stim-off trials (one third of the total) be insufficient to support learning/induce plasticity? In the controls, ~50-100 trials appear necessary to achieve high performance.

      (2) In line with the point above, the authors characterise the behavioural changes induced by MS optogenetic stimulation specifically as a "learning deficit," as rats failed to improve across 300 trials in an initially novel environment (W-maze). While they present this as complementary to prior demonstrations of impaired performance on previously learned tasks (Zutshi et al., 2018; Quirk et al., 2021; Etter et al., 2023; Petersen et al., 2020), an alternative interpretation is a working-memory deficit. This would produce the same behavioural pattern, with reference memory (the less cognitively demanding trials) remaining intact despite stimulation and concomitant changes in theta sequences. This interpretation would also be consistent with work in certain disease models, where reduced synaptic plasticity and working-memory deficits co-occur with preserved place coding despite impaired theta sequences (e.g., Viana da Silva et al., 2024; Donahue et al., 2025).

      (3) It was not immediately clear whether SWR-associated activity was derived from the interleaved ~15-min rest sessions in a rest box, or from periods of immobility or reward consumption in the maze (aSWR, as in Jadhav et al 2012). Regardless, it would be informative to compare aSWR events within the maze to rest-box SWRs that may occur during more prolonged slow-wave episodes (even if not full sleep). This contrasts with Liu et al. (2024), who analysed replay during ~1.5-h sleep sessions.

    2. Reviewer #2 (Public review):

      Summary:

      The authors of this study developed a closed-loop optogenetic stimulation system with high temporal precision in rats to examine the effect of medial septum (MS) stimulation on the disruption of hippocampal activity at both behavioral and compressed time scales. They found that this manipulation preserved hippocampus single-cell-level spatial coding but affected theta sequences and performance during a spatial alternation task. The performance deficits were observed during the more cognitively demanding component of the task and even persisted after the stimulation was turned off. However, the effects of this disruption were confined to locomotor periods and did not impact waking rest replay, even during the early phase of stimulation-on. Their conclusion is consistent with previous findings from the Pastalkova lab, where MS disruption (using different methods) affected theta sequences and task performance but spared replay (Wang et al., 2015; Wang et al., 2016). However, it differs from a recent study in which optogenetic disruption of EC inputs during running affected both theta sequences and replay (Liu et al., 2023).

      Strengths:

      The experiments were well designed and controlled, and the results were generally well presented.

      Weaknesses:

      Major concerns are primarily technical but also conceptual. To further increase the impact of this study by contrasting findings from different disruptions, it is necessary to better align the analysis and detection methods.

      Major concerns:

      (1) To show that MS disruption does not affect spatial tuning, the authors computed the KL divergence of tuning curves between stimulation-on and stimulation-off conditions. I have two main questions about this analysis:

      (1.1) The authors seem to impose stringent inclusion criteria requiring a large number of spikes and a strong concentration of tuning curves. These criteria may have selected strongly spatially tuned cells, which are typically more stable and potentially less vulnerable to perturbations. Based on the Figure 2 caption, it seems that fewer than 10% of cells were included in the KL divergence analysis, which is lower than the usual proportion of place cells reported in the literature. What is the rationale for using such strict inclusion criteria? What happens to the cells that are not as strongly tuned but are still identified as significant place cells?

      (1.2) The KL divergence was computed between stimulation-on and stimulation-off conditions within the same animal group. However, the authors also showed that MS stimulation had lasting effects on theta sequences and performance even during stimulation-off periods. Would that lasting effect also influence spatial tuning? Based on these questions, the authors should perform additional analyses that directly measure spatial tuning quality and compare results across control and experimental groups - for example, spatial information of spikes (Skaggs et al., 1996), tuning stability, field length, and decoding error during running.
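
      As an illustration of the two measures discussed in this point, the following is a minimal sketch of the spatial information measure (bits per spike, following the standard Skaggs formulation) and of a KL divergence between two tuning curves. It is not the authors' analysis code: the function names, the input layout (one rate and one occupancy value per spatial bin), and the `eps` regularization are assumptions made for illustration.

```python
import math

def spatial_information(rates, occupancy):
    """Spatial information in bits per spike.

    rates: mean firing rate in each spatial bin (Hz).
    occupancy: time spent in each bin (any unit; normalized here).
    """
    total_time = sum(occupancy)
    p = [t / total_time for t in occupancy]
    mean_rate = sum(pi * ri for pi, ri in zip(p, rates))
    if mean_rate <= 0:
        return 0.0
    info = 0.0
    for pi, ri in zip(p, rates):
        # Bins with zero rate or zero occupancy contribute nothing.
        if pi > 0 and ri > 0:
            ratio = ri / mean_rate
            info += pi * ratio * math.log2(ratio)
    return info

def kl_divergence(curve_a, curve_b, eps=1e-9):
    """KL divergence (bits) from curve_b to curve_a.

    Both curves are normalized to sum to 1; eps (an assumed
    regularizer) avoids division by zero in empty bins.
    """
    a = [x + eps for x in curve_a]
    b = [x + eps for x in curve_b]
    sa, sb = sum(a), sum(b)
    return sum((x / sa) * math.log2((x / sa) / (y / sb)) for x, y in zip(a, b))

# A cell firing in one of four equally visited bins carries 2 bits per spike.
print(spatial_information([8, 0, 0, 0], [1, 1, 1, 1]))
```

      Computing spatial information for every significant place cell in both groups, rather than KL divergence on a strongly tuned subset, would allow a direct between-group comparison of tuning quality.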

      (2) The authors compared their results with those from Liu et al. (2023) and proposed that the different outcomes could be explained by different sites of disruption. However, the detection and quantification methods for theta sequences and replay differ substantially between the two studies, emphasizing different aspects of the phenomenon. I am not suggesting that either method is superior, but providing additional analyses using aligned detection methods would better support the authors' interpretations and benefit the field by enabling clearer comparisons across studies. In the current analysis, the power spectrum of the decoded ahead/behind distance only indicates that there is a rhythmic pattern, without specifying the decoding features at different theta phases. Moreover, the continuous non-local representations during ripples could include stationary representations of a location or zigzag representations that do not exhibit a linear sequential trace. Given that, the authors should show averaged decoding results corrected by the animal's actual position within theta cycles and compute a quadrant ratio. For replay analysis, they could use a linear fit (as in Liu et al., 2023) and report the proportion of significant replay events.

      (3) The finding that theta sequences and performance were impaired even during stimulation-off periods is particularly interesting and warrants deeper exploration. In the Discussion, the authors claim that this may arise from "the rapid plasticity engaged during early learning." However, this explanation does not fully account for the observation. Previous studies have shown that theta sequences can develop very rapidly (Feng et al., Foster lab, 2015; Zhou et al., Dragoi lab, 2025). If the authors hypothesize that rapid plasticity during early stimulation-on disrupts the theta sequence, then the plasticity window must also be short and terminate during the subsequent stimulation-off period. Otherwise, why can't animals redevelop theta sequences during stimulation-off? The authors should conduct additional analyses during the stimulation-off periods of the W-maze task. For example:

      (3.1) What is the spike-theta phase relationship? Do the phases return to normal or remain altered as during stimulation-on?

      (3.2) Is there a significant place-field remapping from stimulation-on to stimulation-off? (Supplementary Figure 3F includes only a small subset of cells; what if population vector correlations are computed across all cells, or Bayesian decoding of stimulation-on spikes is performed using stimulation-off tuning curves?)

      (3.3) The authors should also discuss why the stimulation-off epochs were not sufficient to support learning, and if the stimulation-off place cell sequences could have supported replay.
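
      The population vector correlation suggested in (3.2) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each condition is represented as a list of per-cell rate maps with the same cells and the same spatial bins, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def population_vector_correlation(maps_on, maps_off):
    """Correlate population vectors bin by bin.

    maps_on, maps_off: lists of rate maps (cells x spatial bins),
    one per condition, with cells in the same order.
    Returns one correlation per spatial bin.
    """
    n_bins = len(maps_on[0])
    corrs = []
    for b in range(n_bins):
        pv_on = [cell[b] for cell in maps_on]    # all cells at bin b, stim-on
        pv_off = [cell[b] for cell in maps_off]  # all cells at bin b, stim-off
        corrs.append(pearson(pv_on, pv_off))
    return corrs

maps = [[1, 0, 2], [0, 3, 1], [2, 1, 0]]
print(population_vector_correlation(maps, maps))
```

      Values near 1 at every bin would indicate stable population coding between stimulation-on and stimulation-off; systematically lower values would point to remapping.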

      (4) Citations and/or discussion of key studies relevant to the current work are missing: Wang et al. in Pastalkova lab 2015-2016 studies for disruption of theta sequence (but not place cell sequence) disrupting learning but not replay, Drieu et al. in Zugaro lab 2018 study on disruption of theta sequence affecting sleep replay, Farooq and Dragoi 2019 for association between a lack of theta sequence and presence of waking rest replay during postnatal development, etc. The authors should discuss what the conceptually new findings in the current study are, given the findings of the previous literature above.

      (5) The assessment of theta sequence is not state-of-the-art:

      (5.1) Detecting the peak of cross-correlograms between neurons (CCG) relates to behavioral timescale CCG, not the theta sequence one; for the theta sequence, the closest to zero local peak should be used instead.

      (5.2) How did other methods of detecting theta sequences (Bayesian decoding, firing sequences) perform on the stimulation-on/stimulation-off data?

      (5.3) How was phase precession affected during stimulation-on/stimulation-off?
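
      To make the distinction in (5.1) concrete, the sketch below contrasts the global CCG peak with the local peak closest to zero lag. It is an illustration under stated assumptions, not a prescribed method: the CCG is assumed to be already binned (and, in practice, smoothed), and the function names and example values are hypothetical.

```python
def global_peak(lags_ms, counts):
    """Lag of the largest CCG bin (reflects behavioral-timescale offset)."""
    best = max(range(len(counts)), key=counts.__getitem__)
    return lags_ms[best]

def local_peak_nearest_zero(lags_ms, counts):
    """Lag of the local maximum closest to zero lag (theta-timescale offset).

    Returns None when the CCG has no interior local maximum.
    """
    peaks = [
        i for i in range(1, len(counts) - 1)
        if counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
    ]
    if not peaks:
        return None
    best = min(peaks, key=lambda i: abs(lags_ms[i]))
    return lags_ms[best]

# Toy CCG: the largest peak is at 200 ms, but the peak nearest zero is at 50 ms.
lags = [-150, -100, -50, 0, 50, 100, 150, 200, 250]
counts = [1, 2, 1, 3, 5, 2, 4, 9, 3]
print(global_peak(lags, counts), local_peak_nearest_zero(lags, counts))
```

      In the toy example the two measures disagree, which is the situation the comment warns about: only the near-zero local peak reflects the compressed, theta-timescale ordering of the cell pair.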

      (6) It would be important to calculate additional variables in the replay part of the study to compare the quality of replay across the 2 groups:

      (6.1) Proportion of significant replay events out of the detected multiunit events.

      (6.2) The average extent of trajectory depicted by the significant replay events in the targeted compared to the control, stimulation-on/stimulation-off.

    3. Reviewer #3 (Public review):

      Joshi et al. present an elegant and technically rigorous study examining how the temporal structure of hippocampal spiking during locomotion contributes to spatial learning. Using a closed-loop, theta phase-specific optogenetic manipulation of medial septal parvalbumin-expressing neurons in rats, the authors demonstrate that disrupting theta-timescale coordination impairs performance on the cognitively demanding outbound component of a spatial alternation task, while sparing hippocampal replay, place coding, and the simpler inbound learning. The work aims to dissociate the role of theta-associated temporal organization during navigation from sharp-wave ripple-associated replay during subsequent rest periods, providing a mechanistic link between theta sequences and learning. The findings have important implications for models of septo-hippocampal coordination and the functional segregation between online (theta) and offline (SWR) network states. That said, there are a few conceptual and methodological issues that need to be addressed.

      One concern is the overall novelty of this work; the dissociation between online temporal sequences and offline replay events following memory deficits has previously been shown by Wang et al., 2016, eLife. While the authors discuss Liu et al., 2023, which demonstrates that MEC activation of inhibitory neurons at gamma frequencies during locomotion disrupts theta sequences, subsequent replay, and learning (line 65-66), they do not reference Wang et al., 2016, who performed a very similar study with MS pharmacological inactivation and reported large decreases in theta power and attenuated theta frequencies together with behavioural deficits, while SWR replay persisted. Given the strong similarities in the manipulation and findings, this study should be discussed.

      Along the same lines, it should be noted that Brandon et al. (2014, Neuron) demonstrated that hippocampal place codes can still form in novel environments despite MS inactivation and loss of theta, indicating that spatial representations can emerge without intact septal drive. Referencing this study would strengthen the discussion of how temporal coordination, rather than spatial coding per se, underlies the learning deficits observed here.

      The conclusion that disrupting "theta microstructure" impairs learning relies on the assumption that the observed behavioral deficits arise from altered temporal coding from within hippocampal CA1 only. However, optogenetic modulation of medial septal PV neurons influences multiple downstream regions (entorhinal cortex, retrosplenial cortex) via widespread GABAergic projections. While the authors do touch on this, their discussion should expand to include the network-level consequences of entorhinal grid-cell disruption and how this could affect temporal coding both online and offline.

      The finding that replay content, rate, and duration are unchanged is critical to the paper's claim of dissociation. However, the analysis is restricted to immobility on the track. Given evidence for distinct awake vs. sleep replay, confirming that off-track rest and post-session sleep replays are similarly unaffected would confirm the conclusions of the paper. If these data are unavailable, the limitation should be acknowledged explicitly. Moreover, statistical power for detecting subtle differences in replay organization or spatial bias should be added to the supplement (n of events per animal, variability across sessions).

      The exact protocol for optogenetic stimulation is a bit confusing. For the task, trials in the first and final thirds (66%) were disrupted, and stimulation was delivered only when the animal was away from the reward well and moving. What proportion of time within "stimulated" trials remained unstimulated? Why were only 66% of trials stimulated?

    1. Reviewer #1 (Public review):

      Summary:

      This study examines the role of the long non-coding RNA Dreg1 in regulating Gata3 expression and ILC2 development. Using Dreg1-deficient mice, the authors show a selective loss of ILC2s but not T or NK cells, suggesting a lineage-specific requirement for Dreg1. By integrating public chromatin and TF-binding datasets, they propose a Tcf1-Dreg1-Gata3 regulatory axis. The topic is relevant for understanding epigenetic regulation of ILC differentiation.

      Strengths:

      (1) Clear in vivo evidence for a lineage-specific role of Dreg1.

      (2) Comprehensive integration of genomic datasets.

      (3) Cross-species comparison linking mouse and human regulatory regions.

      Weaknesses:

      (1) Mechanistic conclusions remain correlative, relying on public data.

      (2) Lack of direct chromatin or transcriptional validation of Tcf1-mediated regulation.

      (3) Human enhancer function is not experimentally confirmed.

      (4) Insufficient methodological detail and limited mechanistic discussion.

    2. Reviewer #2 (Public review):

      The authors investigate the role of the long non-coding RNA Dreg1 for the development, differentiation, or maintenance of group 2 ILC (ILC2). Dreg1 is encoded close to the Gata3 locus, a transcription factor implicated in the differentiation of T cells and ILC, and in particular of type 2 immune cells (i.e., Th2 cells and ILC2). The center of the paper is the generation of a Dreg1-deficient mouse. While Dreg1-/- mice did not show any profound αβ T or γδ T cell, ILC1, ILC3, and NK cell phenotypes, ILC2 frequencies were reduced in various organs tested (small intestine, lung, visceral adipose tissue). In the bone marrow, immature ILC2 or ILC2 progenitors were reduced, whereas a common ILC progenitor was overrepresented, suggesting a differentiation block. Using ATAC-seq, the authors find that the promoter of Dreg1 is open in early lymphoid progenitors, and the acquisition of chromatin accessibility downstream correlates with increased Dreg1 expression in ILC2 progenitors. Examining publicly available Tcf1 CUT&Run data, they find that Tcf1 was specifically bound to the accessible sites of the Dreg1 locus in early innate lymphoid progenitors. Finally, the syntenic region in the human genome contains two non-coding RNA genes with an expression pattern resembling mouse Dreg1.

      The topic of the manuscript is interesting. However, there are various limitations that are summarized below.

      (1) The authors generated a new mouse model. The strategy should be better described, including the genetic background of the initially microinjected material. How many generations was the targeted offspring backcrossed to C57BL/6J?

      (2) The data is obtained from mice in which the Dreg1 gene is deleted in all cells. A cell-intrinsic role of Dreg1 in ILC2 has not been demonstrated. It should be shown that Dreg1 is required in ILC2 and their progenitors.

      (3) How Dreg1 contributes to the differentiation and/or maintenance of ILC2 is not addressed definitively. Does Dreg1 affect Gata3 expression, mRNA stability, or turnover in ILC2? Previous work of the authors indicated that knockdown of Dreg1 does not affect Gata3 expression (PMID: 32970351).

      (4) How Dreg1 exactly affects ILC2 differentiation remains unclear.

    1. Reviewer #1 (Public review):

      Summary:

      The authors provide a resource to the systems neuroscience community by offering their Python-based CLoPy platform for closed-loop feedback training. In addition to using neural feedback, as is common in these experiments, they include a capability to use real-time movement extracted from DeepLabCut as the control signal. The methods and repository are detailed for those who wish to use this resource. Furthermore, they demonstrate the efficacy of their system through a series of mesoscale calcium imaging experiments. These experiments use a large number of cortical regions for the control signal in the neural feedback setup, while the movement feedback experiments are analyzed more extensively. The revised preprint has improved substantially upon the previous submission.

      Strengths:

      The primary strength of the paper is the availability of their CLoPy platform. Currently, most closed-loop operant conditioning experiments are custom built by each lab, and carry a relatively large startup cost to get running. This platform lowers the barrier to entry for closed-loop operant conditioning experiments, in addition to making the experiments more accessible to those with less technical expertise.

      Another strength of the paper is the use of many different cortical regions as control signals for the neurofeedback experiments. Rodent operant conditioning experiments typically record from the motor cortex, and maybe one other region. Here, the authors demonstrate that mice can volitionally control many different cortical regions not limited to those previously studied, recording across many regions in the same experiment. This demonstrates the relative flexibility of modulating neural dynamics, including in non-motor regions.

      Finally, adapting the closed-loop platform to use real-time movement as a control signal is a nice addition. Incorporating movement kinematics into operant conditioning experiments has been a challenge due to the increased technical difficulties of extracting real-time kinematic data from video data at a latency where it can be used as a control signal for operant conditioning. In this paper, they demonstrate that the mice can learn the task using their forelimb position, at a rate that is quicker than the neurofeedback experiments.

      Weaknesses:

      Many of the original weaknesses have been addressed in the revised preprint.

      While the dataset contains an impressive number of animals and cortical regions for the neurofeedback experiment, my excitement for these experiments is tempered by the relative incompleteness of the dataset.

      Additionally, adoption of the platform may be hindered by the absence of a tutorial on how to run a session.

    2. Reviewer #2 (Public review):

      Summary:

      In this work, Gupta & Murphy present several parallel efforts. On one side, they present the hardware and software they use to build a head-fixed mouse experimental setup that they use to track in "real-time" the calcium activity in one or two spots at the surface of the cortex. On the other side, they present another setup that they use to take advantage of the "real-time" version of DeepLabCut with their mice. The hardware and software that they used/developed are described at length, both in the article and in a companion GitHub repository. Next, they present experimental work that they have done with these two setups, training mice to max out a virtual cursor to obtain a reward, by taking advantage of auditory tone feedback that is provided to the mice as they modulate either (1) their local cortical calcium activity, or (2) their limb position.

      Strengths:

      This work illustrates that body movement tracking and calcium imaging can be carried out using readily available components, including imaging the brain with an incredibly cheap consumer electronics RGB camera (RGB Raspberry Pi Camera). It is a useful source of information for researchers who may be interested in building a similar setup, given the highly detailed overview of the system. Finally, it further confirms previous findings regarding the operant conditioning of the calcium dynamics at the surface of the cortex (Clancy et al. 2020) and suggests an alternative based on DeepLabCut to the motor tasks that aim to image the brain at the mesoscale during forelimb movements (Quarta et al. 2022).

      Weaknesses:

      This work covers 3 separate research endeavors: (1) The development of two separate setups and their corresponding software. (2) A study that is highly inspired by the Clancy et al. 2021 paper on the modulation of the local cortical activity measured through a mesoscale calcium imaging setup. (3) A study of the mesoscale dynamics of the cortex during forelimb movement learning. Sadly, the analyses of the physiological data appear incomplete, and more generally, the paper shows weaknesses regarding several points:

      The behavioral setups that are presented are representative of the state of the art in the field of mesoscale imaging/head fixed behavior community, rather than a highly innovative design. Still, they definitely have value as a starting point for laboratories interested in implementing such approaches.

      Throughout the paper, there are several statements that point out how important it is to carry out this work in a closed-loop setting with auditory feedback, but sadly there is no "no feedback" control in the cortical conditioning experiments. There is, however, a no-feedback condition in the forelimb movement study, which shows that learning of the task can be achieved in the absence of feedback.

      The analysis of the closed-loop neuronal data behavior lacks controls. Increased performance can be achieved by actively modulating only one of the two ROIs, but this is not really analyzed, even though this finding, which does not match previous reports (Clancy et al. 2020), would be important to examine further.

    3. Reviewer #3 (Public review):

      Summary:

      The study demonstrates the effectiveness of a cost-effective closed-loop feedback system for modulating brain activity and behavior in head-fixed mice. The authors tested a real-time closed-loop feedback system in head-fixed mice with two types of graded feedback: 1) closed-loop neurofeedback (CLNF), where feedback is derived from neuronal activity (calcium imaging), and 2) closed-loop movement feedback (CLMF), where feedback is based on observed body movement. It is a Python-based open-source system, which the authors call CLoPy. The authors also claim to provide all software, hardware schematics, and protocols to adapt it to various experimental scenarios. The system is capable and can be adapted to a wide range of use cases.

      The authors have shown that their system can control both positive (water drop) and negative (buzzer-vibrator) reinforcement. The study also shows that, using the closed-loop system, mice performed better, learnt arbitrary tasks, and could adapt to changes in the rules. By integrating real-time feedback based on cortical GCaMP imaging and behavior tracking, the authors have provided strong evidence that such closed-loop systems can be instrumental in exploring the dynamic interplay between brain activity and behavior.

      Strengths:

      Simplicity of feedback systems design. Simplicity of implementation and potential adoption.

      Weaknesses:

      Long latencies, due to slow Ca2+ dynamics and slow imaging (15 FPS), may limit the application of the system.

    1. Reviewer #1 (Public review):

      Summary:

      Here Bansal et al. present a study on the fundamental blood and nectar feeding behaviors of the critical disease vector Anopheles stephensi. The study not only characterizes the fundamental changes in blood feeding behaviors of this crucially understudied vector, but also uses a transcriptomic approach to identify candidate neuromodulation pathways which influence blood feeding behavior in this mosquito species. The authors then provide evidence through RNAi knockdown of candidate pathways that the neuromodulators sNPF and RYa modulate feeding either via their physiological activity in the brain alone or through joint physiological activity along the brain-gut axis (but critically not the gut alone). Overall, I found this study to be built on tractable, well-designed behavioral experiments.

      Their study begins with a well-structured experiment to assess how the feeding behaviors of A. stephensi change over the course of its life history and in response to its age, mating and oviposition status. The authors are careful to validate their experimental paradigm in the more well-studied Ae. aegypti, and are able to recapitulate the results of prior studies which show that mating is a prerequisite for blood feeding behaviors in Ae. aegypti. Here they find that A. stephensi, like other anopheline mosquitoes, has a more nuanced regulation of its blood and nectar feeding behaviors.

      The authors then go on to show in a Y-maze olfactometer that, to some degree, changes in blood feeding status depend on behavioral modulation to host-cues, and that this is not likely to be a simple change to the biting behaviors alone. I was especially struck by the swap in valence of the host-cues for the blood-fed and mated individuals which had not yet oviposited. This indicates that there is a change in behavior that is not simply desensitization to host-cues while navigating in flight, but something much more exciting happening.

      The authors then use a transcriptomic approach across the blood feeding stages of the mosquito's life cycle to identify a list of 9 candidate genes which have a role in regulating the host-seeking status of A. stephensi. Then, through gene knockdown of these candidates, they identify the dual action of RYa and sNPF as candidate neuromodulators of host-seeking in this species. Overall, I found the experiments to be well-designed. I found the molecular approach to be sound. While I do not think the molecular approach is necessarily an all-encompassing mechanism identification (owing mostly to the fact that genetic resources are not yet available in A. stephensi as they are in other dipteran models), I think it sets up rich lines of research questions for the neurobiology of mosquito behavioral plasticity and comparative evolution of neuromodulator action.

      Strengths:

      I am especially impressed by the authors' attention to small details in the course of this article. As I read and evaluated this article I continued to think how many crucial details I may have missed if I were the scientist conducting these experiments. That attention to detail paid off in spades and allowed the authors to carefully tease apart molecular candidates of blood-seeking stages. The authors' top-down approach to identifying RYamide and sNPF, starting from first-principles behavioral experiments, is especially comprehensive. The results from both the behavioral and molecular target studies will have broad implications for the vectorial capacity of this species and comparative evolution of neural circuit modulation.

      I believe the authors have adequately addressed all of my concerns; however, I think an accompanying figure to match the explained methods of the tissue-specific knockdown would help readers. The methods are now explicitly written for the timing and concentrations required to achieve tissue-specific knockdown, but seeing the data as a supplement would be especially reassuring given the critical nature of tissue-specific knockdown to the final interpretations of this paper.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Bansal et al. examine and characterize feeding behaviour in Anopheles stephensi mosquitoes. While sharing some similarities to the well-studied Aedes aegypti mosquito, the authors demonstrate that mated females, but not unmated (virgin) females, exhibit suppression in their blood-feeding behaviour. Using brain transcriptomic analysis comparing sugar-fed, blood-fed and starved mosquitoes, several candidate genes potentially responsible for influencing blood-feeding behaviour were identified, including two neuropeptides (short NPF and RYamide) that are known to modulate feeding behaviour in other mosquito species. Using molecular tools including in situ hybridization, the authors map the distribution of cells producing these neuropeptides in the nervous system and in the gut. Further, by implementing systemic RNA interference (RNAi), the study suggests that both neuropeptides appear to promote blood-feeding (but do not impact sugar feeding), although the impact was observed only after both neuropeptide genes underwent knockdown.

      While the authors have addressed most of the concerns of the original manuscript, a few issues remain. Particularly, the following two points:

      (5) Figure 4

      The authors state that there is more efficient knockdown in the head of unfed females; however, this is not accurate since they only get knockdown in unfed animals, and no evidence of any knockdown in fed animals (panel D). This point should be revised in the results text as well.

      Perhaps we do not understand the reviewer's point or there has been a misunderstanding. In Figure 4D, we show that while there is more robust gene knockdown in unfed females, blood-fed females also showed modest but measurable knockdowns ranging from 5-40% for RYamide and 2-21% for sNPF.

      NEW-

      In both the dsRNA treatments where animals were fed, neither was significantly different from control. Therefore, there is no change, and indeed this is confirmed by the authors' labelling of the figure stats in panel 4D.

      In addition, do the uninjected and dsGFP-injected relative mRNA expression data reflect combined RYa and sNPF levels? Why is there no variation in these data,...

      In these qPCRs, we calculated relative mRNA expression using the delta-delta Ct method (see line 975). For each neuropeptide its respective control was used. For simplicity, we combined the RYa and sNPF control data into a single representation. The value of this control is invariant because this method sets the control baseline to a value of 1.

      NEW-

      The authors are claiming that there is no variation between individual qPCR experiments (particularly in their controls)? Normally, one uses a known standard value (or calibrator) across multiple experiments/plates so that variation across biological replicates can be assessed. This has an impact on statistical analyses since there is no variation in the control data. Indeed, this impacts all figures/datasets in the manuscript where qPCR data is presented. All the controls have zero variation!
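
      The concern about zero-variance controls can be illustrated with a minimal sketch. The Ct values below are made-up numbers, not data from the manuscript, and the function names are hypothetical. The sketch calibrates every replicate, controls included, to the mean control delta Ct, so the control group keeps its replicate-to-replicate spread around 1 instead of being fixed at exactly 1.

```python
from statistics import mean, stdev

def delta_ct(target_ct, reference_ct):
    """Delta Ct for one sample: target gene Ct minus reference gene Ct."""
    return target_ct - reference_ct

def relative_expression(sample_dcts, control_dcts):
    """2^-(ddCt) per sample, calibrated to the mean control delta Ct."""
    calibrator = mean(control_dcts)
    return [2 ** -(d - calibrator) for d in sample_dcts]

# Hypothetical (target Ct, reference Ct) pairs, one per biological replicate.
control = [(24.1, 18.0), (24.6, 18.2), (23.9, 18.1)]
knockdown = [(25.3, 18.1), (25.9, 18.0), (25.1, 18.2)]

control_dcts = [delta_ct(t, r) for t, r in control]
knockdown_dcts = [delta_ct(t, r) for t, r in knockdown]

# Controls are passed through the same calculation as the treated samples,
# so their variability is retained for downstream statistics.
control_rel = relative_expression(control_dcts, control_dcts)
knockdown_rel = relative_expression(knockdown_dcts, control_dcts)

print("control fold changes:", [round(x, 2) for x in control_rel])
print("control SD:", round(stdev(control_rel), 3))
print("knockdown fold changes:", [round(x, 2) for x in knockdown_rel])
```

      With this calibration the control group is centred on 1 (its geometric mean is 1) but has a non-zero standard deviation, which is what a statistical comparison against the knockdown group needs.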

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript investigates the regulation of host-seeking behavior in Anopheles stephensi females across different life stages and mating states. Through transcriptomic profiling, the authors identify differential gene expression between "blood-hungry" and "blood-sated" states. Two neuropeptides, sNPF and RYamide, are highlighted as potential mediators of host-seeking behavior. RNAi knockdown of these peptides alters host-seeking activity, and their expression is anatomically mapped in the mosquito brain (sNPF and RYamide) and midgut (sNPF only).

      Strengths:

      (1) The study addresses an important question in mosquito biology, with relevance to vector control and disease transmission.

      (2) Transcriptomic profiling is used to uncover gene expression changes linked to behavioral states.

      (3) The identification of sNPF and RYamide as candidate regulators provides a clear focus for downstream mechanistic work.

      (4) RNAi experiments demonstrate that these neuropeptides are necessary for normal host-seeking behavior.

      (5) Anatomical localization of neuropeptide expression adds depth to the functional findings.

      Weaknesses:

      (1) The title implies that the neuropeptides promote host-seeking, but sufficiency is not demonstrated and some conclusions appear premature based on the current data. The support for this conclusion would be strengthened with functional validation using peptide injection or genetic manipulation.

      (2) The identification of candidate receptors is promising, but the manuscript would be significantly strengthened by testing whether receptor knockdowns phenocopy peptide knockdowns. Without this, it is difficult to conclude that the identified receptors mediate the behavioral effects.

      (3) Some important caveats, such as variation in knockdown efficiency and the possibility of off-target effects, are not adequately discussed.

    1. Reviewer #1 (Public review):

      Bredenberg et al. aim to model some of the visual and neural effects of psychedelics via the Wake-Sleep algorithm. This is an interesting study with findings that challenge certain mainstream ideas in psychedelic neuroscience.

      While some of my concerns have been addressed in revision, I am still not convinced that this model applies to 5-HT2A hallucinogens, as opposed to a pharmacologically distinct hallucinogen. I think it is important to justify which class of hallucinogens this model applies to and distinguish it from other hallucinogens. While some researchers tend to group several hallucinogens together (e.g., 5-HT2A agonists, NMDA antagonists, kappa-opioid agonists), I'm not convinced this is warranted when they have distinct subjective and cognitive effects (including quite different visual distortions, and again I point out that the kappa-opioid agonist salvinorin A, which is referred to as an "oneirogen," has been described as particularly dream-like, perhaps more so than 5-HT2A hallucinogens), as well as some differences in therapeutic outcomes (ketamine seems not to have as persistent therapeutic effects, and kappa-opioid agonists have yet to be shown to be therapeutic). Their use patterns highlight this (e.g., 5-HT2A drugs are used less in non-festival/rave social settings compared to NMDA drugs like ketamine, which can be used frequently enough to the point of abuse; kappa-opioid agonists have quite mixed effects in terms of pleasurable outcomes, thereby rarely being used/abused and almost never, to my knowledge, being used recreationally).

      In sum, more is needed to justify the claim that this work applies to 5-HT2A drugs in particular.

    2. Reviewer #2 (Public review):

      This work is a nice contribution to the literature in articulating a specific, testable theory of how psychedelics act to generate hallucinations and plasticity.

      I believe my concerns from the first round of review have been addressed in this version.

    1. Reviewer #1 (Public review):

      Summary:

      By using an established NAFLD model, choline-deficient high-fat diet, Barros et al show that LPS challenge causes excessive IFN-γ production by hepatic NK cells which further induces recruitment and polarization of a PD-L1 positive neutrophil subset leading to massive TNFα production and increased host mortality. Genetic inhibition of IFN-γ or pharmacological blockade of PD-L1 decreases recruitment of these neutrophils and TNFα release, consequently preventing liver damage and decreasing host death.

      Since NAFLD is often accompanied by chronic, low-grade inflammation, it can lead to an overactive but dysfunctional immune response and increase the body's overall susceptibility to infections; this is therefore a very important research question.

      Strengths:

      The biggest strength of the manuscript is the vast number of mouse strains used.

      Weaknesses:

      After the review, there are still some open questions from my side:

      (1) I would like the authors to defend their choice of diet type since this has not been done in the review/response to authors. In case they cannot, we need additional proof (HFD or WD model).

      (2) Since the authors used the same control groups (chow and HFCD), as required by the animal ethics committee, they must have a power analysis showing that the number of controls (and of animals in the other groups) is sufficient to detect the effect. Please provide it.
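
      As a rough illustration of the kind of calculation requested in point (2), the sketch below estimates the number of animals per group for a two-sided, two-sample comparison using the standard normal approximation. The effect sizes are hypothetical placeholders rather than values from the manuscript, the function name is an assumption, and an exact t-based calculation would give slightly larger numbers for small groups.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate animals per group for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d)^2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Hypothetical standardized effect sizes; real values would come from pilot data.
for d in (0.8, 1.2, 2.0):
    print(f"d = {d}: about {n_per_group(d)} animals per group")
```

      The required group size grows quickly as the expected effect shrinks, which is why the assumed effect size should be stated alongside the animal numbers.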

    2. Reviewer #2 (Public review):

      Summary:

      This is an extremely interesting mouse study, trying to understand how sepsis is tolerated during obesity/NAFLD. The researchers combine a well-established model of NASH (Choline-deficiency with High Fat Diet) with a sepsis model (IP injection of 10 mg/kg LPS), leading to dramatic mortality in mice. Using this model, they characterize the complex contributions of immune cells. Specifically, they find that NK cells and neutrophils contribute the most to mortality in this model due to IFNG and PD-L1+ neutrophils.

      Strengths:

      The biggest strength of the manuscript is how clear the primary phenotypes/endpoints of their model are. Within 6 hours of LPS injection, there is a stark elevation of liver inflammation and damage, which is exacerbated by a High Fat/Choline-Deficient diet (HFCD). After 1 day, almost all of the mice die. Using these endpoints, the authors were able to identify which cells were critical for mortality in the model and the specific mediators involved.

      Comments on revisions:

      I have no further comments.

    1. Reviewer #1 (Public review):

      Summary:

      The image analysis pipeline is tested in analysing microscopy imaging data of gastruloids of varying sizes, for which an optimised protocol for in toto image acquisition is established based on whole mount sample preparation using an optimal refractive index matched mounting medium, opposing dual side imaging with two-photon microscopy for enhanced laser penetration, dual view registration and weighted fusion for improved in toto sample data representation. For enhanced imaging speed in a two-photon microscope, parallel imaging was used and the authors performed spectral unmixing analysis to avoid issues of signal cross-talk.

      In the image analysis pipeline, different pre-treatments are applied depending on the analysis to be performed (for nuclear segmentation - contrast enhancement and normalisation; for quantitative analysis of gene expression - corrections for optical artifacts inducing signal intensity variations). Stardist3D was used for the nuclear segmentation. The study analyses in toto properties of gastruloid nuclear density, patterns of cell division, morphology, deformation and gene expression.

      Strengths:

      The methods developed are sound, well described and well validated, using a sample challenging for microscopy, gastruloids. Many of the established methods are very useful (e.g. registration, corrections, signal normalisation, lazy loading bioimage visualisation, spectral decomposition analysis), facilitate the development of quantitative research and would be of interest to the wide scientific community.

      Comments on revisions:

      I am happy with the job the authors have done with the revision. No further comments.

    2. Reviewer #2 (Public review):

      Summary:

      This study presents an integrated experimental and computational pipeline for high-resolution, quantitative imaging and analysis of gastruloids. The experimental module employs dual-view two-photon spectral imaging combined with optimized clearing and mounting techniques, enabling improved deep-tissue visualization compared with conventional methods. This advanced approach allows comprehensive 3D imaging of whole-mount immunostained gastruloids, capturing both tissue-scale architecture and single-cell-level information.

      The computational module encompasses both pre-processing of acquired images and downstream analysis, providing quantitative insights into the structural and molecular characteristics of gastruloids. The pre-processing pipeline, tailored for dual-view two-photon microscopy, includes spectral unmixing of fluorescence signals using depth-dependent spectral profiles, as well as image fusion via rigid 3D transformation based on content-based block-matching algorithms. Nuclei segmentation was performed using a custom-trained StarDist3D model, validated against 2D manual annotations, and achieving an F1 score of 85+/-3% at a 50% intersection-over-union (IoU) threshold. Another custom-trained StarDist3D model enabled accurate detection of proliferating cells and the generation of 3D spatial maps of nuclear density and proliferation probability. Moreover, the pipeline facilitates detailed morphometric analysis of cell density and nuclear deformation, revealing pronounced spatial heterogeneities during early gastruloid morphogenesis.
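
      For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score at a 50% IoU threshold can be computed from a matrix of pairwise IoU values between annotated and predicted nuclei. The IoU values are hypothetical, the function name is an assumption, and the simple greedy matching used here is not necessarily the matching procedure used by the authors.

```python
def f1_at_iou(iou, threshold=0.5):
    """F1 score from an IoU matrix (rows: ground truth, columns: predictions).

    Pairs are matched greedily in order of decreasing IoU; each object is
    used at most once, and only pairs with IoU >= threshold count as hits.
    """
    n_true = len(iou)
    n_pred = len(iou[0]) if iou else 0
    candidates = sorted(
        ((iou[i][j], i, j) for i in range(n_true) for j in range(n_pred)
         if iou[i][j] >= threshold),
        reverse=True,
    )
    used_true, used_pred = set(), set()
    for _, i, j in candidates:
        if i not in used_true and j not in used_pred:
            used_true.add(i)
            used_pred.add(j)
    tp = len(used_true)   # matched pairs
    fp = n_pred - tp      # predictions with no matching annotation
    fn = n_true - tp      # annotations with no matching prediction
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical IoU values: 3 annotated nuclei, 4 predicted nuclei.
iou_matrix = [
    [0.82, 0.05, 0.00, 0.00],
    [0.10, 0.47, 0.00, 0.00],
    [0.00, 0.00, 0.66, 0.12],
]
print(f1_at_iou(iou_matrix))
```

      In this toy case two annotations are matched, one is missed, and two predictions are unmatched, so both over- and under-segmentation lower the score.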

      All computational tools developed in this study are released as open-source, Python-based software.

      Strengths:

      The authors applied two-photon microscopy to whole-mount deep imaging of gastruloids, achieving in toto visualization at single-cell resolution. By combining spectral imaging with an unmixing algorithm, they successfully separated four fluorescent signals, enabling spatial analysis of gene expression patterns.

      The image analysis method for nuclei segmentation was thoroughly benchmarked against existing methods, demonstrating advantages over conventional approaches, and its applicability across diverse datasets was convincingly established. The authors also evaluated the state-of-the-art Cellpose-SAM framework, showing that it performs well on their data and that the authors' preprocessing strategy can further enhance Cellpose-SAM's segmentation performance in deep tissues.

      The entire computational workflow, from image pre-processing to segmentation with a custom-trained StarDist3D model and subsequent quantitative analysis, is made available as open-source software. In addition, user-friendly interfaces are provided through the open-source, community-driven napari platform, facilitating interactive exploration and analysis.

      Weaknesses:

      In my initial review, I noted that the developed image analysis pipeline lacked benchmarking against existing methods and provided only a limited demonstration of its applicability to other datasets. These points have been appropriately addressed in the revised manuscript, and I have no further weaknesses to note.

      Appraisal:

      The authors set out to establish a quantitative imaging and analysis pipeline for gastruloids using dual-view two-photon microscopy, spectral unmixing, and a custom computational framework for 3D segmentation and gene expression analysis. This aim was compellingly achieved. The integration of experimental and computational modules enables high-resolution in toto imaging and robust quantitative analysis at the single-cell level. The data presented support the authors' conclusions regarding the ability to capture spatial patterns of gene expression and cellular morphology across developmental stages.

      Impact and utility:

      This work presents a compelling and broadly applicable methodological advance. The approach is particularly impactful for the developmental biology community, as it allows researchers to extract quantitative information from high-resolution images to better understand morphogenetic processes. The data are publicly available on Zenodo, and the software is released on GitHub, making them highly valuable resources for the community. Given that suitable datasets for developing advanced 3D cell segmentation methods remain scarce in biological image analysis, the public release of these data is significant and is expected to stimulate further advances in the development of sophisticated computational approaches.

      Comments on revisions:

      The authors have addressed the previous revision thoroughly and appropriately. I have no further suggestions or additional recommendations at this time.

    3. Reviewer #3 (Public review):

      Summary

      The paper presents an imaging and analysis pipeline for whole-mount gastruloid imaging with two-photon microscopy. The presented pipeline includes spectral unmixing, registration, segmentation, and a wavelength-dependent intensity normalization step, followed by quantitative analysis of spatial gene expression patterns and nuclear morphometry on a tissue level. The utility of the approach is demonstrated by several experimental findings, such as spatial correlations between local nuclear deformation and tissue density changes, as well as the radial distribution pattern of mesoderm markers. The pipeline is distributed as a Python package, notebooks and multiple napari plugins.

      Strengths

      The paper is well-written with detailed methodological descriptions, which I think would make it a valuable reference for researchers performing similar volumetric tissue imaging experiments (gastruloids/organoids). The pipeline itself addresses many practical challenges including resolution loss within tissue, registration of large volumes, nuclear segmentation, and intensity normalization. In particular, the intensity decay measurements and the wavelength-dependent intensity normalization approach using the nuclear (Hoechst) signal as reference are very interesting and should be applicable to other imaging contexts. The morphometric analysis is equally well done, with the correlation between nuclear shape deformation and tissue density changes being an interesting finding. The paper is quite thorough in its technical description of the methods (of which there are many) and their experimental validation is appropriate. Finally, the provided code and napari plugins seem to be well done (I installed a selected list of the plugins and they ran without issues) and should be very helpful for the community.

      Comments on revisions:

      The minor issues that I originally raised in my first review have been fully resolved in the revised version.

    1. Reviewer #1 (Public review):

      Summary:

      This paper investigates the potential link between amygdala volume and social tolerance in multiple macaque species. Through a comparative lens, the authors considered tolerance grade, species, age, sex, and other factors that may contribute to differing brain volumes. They found that amygdala, but not hippocampal, volume differed across tolerance grades such that high-tolerance species showed larger amygdala than low-tolerance species of macaques. They also found that less tolerant species exhibited increases in amygdala volume with age, while more tolerant species showed the opposite. Given their wide range of species with varied biological and ecological factors, the authors' findings provide new, important evidence for changes in amygdala volume in relation to social tolerance grades. Contributions from these findings will greatly benefit future efforts in the field to characterize brain regions critical for social and emotional processing across species.

      (1) This study demonstrates a concerted and impressive effort to comparatively examine neuroanatomical contributions to sociality in monkeys. The authors impressively collected samples from 12 macaque species with multiple datapoints across species age, sex, and ecological factors. Species from all four social tolerance grades were present. Further, the age range of the animals is noteworthy, particularly the inclusion of individuals over 20 years old.

      (2) This work is the first to report neuroanatomical correlates of social tolerance grade in macaques in one coherent study. Given the prevalence of macaques as a model of social neuroscience, considerations of how socio-cognitive demands are impacted by the amygdala are highly important. The authors' findings will certainly inform future studies on this topic.

      (3) The methodology and supplemental figures for acquiring brain MRI images are nicely detailed. Clear information on these parameters is crucial for future comparative interpretations of sociality and brain volume, and the authors do an excellent job of describing this process in full.

      (4) The following comments were brought up during the review. In their revision, the authors have sufficiently addressed all of these comments by providing detailed responses and updating their manuscript. First, the revision clarified how much one could draw conclusions about "nature vs. nurture" from this study. Second, the revision also clarified the contributions of very young and very old animals in their correlations. Third, in their revision, the authors expanded on how their results could be interpreted in the context of multiple behavioral traits by Thierry (2021) by providing more detailed descriptions. Finally, during the revision, the authors clarified that both intolerant and tolerant species experience complex socio-cognitive demands and highlighted that socio-cognitive challenges arise across the tolerance spectrum under different behavioral demands.

    2. Reviewer #2 (Public review):

      Summary:

      This comparative study of macaque species and type of social interaction is both ambitious and inevitably comes with a lot of caveats. The overall conclusion is that more intolerant species have a larger amygdala. There are also opposing development profiles regarding amygdala volume depending on whether it is a tolerant or intolerant species.

      To achieve any sort of power, they have combined data from 4 centres, all of which used different scanning methods, and there are some resolution differences. The authors have also had to group species into 4 classifications, again to assist with any generalisations and power. They have focussed on the volumes of two structures, the amygdala and the hippocampus, which seems appropriate. Neither structure is homogeneous, and so it may well be that a targeted focus on specific nuclei or subfields would help (the authors may well do this next). But as the variables would only increase further along with the number of potential comparisons, alongside small group numbers, it seems only prudent to treat these findings as preliminary. That said, it is highly unlikely that large numbers of macaque brains will become available in the near future.

      This introduction is by way of saying that the study achieves what it sets out to do, but there are many reasons to see this study as preliminary. The main message seems to be twofold: 1) that more intolerant species have relatively larger amygdalae, and 2) that with development there is an opposite pattern of volume change (increasing with age in intolerant species and decreasing with age in tolerant species). Finding 1 is the opposite of that predicted in Table 1. This is fine, but it should be made clearer in the Discussion that this is the case, otherwise the reader may feel confused. As I read it, the authors have switched their prediction in the Discussion, which feels uncomfortable.

      It is inevitable that the data in a study of this complexity are all too prone to post hoc considerations, in which the authors indulge. I suspect I would end up doing the same, but it feels a bit like 'heads I win, tails you lose'. In the case of Grade 1 species, the individuals have a lot to learn, especially if they are not top of the hierarchy, but at the same time there are fewer individuals in the troop, making predictions very tricky. As noted above, I am concerned by the seemingly opposite predictions in Table 1 and those in the Discussion regarding tolerance and amygdala volume. (It may be that the predictions in Table 1 are the opposite to how I read them, in which case the Table and preceding text need to align.)

      Comments on revisions:

      I am happy with all of the revisions and the care shown by the authors.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors were looking at neurocorrelates of behavioural differences within the genus Macaca. To do so, they engaged in real-world dissection of dead animals (unconnected to the present study) coming from a range of different institutions. They subsequently compare different brain areas, here the amygdala and the hippocampus, across species. Crucially, these species have been sorted according to different levels of social tolerance grades (from 1 to 4). 12 species are represented across 42 individuals. The sampling process has weaknesses ("only half" of the species contained by the genus, and Macaca mulatta, the rhesus macaque, representing 13 of the total number of individuals), but also strengths (the species are decently well represented across the 4 grades) for the given purpose and for the amount of work required here. I will not judge the dissection process as I am not a neuroanatomist, and I will assume that the different interventions do not alter volume in any significant ways / or that the different conditions in which the bodies were kept led to the documented differences across species.

      There are two main results of the study. First, in line with their predictions, the authors find that more tolerant macaque species have a larger amygdala, whereas the hippocampus remains undifferentiated across species. Second, they also identify developmental effects, although with different trends: in tolerant species, the relative volume of the amygdala decreases across the lifespan, while in intolerant species, the contrary occurs. The modifications made between the two versions of the article have answered my remarks regarding age/grade/brain area differences.

      As such, I think the results hold up well, but more work may be needed with respect to interpretation.

      The classification of the social grade and the issue of nature vs. nurture have been addressed by the authors, and I thank them for this.

      I still feel that the role of the amygdala as a common cognitive and emotional center could be pushed further in the discussion, although I acknowledge that this would be complicated to do without knowing how the emotional and social lives of these animals impacted the growth of their amygdala.

      Strengths:

      Methods & breadth of species tested

      Weaknesses:

      Interpretations, which, although softened, could still be more integrated with the literature on emotion

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript by Hensley and Yildiz studies the mechanical behavior of kinesin under conditions where the z-component of the applied force is minimized. This is accomplished by tethering the kinesin to the trapped bead with a long double-stranded DNA segment as opposed to directly binding the kinesin to the large bead. It complements several recent studies that have used different approaches to look at the mechanical properties of kinesin under low z-force loads. The study shows that much of the mechanical information gleaned from the traditional "one bead" with attached kinesin approach was probably profoundly influenced by the direction of the applied force. The authors speculate that when moving small vesicle cargos (particularly membrane-bound ones), the direction of the resisting force on the motor has much less of a z-component than might be experienced if the motor were moving large organelles like mitochondria.

      Strengths:

      The approach is sound and provides an alternative method to examine the mechanics of kinesin under conditions where the z-component of the force is lessened. The data show that kinesin has very different mechanical properties compared to those extensively reported using the "single-bead" assay, where the molecule is directly coupled to a large bead which is then trapped.

      Weaknesses:

      The substoichiometric binding of kinesins to the multivalent DNA complicates the interpretation of the data.

      Comments on revisions:

      The authors have addressed my concerns.

    2. Reviewer #2 (Public review):

      This short report by Hensley and Yildiz explores kinesin-1 motility under more physiological load geometries than previous studies. Large Z-direction (or radial) forces are a consequence of certain optical trap experimental geometries, and likely do not occur in the cell. Use of a long DNA tether between the motor and the bead can alleviate Z-component forces. The authors perform three experiments. In the first, they use two assay geometries - one with kinesin attached directly to a bead and the other with kinesin attached via a 2 kbp DNA tether - with a constant-position trap to determine that reducing the Z component of force leads to a difference in stall time but not stall force. In the second, they use the same two assay geometries with a constant-force trap to replicate the asymmetric slip bond of kinesin-1; reducing the Z component of force leads to a small but uniform change in the run lengths and detachment rates under hindering forces but not assisting forces. In the third, they connect two or three kinesin molecules to each DNA, and measure a stronger scaling in stall force and time when the Z component of force is reduced. They conclude that kinesin-1 is a more robust motor than previously envisaged, where much of its weakness came from the application of axial force. If forces are instead along the direction of transport, kinesin can hold on longer and work well in teams. The experiments are rigorous, and the data quality is very high. There is little to critique or discuss. The improved dataset will be useful for modeling and understanding multi-motor transport. The conclusions complement other recent works that used different approaches to low-Z component kinesin force spectroscopy, and provide strong value to the kinesin field.

      Comments on revisions:

      The authors have satisfied all of my comments. I commend them on an excellent paper.

    3. Reviewer #3 (Public review):

      Hensley et al. present an important study into the force-detachment behaviour of kinesin-1, using a newly adapted methodological approach. This new method of DNA-tethered motor trapping is effective in reducing vertical forces and can be easily optimised for other motors and protein characterisation. The major strength of the paper is characterising kinesin-1 under low z-forces, which is likely to reflect the physiological scenario. They find kinesin-1 is more robust and less prone to premature detachment. The motors exhibit higher stall rates and times. Under hindering and assisting loads, kinesin-1 detachment is more asymmetric and sensitive, and with low z-force shows that slip-behaviour kinetics prevail. Another achievement of this paper is the demonstration of the multi-motor kinesin-1 assay using their low-z force method, showing that multiple kinesin-1 motors are capable of generating higher forces (up to 15 pN, and nearly proportional to motor number), thus opening an avenue to study multiple motor coordination. Overall, the data have been collected in a rigorous manner, the new technique is sound and effective, and results presented are compelling.

    1. Reviewer #1 (Public review):

      The authors have conducted substantial additional analyses to address the reviewers' comments. However, several key points still require attention. I was unable to see the correspondence between the model predictions and the data in the added quantitative analysis. In the rebuttal letter, the delta peak speed time displays values in the range of [20, 30] ms, whereas the data were negative for the 45° direction. Should the reader directly compare panel B of Figure 6 with Figure 1E? The correspondence between the model and the data should be made more apparent in Figure 6. Furthermore, the rebuttal states that a quantitative prediction was not expected, yet it subsequently argues that there was a quantitative match. Overall, this response remains unclear.

      A follow-up question concerns the argument about strategic slowing. The authors argue that this explanation can be rejected because the timing of peak speed should be delayed, contrary to the data. However, there appears to be a sign difference between the model and the data for the 45° direction, which means that it was delayed in this case. Did I understand correctly? In that regard, I believe that the hypothesis of strategic slowing cannot yet be firmly rejected and the discussion should more clearly indicate that this argument is based on some, but not all, directions. I agree with the authors on the importance of the mass underestimation hypothesis, and I am not particularly committed to the strategic slowing explanation, but I do not see a strong argument against it. If the conclusion relies on the sign of the delta peak speed, then the authors' claims are not valid across all directions, and greater caution in the interpretation and discussion is warranted. Regarding the peak acceleration time, I would be hesitant to draw firm conclusions based on differences smaller than 10 ms (Figures R3 and 6D).

      The authors state in the rebuttal that the two hypotheses are competing. This is not accurate, as they are not mutually exclusive and could even vary as a function of movement direction. The abstract also claims that the data "refutes" strategic slowing, which I believe is too strong. The main issue is that, based on the authors' revised manuscript, the lack of quantitative agreement between the model and the data for the mass underestimation hypothesis is considered acceptable because a precise quantitative match is not expected, and the predictions overall agree for some (though not all) directions and phases (excluding post-in). That is reasonable, but by the same logic, the small differences between the model prediction and the strategic slowing hypothesis should not be taken as firm evidence against it, as the authors seem to suggest. In practice, I recommend a more transparent and cautious interpretation to avoid giving readers the false impression that the evidence is decisive. The mass underestimation hypothesis is clearly supported, but the remaining aspects are less clear, and several features of the data remain unexplained.

    2. Reviewer #2 (Public review):

      This study explores the underlying causes of the generalized movement slowness observed in astronauts in weightlessness compared to their performance on Earth. The authors argue that this movement slowness stems from an underestimation of mass rather than a deliberate reduction in speed for enhanced stability and safety.

      Overall, this is a fascinating and well-written work. The kinematic analysis is thorough and comprehensive. The design of the study is solid, the collected dataset is rare, and the model adds confidence to the proposed conclusions.

      Compared to the previous version, the authors have thoroughly addressed my concerns. The model is now clear and well-articulated, and alternative hypotheses have been ruled out convincingly. The paper is improved and suitable for publication in my opinion, making a significant contribution to the field.

      Strengths:

      - Comprehensive analysis of a unique data set of reaching movements in microgravity
      - Use of a sensible and well-thought-out experimental approach
      - State-of-the-art analyses of the main kinematic parameters
      - Computational model simulations of arm reaching to test alternative hypotheses and support the mass underestimation one

      This work has no major weakness as it stands, and the discussion provides a fair evaluation of the findings and conclusions.

    3. Reviewer #3 (Public review):

      Summary:

      The authors describe an interesting study of arm movements carried out in weightlessness after a prolonged exposure to the so-called microgravity conditions of orbital spaceflight. Subjects performed radial point-to-point motions of the fingertip on a touch pad. The authors note a reduction in movement speed in weightlessness, which they hypothesize could be due to either an overall strategy of lowering movement speed to better accommodate the instability of the body in weightlessness or an underestimation of body mass. They conclude in favor of the latter, mainly based on two effects. One, slowing in weightlessness is greater for movement directions with higher effective mass at the end effector of the arm. Two, they present evidence for an increased number of corrective submovements in weightlessness. They contend that this provides conclusive evidence to accept the hypothesis of an underestimation of body mass.

      Strengths:

      In my opinion, the study provides a valuable contribution, the theoretical aspects are well presented through simulations, the statistical analyses are meticulous, the applicable literature is comprehensively considered and cited and the manuscript is well written.

      Weaknesses:

      I nevertheless am of the opinion that the interpretation of the observations leaves room for other possible explanations of the observed phenomenon, thus weakening the strength of the arguments.

      To strengthen the conclusions, I feel that the following points would need to be addressed:

      (1) The authors model the movement control through equations that derive the input control variable in terms of the force acting on the hand and treat the arm as a second-order low-pass filter (Eq. 13). Underestimation of the mass in the computation of a feedforward command would lead to a lower-than-expected displacement in response to that command. But it is not clear if and how the authors account for a potential modification of the time constants of the second-order system. The CNS does not effectuate movements with pure torque generators. Muscles have elastic properties that depend on their tonic excitation level, reflex feedback and other parameters. Indeed, Fisk et al.* showed variations of movement characteristics consistent with lower muscle tone, lower bandwidth and lower damping ratio in 0g compared to 1g. Could the variations in the response to the initial feedforward command be explained by a misrepresentation of the limb's damping and natural frequency, leading to greater uncertainty about the consequences of the initial command? This would still be an argument for un-adapted feedforward control of the movement, leading to the need for more corrective movements. But it would not necessarily reflect an underestimation of body mass.

      *Fisk, J., Lackner, J. R., & DiZio, P. (1993). Gravitoinertial force level influences arm movement control. Journal of Neurophysiology, 69(2), 504-511.

      The authors attempt to differentiate their study from previous studies, in which limb neuromechanical impedance was shown to be modified in weightlessness, by emphasizing that in the current study the movements were rapid and the initial movement is "feedforward". But this incorrectly implies that the limb's mechanical response to the motor command is determined only by active feedback mechanisms. In fact:

      (a) All commands to the muscle pass through the motor neurons. These neurons receive descending activations related not only to the volitional movement, but also to the dynamic state of the body and the influence of other sensory inputs, including the vestibular system. A decrease in descending influences from the vestibular organs will lower the background sensitivity to all other neural influences on the motor neuron. Thus, the motor neuron may be less sensitive to the other volitional and reflexive synaptic inputs that it may receive.

      (b) Muscle tone plays a significant role in determining the force and the time course of the muscle contraction. In a weightless environment, where tonic muscle activity is likely to be reduced, there is the distinct possibility that muscles will react more slowly and with lower amplitude to an otherwise equivalent descending motor command, particularly in the initial moments before spinal reflexes come into play. These, and other neuronal mechanisms could lead to the "under-actuation" effect observed in the current study, without necessarily being reflective of an underestimation of mass per se.

      (2) The subject's body in weightlessness is much more sensitive to reaction forces in interactions with the environment in the absence of the anchoring effect of gravity pushing the body into the floor and in the absence of anticipatory postural adjustments that typically accompany upper-limb motions in Earth gravity in order to maintain an upright posture. The authors dismiss this possibility because the taikonauts were asked to stabilize their bodies with the contralateral hand. But the authors present no evidence that this was sufficient to maintain the shoulder and trunk at a strictly constant position, as is supposed by the simplified biomechanical model used in their optimal control framework. Indeed, a small backward motion of the shoulder would result in a smaller acceleration of the fingertip and a smaller extent of the initial ballistic motion of the hand with respect to the measurement device (the tablet), consistent with the observations reported in the study. Note that stability of the base might explain why 45° movements were apparently less affected in weightlessness, according to many of the reported analyses, including those related to corrective movements (Fig. 5 B, C, F; Fig. 6D), than the other two directions. If the trunk is being stabilized by the left arm, the same reaction forces on the trunk due to the acceleration of the hand will result in less effective torque on the trunk, given that the reaction forces act with a much smaller moment arm with respect to the left shoulder (the hand movement axis passes approximately through the left shoulder for the 45° target) compared to either the forward or rightward motions of the hand.

      (3) The above is exacerbated by potential changes in the frictional forces between the fingertip and the tablet. The movements were measured by having the subjects slide their finger on the surface of a touch screen. In weightlessness, the implications of this contact can be expected to be quite different than on the ground. While these forces may be low on Earth, the fact is that we do not know what forces the taikonauts used on orbit. In weightlessness, the taikonauts would need to actively press downward to maintain contact with the screen, while on Earth gravity will do the work. The tangential forces that resist movement due to friction might therefore be different in 0g. Indeed, given the increased instability of the body and the increased uncertainty of movement direction of the hand, taikonauts may have been induced to apply greater forces against the tablet in order to maintain contact in weightlessness, which would in turn slow the motion of the finger on the tablet and increase the reaction forces acting on the trunk. This could be particularly relevant given that the effect of friction would interact with the limb in a direction-dependent fashion, given the anisotropy of the equivalent mass at the fingertip evoked by the authors.
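
      The concern raised in point (1) and sub-point (b) above can be illustrated with a toy simulation. The sketch below is not the authors' model (their Eq. 13 is not reproduced here). It assumes a one-dimensional point mass, a minimum-jerk planned trajectory, and purely open-loop control, and the names `simulate_reach`, `assumed_mass` and `actuation_gain` are hypothetical. Under these assumptions, planning the feedforward force with an underestimated mass and driving a correctly estimated mass through a weaker actuator produce the same shortened, slower initial movement.

```python
def min_jerk_accel(t, duration, extent):
    """Acceleration (m/s^2) of a minimum-jerk reach of given extent (m) and duration (s)."""
    s = t / duration
    return extent / duration**2 * (60 * s - 180 * s**2 + 120 * s**3)


def simulate_reach(true_mass, assumed_mass, actuation_gain=1.0,
                   extent=0.12, duration=0.5, dt=1e-4):
    """Open-loop reach of a point mass (toy model, no feedback, no muscle dynamics).

    The feedforward force is planned as assumed_mass * desired acceleration.
    actuation_gain is a hypothetical scalar standing in for reduced muscle
    responsiveness (1.0 means the planned force is delivered in full).
    Returns (final displacement in m, peak speed in m/s).
    """
    position = velocity = peak_speed = 0.0
    for i in range(int(round(duration / dt))):
        planned_accel = min_jerk_accel(i * dt, duration, extent)
        force = actuation_gain * assumed_mass * planned_accel
        velocity += (force / true_mass) * dt   # the true mass governs the motion
        position += velocity * dt
        peak_speed = max(peak_speed, velocity)
    return position, peak_speed


if __name__ == "__main__":
    cases = {
        "accurate mass, full gain": dict(true_mass=2.0, assumed_mass=2.0),
        "mass underestimated by 30%": dict(true_mass=2.0, assumed_mass=1.4),
        "accurate mass, gain 0.7": dict(true_mass=2.0, assumed_mass=2.0,
                                        actuation_gain=0.7),
    }
    for label, kwargs in cases.items():
        displacement, peak = simulate_reach(**kwargs)
        print(f"{label}: displacement={displacement:.4f} m, peak speed={peak:.4f} m/s")
```

      In this toy setting, the displacement and peak speed of the open-loop movement scale with the product of `actuation_gain` and `assumed_mass / true_mass`, so kinematics alone cannot separate the two explanations. Whether this carries over to a model with realistic limb dynamics is exactly the open question raised above.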

      I feel that the authors have done an admirable job of exploring how to explain the modifications to movement kinematics that they observed on orbit within the constraints of the optimal control theory applied to a simplified model of the human motor system. While I fully appreciate the value of such models in providing insights into questions of human sensorimotor behaviour, drawing firm conclusions about what humans are actually experiencing based only on manipulations of the computational model, without testing the model's implicit assumptions and without considering the actual neurophysiological and biomechanical mechanisms, can be misleading. One way to address this could be to examine these questions through extensions to the model used in the simulations (changing activation dynamics of the torque generators, allowing for potential backward motion of the shoulder and trunk, etc.). A better solution would be to emulate the physiological and biomechanical conditions on Earth (supporting the arm against gravity to reduce muscle tone, placing the subject on a moveable base that requires that the body be stabilized with the other hand) in order to distinguish the hypothesis of an underestimation of mass from other potential sources of under-actuation and other potential effects of weightlessness on the body.

      In sum, my opinion is that the authors are relying too much on a theoretical model as a ground truth and thus overstate their conclusions. To provide a convincing argument that humans truly underestimate mass in weightlessness, they should consider more judiciously the neurophysiology and biomechanics that fall outside the purview of the simplified model that they have chosen. If a more thorough assessment of this nature is not possible, then I would argue that a more measured conclusion of the paper should be 1) that the authors observed modifications to movement kinematics in weightlessness consistent with an under-actuation for the intended motion, 2) that a simplified model of human physiology and biomechanics that incorporates principles of optimal control suggests that the source of this under-actuation might be an underestimation of mass in the computation of an appropriate feedforward motor command, and 3) that other potential neurophysiological or biomechanical effects cannot be excluded due to limitations of the computational model.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Conformational Variability of HIV-1 Env Trimer and Viral Vulnerability", the authors study the fully glycosylated HIV-1 Env protein using an all-atom force field. The study combines long all-atom simulations of Env in a realistic asymmetric bilayer with careful data analysis. This work clarifies how the CT domain modulates the overall conformation of the Env ectodomain and characterizes different MPER-TMD conformations. The authors also carefully analyze the accessibility of the Env protein to different antibodies.

      Strengths:

      This paper is state-of-the-art, given the scale of the system and the sophistication of the methods. The biological question is important, the methodology is rigorous, and the results will interest a broad audience.

      Weaknesses:

      The manuscript lacks a discussion of previous studies. The authors should consider addressing or comparing their work with the following points:

      (1) Tilting of the Env ectodomain has also been reported in previous experimental and theoretical work:

      https://doi.org/10.1101/2025.03.26.645577

      (2) A previous all-atom simulation study has characterized the conformational heterogeneity of the MPER-TMD domain:

      https://doi.org/10.1021/jacs.5c15421

      (3) Experimental studies have shown that MPER-directed antibodies recognize the prehairpin intermediate rather than the prefusion state:

      https://doi.org/10.1073/pnas.1807259115

      (4) How does the CT domain modulate the accessibility of these antibodies studied? The authors are in a strong position to compare their results with the following experimental study:

      https://doi.org/10.1126/science.aaa9804

    2. Reviewer #2 (Public review):

      (1) Summary

      In this work, the authors aim to elucidate how a viral surface protein behaves in a membrane environment and how its large-scale motions influence the exposure of antibody-binding sites. Using long-timescale, all-atom molecular dynamics simulations of a fully glycosylated, full-length protein embedded in a virus-like membrane, the study systematically examines the coupling between ectodomain motion, transmembrane orientation, membrane interactions, and epitope accessibility. By comparing multiple model variants that differ in cleavage state, initial transmembrane configuration, and presence of the cytoplasmic tail, the authors aim to identify general features of protein-membrane dynamics relevant to antibody recognition.

      (2) Strengths

      A major strength of this study is the scope and ambition of the simulations. The authors perform multiple microsecond-scale simulations of a highly complex, biologically realistic system that includes the full ectodomain, transmembrane region, cytoplasmic tail, glycans, and a heterogeneous membrane. Such simulations remain technically challenging, and the work represents a substantial computational and methodological effort.

      The analysis provides a clear and intuitive description of large-scale protein motions relative to the membrane, including ectodomain tilting and transmembrane orientation. The finding that the ectodomain explores a wide range of tilt angles while the transmembrane region remains more constrained, with limited correlation between the two, offers useful conceptual insight into how global motions may be accommodated without large rearrangements at the membrane anchor.

      Another strength is the explicit consideration of membrane and glycan steric effects on antibody accessibility. By evaluating multiple classes of antibodies targeting distinct regions of the protein, the study highlights how membrane proximity and glycan dynamics can differentially influence access to different epitopes. This comparative approach helps place the results in a broader immunological context and may be useful for readers interested in antibody recognition or vaccine design.

      Overall, the results are internally consistent across multiple simulations and model variants, and the conclusions are generally well aligned with the data presented.

      (3) Weaknesses

      The main limitations of the study relate to sampling and model dependence, which are inherent challenges for simulations of this size and complexity. Although the simulations are long by current standards, individual trajectories explore only portions of the available conformational space, and several conclusions rely on pooling data across a limited number of replicas. This makes it difficult to fully assess the robustness of some quantitative trends, particularly for rare events such as specific epitope accessibility states.

      In addition, several aspects of the model construction, including the treatment of missing regions, loop rebuilding, and initial configuration choices, are necessarily approximate. While these approaches are reasonable and well motivated, the extent to which some conclusions depend on these modeling choices is not always fully clear from the current presentation.

      Finally, the analysis of antibody accessibility is based on geometric and steric criteria, which provide a useful first-order approximation but do not capture potential conformational adaptations of antibodies or membrane remodeling during binding. As a result, the accessibility results should be interpreted primarily as model-based predictions rather than definitive statements about binding competence.

      Despite these limitations, the study provides a valuable and carefully executed contribution, and its datasets and analytical framework are likely to be useful to others interested in protein-membrane interactions and antibody recognition.

    3. Reviewer #3 (Public review):

      Summary:

      This study uses large-scale all-atom molecular dynamics simulations to examine the conformational plasticity of the HIV-1 envelope glycoprotein (Env) in a membrane context, with particular emphasis on how the transmembrane domain (TMD), cytoplasmic tail (CT), and membrane environment influence ectodomain orientation and antibody epitope exposure. By comparing Env constructs with and without the CT, explicitly modeling glycosylation, and embedding Env in an asymmetric lipid bilayer, the authors aim to provide an integrated view of how membrane-proximal regions and lipid interactions shape Env antigenicity, including epitopes targeted by MPER-directed antibodies.

      Strengths:

      A key strength of this work is the scope and realism of the simulation systems. The authors construct a very large, nearly complete Env-scale model that includes a glycosylated Env trimer embedded in an asymmetric bilayer, enabling analysis of membrane-protein interactions that are difficult to capture experimentally. The inclusion of specific glycans at reported sites, and the focus on constructs with and without the CT, are well motivated by existing biological and structural data.

      The simulations reveal substantial tilting motions of the ectodomain relative to the membrane, with angles spanning roughly 0-30° (and up to ~50° in some analyses), while the ectodomain itself remains relatively rigid. This framing, that much of Env's conformational variability arises from rigid-body tilting rather than large internal rearrangements, is an important conceptual contribution. The authors also provide interesting observations regarding asymmetric bilayer deformations, including localized thinning and altered lipid headgroup interactions near the TMD and CT, which suggest a reciprocal coupling between Env and the surrounding membrane.

      The analysis of antibody-relevant epitopes across the prefusion state, including the V1/V2 and V3 loops, the CD4 binding site, and the MPER, is another strength. The study makes effective use of existing experimental knowledge in this context, for example, by focusing on specific glycans known to occlude antibody binding, to motivate and interpret the simulations.

      Weaknesses:

      While the simulations are technically impressive, the manuscript would benefit from more explicit cross-validation against prior experimental and computational work throughout the Results and Discussion, and better framing in the introduction. Many of the reported behaviors, such as ectodomain tilting, TMD kinking, lipid interactions at helix boundaries, and aspects of membrane deformation, have been described previously in a range of MD studies of HIV Env and related constructs (e.g., PMC2730987, PMC2980712, PMC4254001, PMC4040535, PMC6035291, PMC12665260, PMID: 33882664, PMC11975376). Clearly situating the present results relative to these studies would strengthen the paper by clarifying where the simulations reproduce established behavior and where they extend it to more complete or realistic systems.

      A related limitation is that the work remains largely descriptive with respect to conformational coupling. Numerous experimental studies have demonstrated functional and conformational coupling between the TMD, CT, and the antigenic surface, with effects on Env stability, infectivity, and antibody binding (e.g., PMC4701381, PMC4304640, PMC5085267). In this context, the statement that ectodomain and TMD tilting motions are independent is a strong conclusion that is not fully supported by the analyses presented, particularly given the authors' acknowledgment that multiple independent simulations are required to adequately sample conformational space. More direct analyses of coupling, rather than correlations inferred from individual trajectories, would help align the simulations with the existing experimental literature. Given the scale of these simulations, a more thorough analysis of coupling could be this paper's most seminal contribution to the field.

      The choice of membrane composition also warrants deeper discussion. The manuscript states that it relies on a plasma membrane model derived from a prior simulation-based study, which itself is based on host plasma membrane (PMID: 35167752), but experimental analyses have shown that HIV virions differ substantially from host plasma membranes (e.g., PMC46679, PMC1413831, PMC10663554, PMC5039752, PMC6881329). In particular, virions are depleted in PC, PE, and PI, and enriched in phosphatidylserine, sphingomyelins, and cholesterol. These differences are likely to influence bilayer thickness, rigidity, and lipid-protein interactions and, therefore, may affect the generality of the conclusions regarding Env dynamics and antigenicity. Notably, the citation provided for membrane composition is a laboratory self-citation, a secondary source, rather than a primary experimental study on plasma membrane composition.

      Finally, there are pervasive issues with citation and methodological clarity. Several structural models are referred to only by PDB ID without citation, and in at least one case, a structure described as cryo-EM is in fact an NMR-derived model. Statements regarding residue flexibility, missing regions in structures, and comparisons to prior dynamics studies are often presented without appropriate references. The Methods section also lacks sufficient detail for a system of this size and complexity, limiting readers' ability to assess robustness or reproducibility.

      With stronger integration of prior experimental and computational literature, this work has the potential to serve as a valuable reference for how Env behaves in a realistic, glycosylated, membrane-embedded context. The simulation framework itself is well-suited for future studies incorporating mutations, strain variation, antibodies, inhibitors, or receptor and co-receptor engagement. In its current form, the primary contribution of the study is to consolidate and extend existing observations within a single, large-scale model, providing a useful platform for future mechanistic investigations.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a novel use of fluorescence lifetime imaging microscopy (FLIM) to measure NAD(P)H autofluorescence in the Drosophila brain, as a proxy for cellular metabolic/redox states. This new method relies on the fact that both NADH and NADPH are autofluorescent, with a different fluorescence lifetime depending on whether they are free (indicating glycolysis) or protein-bound (indicating oxidative phosphorylation). The authors successfully use this method in Drosophila to measure changes in metabolic activity across different areas of the fly brain, with a particular focus on the main center for associative memory: the mushroom body.

      Strengths:

      The authors have made a commendable effort to explain the technical aspects of the method in accessible language. This clarity will benefit both non-experts seeking to understand the methodology and researchers interested in applying FLIM to Drosophila in other contexts.

      Weaknesses:

      Despite being statistically significant, the learning-induced change in f-free in α/β Kenyon cells is minimal (a decrease from 0.76 to 0.73, with a high variability). It is unclear whether this small effect represents a meaningful shift in neuronal metabolic state.

      Whether this method can be valuable to examine the effects of long-term memory (after spaced or massed conditioning) remains to be established.

    2. Reviewer #2 (Public review):

      This revised manuscript presents a valuable application of NAD(P)H fluorescence lifetime imaging (FLIM) to study metabolic activity in the Drosophila brain. The authors reveal regional differences in oxidative and glycolytic metabolism, with particular emphasis on the mushroom body, a key center for associative learning and memory. They also report metabolic shifts in α/β Kenyon cells following classical conditioning, in line with their known role in energy-demanding memory processes.

      The study is well-executed and the authors have added more detailed methodological descriptions in this version, which strengthen the technical contribution. The analysis pipeline is rigorous, with careful curve fitting and appropriate controls. However, the metabolic shifts observed after conditioning are small and only weakly significant, raising questions about the sensitivity of FLIM for detecting subtle physiological changes. The authors acknowledge these limitations in the revised discussion, which helps place the findings in proper context.

      Despite this, the work provides a solid foundation for future applications of label-free FLIM in vivo and serves as a valuable technical resource for researchers interested in neural metabolism. Overall, this study represents a meaningful step toward integrating metabolic imaging with the study of neural activity and cognitive function.

    3. Reviewer #3 (Public review):

      This study investigates the characteristics of the autofluorescence signal excited by 740 nm 2-photon excitation, in the range of 420-500 nm, across the Drosophila brain. The fluorescence lifetime (FL) appears bi-exponential, with a short 0.4 ns time constant followed by a longer decay. The lifetime decay and the resulting parameter fits vary across the brain. The resulting maps reveal anatomical landmarks, which simultaneous imaging of genetically encoded fluorescent proteins helps identify. Past work has shown that the autofluorescence decay time course reflects the balance of the free redox cofactor NAD(P)H vs. its protein-bound form. The ratio of free to bound NADPH is thought to indicate relative glycolysis vs. oxidative phosphorylation, and thus shifts in the free-to-bound ratio may indicate shifts in metabolic pathways. The basics of this measure have been demonstrated in other organisms, and this study is the first to use the FLIM module of the STELLARIS 8 FALCON microscope from Leica to measure autofluorescence lifetime in the brain of the fly. Methods include registering brains of different flies to a common template and masking out anatomical regions of interest using fluorescent proteins.

      The analysis relies on fitting a FL decay model with two free parameters, f_free and T_bound. The parameter f_free is the fraction of the normalized curve contributed by a decaying exponential with a time constant of 0.4 ns, thought to represent the FL of free NADPH or NADH, which apparently cannot be distinguished. T_bound is the time constant of the second exponential, with scalar amplitude = (1-f_free). The T_bound fit is thought to represent the decay time constant of protein-bound NADPH, but can differ depending on the protein. The study shows that across the brain, T_bound can range from 0 to >5 ns, whereas f_free can range from 0.5 to 0.9 (Figure 1a). The paper beautifully lays out the analysis pipeline, providing a valuable resource. The full range of fits is reported, including maximum likelihood quality parameters, and can serve as a benchmark for future studies.
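
      For readers less familiar with the model, the two-parameter decay described above can be written out as a minimal sketch. It assumes an amplitude-weighted sum of two exponentials normalized to 1 at t = 0 and ignores the instrument response function, so it illustrates the functional form only and is not the authors' fitting pipeline. The names `decay` and `TAU_FREE_NS` are hypothetical.

```python
import math

TAU_FREE_NS = 0.4  # fixed short time constant quoted in the review (ns)


def decay(t_ns, f_free, tau_bound_ns, tau_free_ns=TAU_FREE_NS):
    """Normalized bi-exponential decay; equals 1 at t = 0.

    f_free is a dimensionless fraction in [0, 1]; tau_bound_ns is in ns.
    Assumption: no convolution with an instrument response function.
    """
    if not 0.0 <= f_free <= 1.0:
        raise ValueError("f_free must lie in [0, 1]")
    if tau_bound_ns <= 0.0 or tau_free_ns <= 0.0:
        raise ValueError("time constants must be positive")
    free_term = f_free * math.exp(-t_ns / tau_free_ns)
    bound_term = (1.0 - f_free) * math.exp(-t_ns / tau_bound_ns)
    return free_term + bound_term
```

      Written this way, it is clear that f_free carries no unit, while T_bound is a time constant in ns.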

      The authors measure properties of NADPH-related autofluorescence of Kenyon Cells (KCs) of the fly mushroom body. The somata and calyx of mushroom bodies have a longer average T_bound than other regions (Figure 1e); the f_free fit is higher for the calyx (input synapses) region than for KC somata; and the average across flies of average f_free fits in alpha/beta KC somata decreases slightly following paired presentation of odor and shock, compared to unpaired presentation of the same stimuli. Though the change is slight, no comparable change is detected in gamma KCs, suggesting that distributions of f_free derived from FL may be sensitive enough to measure changes in metabolic pathways following conditioning.

      FLIM as a method is not yet widely prevalent in fly neuroscience, but recent demonstrations of its potential are likely to increase its use. Future efforts will benefit from the description of the properties of the autofluorescence signal to evaluate how autofluorescence may impact measures of FL of genetically engineered indicators.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates how individuals with chronic temporomandibular disorder (TMD) learn from uncertain rewards, using a probabilistic three-armed bandit task and computational modelling. The authors aim to identify whether people living with chronic pain show altered learning under uncertainty and how such differences might relate to psychological symptoms.

      Strengths:

      The work addresses an important question about how chronic pain may influence cognition and motivation. The task design is appropriate for probing adaptive learning, and the modelling approach is novel. The findings of altered uncertainty updating in the TMD group are interesting.

      Weaknesses:

      Several aspects of the paper limit the strength of the conclusions. The group differences appear only in model-derived parameters, with no corresponding behavioural differences in task performance. Model parameters do not correlate with pain severity, making the proposed mechanistic link between pain and learning speculative. Some of the interpretations extend beyond what the data can directly support.

    2. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors report on a case-control study in which participants with chronic pain (TMD) were compared to controls on performance of a three-option learning task. The authors find no difference in task behavior, but fit a model to this behavior and suggest that differences in the model-derived metrics (specifically, change in learning rate/estimated volatility/model estimated uncertainty) reveal a relevant between-group effect. They report a mediation effect suggesting that group differences on self-report apathy may be partially mediated by this uncertainty adaptation result.

      Strengths:

      The role of sensitivity to uncertainty in pathological states is an interesting question and is the focus of a reasonable amount of research at present. This paper provides a useful assessment of these processes in people with chronic pain.

      Weaknesses:

      (1) The interpretation of the model in the absence of any apparent behavioral effect is not convincing. The model is quite complex with a number of free parameters (what these parameters are is not well explained in the methods, although they seem to be presented in the supplement). These parameters are fitted to participant choice behavior - that is, they explain some sort of group difference in this choice behavior. The authors haven't been able to demonstrate what this difference is. The graphs of learning rate per group (Figure 2) suggest that the control group has a higher initial learning rate and a lower later learning rate. If this were actually the case, you would expect to see it reflected in the choice data (the control group should show higher lose-shift behavior earlier on, with this then declining over time, and the TMD group should show no change). This behavior is not apparent. The absence of a clear effect on behavior suggests that the model results are more likely to be spurious.

      (2) As far as I could see, the actual parameters of the model are not reported. The results (Figure 2) illustrate the trial-level model estimated uncertainty/learning rate, etc, but these differ because the fitted model parameters differ. The graphs look like there are substantial differences in v0 (which was not well recovered), but presumably lambda, at least, also differs. The mean(SD) group values for these parameters should be reported, as should the correlations between them (it looks very much like they will be correlated).

      (3) The task used seems ill-suited to measuring the reported process. The authors report the performance of a restless bandit task and find an effect on uncertainty adaptation. The task does not manipulate uncertainty (there are no periods of high/low uncertainty) and so the only adaptation that occurs in the task is the change from what appears to be the participants' prior beliefs about uncertainty (which appear to be very different between groups - i.e. the lines in Figure 2a,b,c are very different at trial 0). If the authors are interested in measuring adaptation to uncertainty, it would clearly be more useful to present participants with periods of higher or lower uncertainty.

      (4) The main factor driving the better fit of the authors' preferred model over listed alternatives seems to be the inclusion of an additive uncertainty term in the softmax; this differentiates the chosen model from the other two Kalman filter-based models that perform less well. But a similar term is not included in the RW models. Given that the uncertainty of a binary outcome can be estimated as p(1-p), and the RW models are estimating p, this would seem relatively straightforward to do. It would be useful to know if the factor that actually drives better model fit is indeed in the decision stage (rather than the learning stage).
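
      The control model suggested in point (4) can be sketched as follows. This is a minimal illustration, assuming a standard Rescorla-Wagner update of the reward probability p and an additive p(1-p) term inside the softmax. The parameter names `alpha`, `beta`, and `phi` are hypothetical and do not correspond to the authors' model specification.

```python
import math


def rw_update(p, reward, alpha):
    """Rescorla-Wagner update of the estimated reward probability p."""
    return p + alpha * (reward - p)


def choice_probs(p_values, beta, phi):
    """Softmax over value plus an additive uncertainty bonus.

    Utility of each option is beta * p + phi * p * (1 - p), where p * (1 - p)
    is the variance of a binary outcome with probability p.
    """
    utilities = [beta * p + phi * p * (1.0 - p) for p in p_values]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - max_u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]
```

      Setting `phi` to zero recovers a plain softmax, so comparing fits with and without this term would isolate the contribution of the decision stage.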

    3. Reviewer #3 (Public review):

      This paper applies a computational model to behavior in a probabilistic operant reward learning task (a 3-armed bandit) to uncover differences between individuals with temporomandibular disorder (TMD) compared with healthy controls. Integrating computational principles and models into pain research is an important direction, and the findings here suggest that TMD is associated with subtle changes in how uncertainty is represented over time as individuals learn to make choices that maximize reward. There are a number of strengths, including the comparison of a volatile Kalman filter (vKF) model to some standard base models (Rescorla Wagner with 1 or 2 learning rates) and parameter recovery analyses suggesting that the combination of task and vKF model may be able to capture some properties of learning and decision-making under uncertainty that may be altered in those suffering from chronic pain-related conditions.

      I've focused my comments in four areas: (1) Questions about the patient population, (2) Questions about what the findings here mean in terms of underlying cognitive/motivational processes, (3) Questions about the broader implications for understanding individuals with TMD and other chronic pain-related disorders, and (4) Technical questions about the models and results.

      (1) Patient population

      This is a computational modelling study, so it is light on characterization of the population, but the patient characteristics could matter. The paper suggests they were hospitalized, but this is not a condition that requires hospitalization per se. It would be helpful to connect and compare the patient characteristics with large-scale studies of TMD, such as the OPPERA study led by Maixner, Fillingim, and Slade.

      (2) What cognitive/motivational processes are altered in TMD

      The study finds a pattern of alterations in TMD patients that seems clear in Figure 2. Healthy controls (HC) start the task with high estimates of volatility, uncertainty, and learning rate, which drop over the course of the task session. This is consistent with a learner that is initially uncertain about the structure of the environment (i.e., which options are rewarded and how the contingencies change over time) but learns that there is a fixed or slowly changing mean and stationary variance. The TMD patients start off with much lower volatility, uncertainty, and learning rate - which are actually all near 0 - and they remain stable over the course of learning. This is consistent with a learner who believes they know the structure of the environment and ignores new information.
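
      The two patterns described above can be reproduced qualitatively with a standard Kalman filter. The sketch below is a fixed-volatility filter, not the volatile Kalman filter used in the paper; it is shown only to illustrate how the prior variance shapes the learning rate (Kalman gain) over trials. All names and parameter values are hypothetical.

```python
def kalman_gains(outcomes, prior_var, process_var, obs_var, prior_mean=0.5):
    """Return the per-trial learning rate (Kalman gain) of a scalar Kalman filter.

    Assumption: fixed process and observation variances, unlike the vKF,
    in which volatility is itself updated on every trial.
    """
    mean, var = prior_mean, prior_var
    gains = []
    for outcome in outcomes:
        predicted_var = var + process_var          # uncertainty before seeing the outcome
        gain = predicted_var / (predicted_var + obs_var)
        mean += gain * (outcome - mean)            # prediction-error update
        var = (1.0 - gain) * predicted_var         # posterior uncertainty
        gains.append(gain)
    return gains
```

      With a large prior variance the gain starts near 1 and falls across trials, as described for controls; with a prior variance and process variance near zero the gain stays near 0 throughout, as described for the TMD group.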

      What is surprising is that this pattern of changes over time was found in spite of null group differences in a number of aspects of performance: (1) stay rate, (2) switch rate, (3) win-stay/lose-switch behaviors, (4) overall performance (corrected for chance level), (5) response times, (6) autocorrelation, (7) correlations between participants' choice probability and each option's average reward rate, (8) choice consistency (though how this was operationalized is not described), (9) win-stay-lose-shift patterns over time. I'm curious about how the patterns in Figure 2 would emerge if standard aspects of performance are essentially similar across groups (though the study cannot provide evidence in favor of the null). It will be important to replicate these patterns in larger, independent samples with preregistered analyses.
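
      As a concrete reference for the lose-switch measures listed above, here is a minimal sketch of how a lose-shift rate, and its change over blocks of trials, could be computed from choice data. It assumes binary rewards coded 0/1 and one choice label per trial; the function names are hypothetical and the authors' operationalization may differ.

```python
def lose_shift_rate(choices, rewards):
    """Fraction of unrewarded trials that are followed by a different choice."""
    losses = 0
    shifts = 0
    for t in range(len(choices) - 1):
        if rewards[t] == 0:
            losses += 1
            if choices[t + 1] != choices[t]:
                shifts += 1
    return shifts / losses if losses else float("nan")


def lose_shift_by_block(choices, rewards, block_size):
    """Lose-shift rate per block of trials, to expose changes over time."""
    rates = []
    for start in range(0, len(choices), block_size):
        # Include one extra trial so the last loss in a block can be scored.
        end = start + block_size + 1
        rates.append(lose_shift_rate(choices[start:end], rewards[start:end]))
    return rates
```

      A model-implied decline in learning rate would be expected to appear as a decline in the block-wise rate for the same group.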

      The authors believe that this pattern of findings reveals that TMD patients "maintain a chronically heightened sensitivity to environmental changes" and relate the findings to predictive processing, a hallmark of which (in its simplest form) is precision-weighted updating of priors. They also state that the findings are not related to reduced overall attentiveness or failure to understand the task, but describe them as deficits or impairments in calibrating uncertainty.

      The pattern of differences could, in fact, result from differences in prior beliefs, conceptualization of the task, or learning. Unpacking these will be important steps for future work, along with direct measures of priors, cognitive processes during learning, and precision-weighted updating.

      (3) Implications for understanding chronic pain

      If the findings and conclusions of the paper are correct, individuals with TMD and perhaps other pain-related disorders may have fundamental alterations in the ways in which they make decisions about even simple monetary rewards. The broader questions for the field concern (1) how generalizable such alterations are across tasks, (2) how generalizable they are across patient groups and, conversely, how specific they are to TMD or chronic pain, (3) whether they are the result of neurological dysfunction, as opposed to (e.g.) adaptive strategies or assumptions about the environment/task structure.

      It will be important to understand which features of patients' and/or controls' cognition are driving the changes. For example, could the performance differences observed here be attributable to a reduced or altered understanding of the task instructions, more uncertainty about the rules of the game, different assumptions about environments (i.e., that they are more volatile/uncertain or less so), or reduced attention or interest in optimizing performance? Are the controls OVERconfident in their understanding of the environment?

      This set of questions will not be easy to answer and will be the work of many groups for many years to come. It is a judgment call how far any one paper must go to address them, but my view is that it is a collaborative effort. Start with a finding, replicate it across labs, take the replicable phenomena and work to unpack the underlying questions. The field must determine whether it is this particular task with this model that produces case-control differences (and why), or whether the findings generalize broadly. Would we see the same findings for monetary losses, sounds, and social rewards? Tasks with painful stimuli instead of rewards?

      Another set of questions concerns the space of computational models tested, and whether their parameters are identifiable. An alteration in estimated volatility or learning rate, for example, can come from multiple sources. In one model, it might appear as a learning rate change and in another as a confirmation bias. It would be interesting in this regard to compare the "mechanisms" (parameters) of other models used in pain neuroscience, e.g., models by Seymour, Mancini, Jepma, Petzschner, Smith, Chen, and others (just to name a few).

      One immediate next step here could be to formally compare the performance of both patients and controls to normatively optimal models of performance (e.g., Bayes optimal models under different assumptions). This could also help us understand whether the differences in patients reflect deficits and what further experiments we would need to pin that down.

      In addition, the volatility parameter in the computational model correlated with apathy. This is interesting. Is there a way to distinguish apathy as a particular clinical characteristic and feature of TMD from apathy in the sense of general disinterest in optimal performance that may characterize many groups?

      If we know this, what actionable steps does it lead us to take? Could we take steps to reduce apathy and thus help TMD patients better calibrate to environmental uncertainty in their lives? Or take steps to recalibrate uncertainty (i.e., increase uncertainty adaptation), with benefits on apathy? A hallmark of a finding that the field can build off of is the questions it raises.

      (4) Technical questions about the models and results

      Clarification of some technical points would help interpret the paper and findings further:

      (a) Was the reward probability truly random? Was the random walk different for each person, or constrained?

      (b) When were self-report measures administered, and how?

      (c) Pain assessments: What types of pain? Was a body map assessed? Widespreadness? Pain at the time of the test, or pain in general?

      (d) Parameter recovery: As you point out, r = 0.47 seems very low for recovery of the true quantity, but this depends on noise levels and on how the parameter space is sampled. Is this noise-free recovery, and is it robust to noise? Are the examples of true parameters drawn from the space of participants, or do they otherwise systematically sample the space of true parameters?

      (e) What are the covariances across parameter estimates and resultant confusability of parameter estimates (e.g., confusion matrix)?

      (f) It would be helpful to have a direct statistical comparison of controls and TMD on model parameter estimates.

      (g) Null statistical findings on differences in correlations should not be interpreted as a lack of a true effect. Bayes Factors could help, but an analysis of them will show that hundreds of people are needed before it is possible to say there are no differences with reasonable certainty. Some journals enforce rules around the kinds of language used to describe null statistical findings, and I think it would be helpful to adopt them more broadly.

      (h) What is normatively optimal in this task? Are TMD patients less so, or not? The paper states "aberrant precision (uncertainty) weighting and misestimation of environmental volatility". But: are they misestimates?

      (i) It's not clear how well the choice of prior variance for all parameters (6.25) is informed by previous research, as sensible values may be task- and context-dependent. Are the main findings robust to how priors are specified in the HBI model?
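
      To illustrate point (d), the sketch below shows how a recovery correlation depends on how the true parameter space is sampled. The same estimation noise is added to a wide and a narrow set of true values; the values are invented purely for illustration and are not drawn from the paper. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("sequences must not be constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only: identical noise, different spread of true parameters.
noise = [0.5, -0.5] * 4
true_wide = [float(i) for i in range(8)]
true_narrow = [0.1 * i for i in range(8)]
r_wide = pearson_r(true_wide, [t + e for t, e in zip(true_wide, noise)])
r_narrow = pearson_r(true_narrow, [t + e for t, e in zip(true_narrow, noise)])
```

      Here `r_wide` is much larger than `r_narrow` even though the noise is identical, which is why the sampling scheme needs to be reported alongside the recovery correlation.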

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Wang et al. reports the potential involvement of an asymmetric neurocircuit in the sympathetic control of liver glucose metabolism.

      Strengths:

      The concept that the contralateral brain-liver neurocircuit preferentially regulates each liver lobe may be interesting.

      Weaknesses:

      However, the experimental evidence presented did not support the study's central conclusion.

      (1) Pseudorabies virus (PRV) tracing experiment:
      The liver not only possesses sympathetic innervations but also vagal sensory innervations. The experimental setup failed to distinguish whether the PRV-labeling of LPGi (Lateral Paragigantocellular Nucleus) is derived from sympathetic or vagal sensory inputs to the liver.

      (2) Impact on pancreas:
      The celiac ganglia not only provide sympathetic innervations to the liver but also to the pancreas, the central endocrine organ for glucose metabolism. The chemogenetic manipulation of LPGi failed to consider a direct impact on the secretion of insulin and glucagon from the pancreas.

      (3) Neuroanatomy of the brain-liver neurocircuit:
      The current study and its conclusion are based on a speculative brain-liver sympathetic circuit without the necessary anatomical information downstream of LPGi.

      (4) Local manipulation of the celiac ganglia:
      The left and right ganglia of mice are not separate from each other but rather anatomically connected. The claim that local injection of AAV into the left or right ganglion does not affect the other side is at odds with this basic anatomical feature.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Wang and colleagues aims to determine whether the left and right LPGi differentially regulate hepatic glucose metabolism and to reveal decussation of hepatic sympathetic nerves.

      The authors used tissue clearing to identify sympathetic fibers in the liver lobes, then injected PRV into the hepatic lobes. Five days post-injection, PRV-labeled neurons in the LPGi were identified. The results indicated contralateral dominance of premotor neurons and partial innervation of more than one lobe. Then the authors activated each side of the LPGi, resulting in a greater increase in blood glucose levels after right-sided activation than after left-sided activation, as well as changes in protein expression in the liver lobes. These data suggested modulation of HGP (hepatic glucose production) in a lobe-specific manner. Chemical denervation of a particular lobe did not affect glucose levels due to compensation by the other lobes. In addition, nerve bundles decussate in the hepatic portal region.

      Strengths:

      The manuscript is timely and relevant. It is important to understand the sympathetic regulation of the liver and the contribution of each lobe to hepatic glucose production. The authors use state-of-the-art methodology.

      Weaknesses:

      (1) The wording/terminology used in the manuscript is misleading, and it is not used in the proper context. For instance, the goal of the study is "to investigate whether cerebral hemispheres differentially regulate hepatic glucose metabolism..." (see abstract); however, the authors focus on the brainstem (a single structure without hemispheres). Similarly, symmetric is not the best word for the projections.

      (2) Sparse labeling of liver-related neurons was shown in the LPGi (Figure 1). It would be ideal to have lower magnification images to show the area. Higher quality images would be necessary, as it is difficult to identify brainstem areas. The low number of labeled neurons in the LPGi after five days of inoculation is surprising. Previous findings showed extensive labeling in the ventral brainstem at four days post-inoculation (Desmoulins et al., 2025). Unfortunately, it is not possible to compare the injection paradigm/methods because the PRV inoculation is missing from the methods section. If the PRV is different from the previously published viral tracers, time-dependent studies to determine the order of neurons and the time course of infection would be necessary.

      (3) Not all LPGi cells are liver-related. Was the entire LPGi population stimulated, or was it done in a cell-type-specific manner? What was the strain, sex, and age of the mice? What was the rationale for using the particular viral constructs?

      (4) The authors should consider the effect of stimulation of double-labeled neurons (innervating more than one lobe) and potential confounding effects regarding other physiological functions.

      (5) The authors state that "central projections directly descend along the sympathetic chain to the celiac-superior mesenteric ganglia". What they mean is unclear. Do the authors refer to pre-ganglionic neurons or premotor neurons? How does it fit with the previous literature?

      (6) How was the chemical denervation completed for the individual lobes?

      (7) The Western Blot images look like they are from different blots, but there are no details provided regarding protein amount (loading) or housekeeping. What was the reason to switch beta-actin and alpha-tubulin? In Figures 3F-G, the GS expression is not a good representative image. Were chemiluminescence or fluorescence antibodies used? Were the membranes reused?

      (8) Key references using PRV for liver innervation studies are missing (Stanley et al, 2010 [PMID: 20351287]; Torres et al., 2021 [PMID: 34231420]; Desmoulins et al., 2025 [PMID: 39647176]).

    3. Reviewer #3 (Public review):

      Summary:

      This study found a lobe-specific, lateralized control of hepatic glucose metabolism by the brain and provides anatomical evidence for sympathetic crossover at the porta hepatis. The findings are particularly relevant to researchers studying liver metabolism, regeneration, and tumors.

      Strengths:

      Increasing evidence suggests spatial heterogeneity of the liver across many aspects of metabolism and regenerative capacity. The current study has provided interesting findings: neuronal innervation of the liver also shows anatomical differences across lobes. The findings could be particularly useful for understanding liver pathophysiology and treatment, such as metabolic interventions or transplantation.

      Weaknesses:

      Inclusion of detailed methods and Discussion:

      (1) Quantitative results for PRV-labeled neurons are presented; please include the specific quantification methods.

      (2) The Discussion can be expanded to include potential biological advantages of this complex lateralized innervation pattern.

    4. Reviewer #4 (Public review):

      Summary:

      The studies here are highly informative in terms of anatomical tracing and sympathetic nerve function in the liver related to glucose levels, but given that they are performed in a single species, it is challenging to translate them to humans, or to determine whether these neural circuits are evolutionarily conserved. Dual-labeling anatomical studies are elegant, and the addition of chemogenetic and optogenetic studies is mechanistically informative. Denervation studies lack appropriate controls, and the role of sensory innervation in the liver is overlooked.

      Specific Weaknesses - Major:

      (1) The species name should be included in the title.

      (2) Tyrosine hydroxylase was used to mark sympathetic fibers in the liver, but this marker also labels a portion of sensory fibers, which need to be ruled out in the whole-mount imaging data.

      (3) Chemogenetic and optogenetic data demonstrating hyperglycemia should be described in the context of prior work demonstrating liver nerve involvement in these processes. There is only a brief mention in the Discussion currently, but comparing methods and observations would be helpful.

      (4) Sympathetic denervation with 6-OHDA can drive compensatory increases to tissue sensory innervation, and this should be measured in the liver denervation studies to implicate potential crosstalk, especially given the increase in LPGi cFOS that may be due to afferent nerve activity. Compensatory sympathetic drive may not be the only culprit, though it is clearly assumed to be. The sensory or parasympathetic/vagal innervation of the liver is altogether ignored in this paper and could be better described in general.

    1. Reviewer #1 (Public review):

      Summary:

      In this paper, the authors investigate the effects of Miro1 on VSMC biology after injury. Using conditional knockout animals, they provide the important observation that Miro1 is required for neointima formation. They also confirm that Miro1 is expressed in human coronary arteries. Specifically, in conditions of coronary diseases, it is localized in both media and neointima and, in atherosclerotic plaque, Miro1 is expressed in proliferating cells.

      However, the role of Miro1 in VSMC in CV diseases is poorly studied and the data available are limited; therefore, the authors decided to deepen this aspect. The evidence that Miro1-/- VSMCs show impaired proliferation and an arrest in S phase is solid and further sustained by restoring Miro1 to control levels, normalizing proliferation. Miro1 also affects mitochondrial distribution, which is strikingly changed after Miro1 deletion. Both effects are associated with impaired energy metabolism due to the ability of Miro1 to participate in MICOS/MIB complex assembly, influencing mitochondrial cristae folding. Interestingly, the authors also show the interaction of Miro1 with NDUFA9, globally affecting supercomplex 2 assembly and complex I activity. Finally, these important findings also apply to human cells and can be partially replicated using a pharmacological approach, proposing Miro1 as a target for vasoproliferative diseases.

      Strengths:

      The discovery of Miro1 relevance in neointima formation is compelling, as is the evidence in VSMC that MIRO1 loss impairs mitochondrial cristae formation, expanding observations previously obtained in embryonic fibroblasts. The identification of MIRO1 interaction with NDUFA9 is novel and adds value to this paper. Similarly, the findings that VSMC proliferation requires mitochondrial ATP support the new idea that these cells do not rely mostly on glycolysis.

      The revised manuscript includes additional data supporting mitochondrial bioenergetic impairment in MIRO1 knockout VSMCs. Measurements of oxygen consumption rate (OCR), along with Complex I (ETC-CI) and Complex V activity, have been added and analyzed across multiple experimental conditions. Collectively, these findings provide a more comprehensive characterization of the mitochondrial functional state. Following revision, the association between MIRO1 deficiency and impaired Complex I activity is more robust.

      Although the precise molecular mechanism of action remains to be fully elucidated, in this updated version, experiments using a MIRO1 reducing agent are presented with improved clarity.

      Although some limitations remain, the authors have addressed nearly all the concerns raised, and the manuscript has substantially improved.

      Weaknesses:

      Figure 6: The authors do not address the concern regarding the cristae shape; however, characterization of the cristae phenotype with MIRO1 ΔTM would have strengthened the mechanistic link between MIRO1 and the MIB/MICOS complex.

      Although the authors clarified their reasoning, they did not explore in vivo validation of key biochemical findings, which represents a limitation of the current study. While their justification is acknowledged, at least a preliminary exploratory effort could have been evaluated to reinforce the translational relevance of the study.

      Finally, in line with the explanations outlined in the rebuttal, the Discussion section should mention the limits of MIRO1 reducer treatment.

    2. Reviewer #2 (Public review):

      Summary:

      This study identifies the outer‑mitochondrial GTPase MIRO1 as a central regulator of vascular smooth muscle cell (VSMC) proliferation and neointima formation after carotid injury in vivo and PDGF stimulation ex vivo. Using smooth muscle-specific knockout male mice, complementary in vitro murine and human VSMC cell models, and analyses of mitochondrial positioning, cristae architecture and respirometry, the authors provide solid evidence that MIRO1 couples mitochondrial motility with ATP production to meet the energetic demands of the G1/S cell cycle transition. However, some of the metabolic analyses are suboptimal and would benefit from more robust methodologies. The work is valuable because it links mitochondrial dynamics to vascular remodelling and suggests MIRO1 as a therapeutic target for vasoproliferative diseases, although whether pharmacological targeting of MIRO1 in vivo can effectively reduce neointima after carotid injury has not been explored. This paper will be of interest to those working on VSMCs and mitochondrial biology.

      Strengths:

      The strength of the study lies in its comprehensive approach to assessing the role of MIRO1 in VSMC proliferation in vivo, ex vivo and, importantly, in human cells. The study provides mechanistic links between MIRO1-mediated regulation of mitochondrial mobility and optimal respiratory chain function on the one hand and cell cycle progression and proliferation on the other. Finally, the findings are potentially clinically relevant given the presence of MIRO1 in human atherosclerotic plaques and the available small-molecule MIRO1 reducer.

      Weaknesses:

      (1) High-resolution respirometry (Oroboros) to determine mitochondrial ETC activity in permeabilized VSMCs would be informative.

      (2) Therapeutic targeting of MIRO1 failed to prevent neointima formation; however, the technical difficulties of such an experiment are appreciated.

    3. Reviewer #3 (Public review):

      Summary:

      This study addresses the role of MIRO1 in vascular smooth muscle cell proliferation, proposing a link between MIRO1 loss and altered growth due to disrupted mitochondrial dynamics and function. While the findings are useful for understanding the importance of mitochondrial positioning and function in this specific cell type, the main bioenergetic and mechanistic claims are not strongly supported.

      Strengths:

      This study focuses on an important regulatory protein, MIRO1, and its role in vascular smooth muscle cell (VSMC) proliferation, a relatively underexplored context.

      This study explores the link between smooth muscle cell growth, mitochondrial dynamics, and bioenergetics, which is a significant area for both basic and translational biology.

      The use of both in vivo and in vitro systems provides a useful experimental framework to interrogate MIRO1 function in this context.

      Weaknesses:

      The proposed link between MIRO1 and respiratory supercomplex biogenesis or function is not clearly defined.

      Completeness and integration of the mitochondrial assays are marginal, undermining the strength of the conclusions regarding oxidative phosphorylation.

    1. Joint Public Review:

      Quite obviously, the brain encodes "time", as we are able to tell if something happened before or after something else. How this is done, however, remains essentially not understood. In the context of Working Memory tasks, many experiments have shown that the neural activity during the retention period "encodes" time, besides the stimulus to be remembered; that is, the time elapsed from stimulus presentation can be reliably inferred from the recordings, even if time per se is not important for the task. This implies 'mixed selectivity', in the weak sense of neural activity varying with both stimulus identity and time elapsed (since presentation).

      In this paper, the authors investigate the implications of a specific form of such mixed selectivity, that is, conjunctive coding of what (stimulus) and when (time) at the single-neuron level, on the resulting dynamics of the population activity when 'viewed' through linear dimensionality-reduction techniques, essentially Principal Component Analysis (PCA). The theoretical/modeling results presented provide a useful guide to the interpretation of the experimental results; in particular, with respect to what can, or cannot, be rightfully inferred from those experimental results (using PCA-like techniques). The results are essentially theoretical in nature; there are, however, some conclusions that require a more precise justification, in my opinion. More generally, as the authors themselves discuss in the paper, it is not clear how to generalize this coding scheme to more complicated, but behaviorally and cognitively relevant, situations, such as multi-item WM or WM for sequences.

      (1) It is unclear to me how the conjunctive code that the authors use (i.e., Equation (3)) is constrained by the theoretical desiderata (i.e., compositionality) they list, or whether it is simply an ansatz, partly motivated by theoretical considerations and experimental observations.

      The "what" part: What the authors mean by "relationships" between stimuli is never clearly defined. From their argument (and from Figure 1b), it would seem that what they mean is "angles" between population vectors for all pairs of stimuli. If this is so, then the effect of elapsed time can only amount to a uniform rescaling of the components of the population vector (i.e., it must be a similarity transformation; rotations are excluded, if the linear-decoder vectors are to be time-independent); the scaling factor, then, must be a strictly monotonic function of time (increasing or decreasing), if one is to decode time. In other words, the "when" receptive fields must be the same for all neurons.

      The "when" part: The condition, \tau_3=\tau_1+\tau_2, does not appear to be used at all. In fact, it is unclear (to me at least) whether the model, as it is formulated, is able to represent time intervals between stimuli.
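
      The geometric argument in the "what" part can be checked numerically. The sketch below assumes a purely multiplicative code, x_s(t) = g(t) * v_s, with hypothetical exponential temporal gains; this form is an illustrative stand-in and is not claimed to be the authors' Equation (3). Under that assumption, pairwise cosines between stimulus vectors stay constant in time only when all neurons share the same gain g(t).

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_stimuli = 50, 4
times = np.linspace(0.0, 5.0, 20)

# Hypothetical stimulus tuning vectors, one row per stimulus.
V = rng.normal(size=(n_stimuli, n_neurons))

def cosine_matrix(X):
    """Pairwise cosines between the rows of X."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return unit @ unit.T

def max_angle_drift(gains):
    """Largest change over time in any pairwise cosine.

    gains has shape (n_times, n_neurons): the temporal receptive field
    of every neuron, sampled at each time point.
    """
    reference = cosine_matrix(V * gains[0])
    return max(np.abs(cosine_matrix(V * g) - reference).max() for g in gains)

# Case 1: one shared, strictly decreasing gain g(t) for all neurons.
shared = np.exp(-times / 2.0)[:, None] * np.ones(n_neurons)

# Case 2: each neuron has its own (hypothetical) decay time constant.
taus = rng.uniform(0.5, 5.0, size=n_neurons)
heterogeneous = np.exp(-times[:, None] / taus[None, :])

drift_shared = max_angle_drift(shared)
drift_heterogeneous = max_angle_drift(heterogeneous)
print(f"shared gain drift: {drift_shared:.2e}")
print(f"heterogeneous gain drift: {drift_heterogeneous:.2e}")
```

      With a shared gain the drift is at the level of floating-point error, whereas neuron-specific gains visibly change the angles, which is the sense in which the "when" receptive fields would have to be identical across neurons.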

      (2) For the specific case considered, i.e., conjunctive coding, it would seem that one should be able to analytically work out the demixed PCA (see Kobak et al., 2016). More generally, it seems interesting to compare the results of the PCA and the demixed PCA in this specific case, even just using synthetic data.

      (3) In the Section "Dimensionality of neural trajectories...", there is some claim about how the dimensionality of the population activity goes up with the observation window T, backed up by numerical results that somehow mimic the results of Cueva et al. (2020) on experimental data. Is this a result that can be formally derived? Related to this point, it would be useful to provide a little more justification for Equation (17). Naively, one would think that the correlation matrix of the temporal component is always full-rank nominally, but that one can get excellent low-rank approximations (depending on T, following your argument).
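
      Point (3) can be made concrete with synthetic data. The sketch below assumes hypothetical exponential temporal receptive fields with log-spaced time constants (an assumption for illustration, not the authors' Equation (17)) and counts how many principal components are needed to reach 99% of the variance for several observation windows T. The temporal component is nominally full-rank for any T, so the quantity of interest is the effective rank under a variance threshold.

```python
import numpy as np

def n_components_for_variance(T, taus, dt=0.01, threshold=0.99):
    """Number of PCs needed to reach `threshold` of the variance of a
    population whose temporal receptive fields are exp(-t / tau)."""
    t = np.arange(0.0, T + dt / 2, dt)
    X = np.exp(-t[:, None] / taus[None, :])   # shape (n_times, n_neurons)
    X = X - X.mean(axis=0, keepdims=True)     # center each neuron over time
    s = np.linalg.svd(X, compute_uv=False)    # singular values, descending
    explained = np.cumsum(s**2) / np.sum(s**2)
    # First index at which the cumulative variance reaches the threshold.
    return int(np.searchsorted(explained, threshold) + 1)

# Hypothetical time constants and observation windows (arbitrary units).
taus = np.logspace(np.log10(0.5), np.log10(20.0), 40)
windows = [0.1, 1.0, 5.0, 20.0]
dims = [n_components_for_variance(T, taus) for T in windows]
print(dict(zip(windows, dims)))
```

      For a window much shorter than the fastest time constant, every receptive field is nearly linear in time and a single component suffices; longer windows expose the differences between decay rates and require more components. Whether this dependence on T can be derived in closed form is the question raised above.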

    1. Reviewer #1 (Public review):

      Summary

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The review is also, to the best of my knowledge, the first review to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors: as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research so that theorists will have to constrain their theories further in accordance with such data. As far as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses

      Overall, I feel positive about this paper. However, there are a couple of aspects to the manuscript that I think could be improved.

      (1) Distinguishing NCCs from their prerequisites or consequences

      This section in the introduction was particularly confusing to me. Namely, in this section, the authors' aim is to explain how intracranial recordings can help distinguish 'pure' NCCs from their antecedents and consequences. However, the authors almost exclusively describe different tasks (e.g., no-report tasks) that have been used to help solve this problem, rather than elaborating on how intracranial recordings may resolve this issue. The authors claim that no-report designs rely on null findings, and invasive recordings can be more sensitive to smaller effects, which can help in such cases. However, this motivation pertains to the previous sub-section (limits of noninvasive methods), since it is primarily concerned with the lack of temporal and spatial resolution of fMRI and M/EEG. It is not, in and of itself, a means to distinguish NCCs from their confounds.

      As such, in its current formulation, I do not find the argument that intracranial recordings are better suited to identifying pure NCCs (i.e. separating them from pre- or post-processing) convincing. To me, this is a problem solved through novel paradigms and better-developed theories. As it stands, the paper justifies my position by highlighting task developments that help to distinguish NCCs from prerequisites and consequences, rather than giving a novel argument as to why intracranial recordings outperform noninvasive methods beyond the reasons they explained in the previous section. Again, this position is justified when, from lines 505-506, the authors describe how none of the reported single-cell studies were able to dissociate NCCs from post-perceptual processing. As such, it seems as if, even with intracranial recording, NCCs and their confounds cannot be disentangled without appropriate tasks.

      The section 'Towards Better Behavioural Paradigms' is a clear attempt to address these issues and, as such, I am sure the authors share the same concerns as I am raising. Still, I remain unconvinced that the distinguishing of NCCs from pre-/post- processing is a fair motivation for using intracranial over noninvasive measures.

      (2) Drawing misleading conclusions from certain studies

      There are passages of the manuscript where the authors draw conclusions from studies that are not necessarily warranted by the studies they cite. For instance:

      Lines 265 - 271: "The results of these two studies revealed a complex pattern: on the one hand, HGA in the lateral occipitotemporal cortex and the ventral visual cortex correlated with stimulus strength. On the other hand, it also correlated with another factor that does not appear to play a role in visibility (repetition suppression), and did not correlate with a non-sensory factor that affects visibility reports (prior exposure). These results suggest that activity in occipitotemporal cortex regions reflecting higher-order visual processing may be a precursor to the NCC but not an NCC proper."

      It's possible to imagine a theory that would predict HGA could correlate with stimulus strength and repetition suppression, or that it would not correlate with prior exposure (e.g. prior exposure could impact response bias without affecting subjective visibility itself). The authors describe this exact ambiguity in interpretation later in the article (line 664), but in its current form, at least in line 270 (when the study is most extensively discussed), the manuscript heavily implies that HGA is not an NCC proper. This generates a false impression that intracranial recordings have conclusively determined that occipitotemporal HGA is not a pure NCC, which is certainly a premature conclusion.

      Line 243: "Altogether, these early human intracranial studies indicate that early-latency visual processing steps, reflected in broadband and low gamma activity, occur irrespective of whether a stimulus is consciously perceived or not. They also identified a candidate NCC: later (>200 ms) activity in the occipitotemporal region responsible for higher-order visual processing."

      The authors claim in this section that later (>200ms) activity in occipitotemporal regions may be a candidate for an NCC. However, the Fisch et al. (2009) study they describe in support of this conclusion found that early (~150ms) activity could dissociate conscious and unconscious processing. This would suggest that it is early processing that lays claim to perceptual consciousness. The authors explicitly describe the Fisch et al. results as showing evidence for early markers of consciousness (line 240: '...exhibited an early...response following recognized vs unrecognised stimuli.'). Yet only a few lines later they use this to support the conclusion that a candidate NCC is 'later (>200ms) activity in the occipitotemporal region' (line 245). As such, I am not sure what conclusion the authors want me to draw from these studies.

      This problem is repeated in lines 386-387: "Altogether, studies that investigated the cortical correlates of visual consciousness point to a role of neural responses starting ~250 ms after stimulus onset in the non-primary visual cortex and prefrontal cortex."

      This seems to be directly in conflict with the Fisch et al. results, which show that correlates of consciousness can begin ~100ms earlier than the authors state in this passage.

      (3) Justifying single-neuron cortical correlates of consciousness

      The purpose of the present manuscript is to highlight why and how intracortical measures of neural activity can help reveal the neural correlates of perceptual consciousness. As such, in the section 'Single-neuron cortical correlates of perceptual consciousness', I think the paper is lacking an argument as to why single-neuron research is useful when searching for the NCC. Most theories of consciousness are based around circuit or system-level analyses (e.g., global ignition, recurrent feedback, prefrontal indexing, etc.) and usually do not make predictions about single cells. Without any elaboration or argument as to why single-cell research is necessary for a science of consciousness, the research described in this section, although excellent and valuable in its own right, seems out of place in the broader discussion of NCCs. A particularly strong interpretation here could be that intracranial recordings mislead researchers into studying single cells simply because it is the finest level of analysis, rather than because it offers helpful insight into the NCCs.

      (4) No mention of combined fMRI-EEG research

      A minor point, but I was surprised that the authors did not mention any combined fMRI-EEG research when they were discussing the limits of noninvasive recordings. Intracortical recordings are one way to surpass the spatial and temporal resolution limits of M/EEG and fMRI respectively, but studies that combine fMRI and EEG are also an alternative means to solve this problem: by combining the spatial resolution of fMRI with the temporal resolution of EEG, researchers can - in theory - compare when and where certain activity patterns (be they univariate ERPs or multivariate patterns) arise. The authors do cite one paper (Dellert et al., 2021 JNeuro) that used this kind of setup, but they discuss it only with respect to the task and ignore the recording method. The argument for using intracranial recordings is weaker for not mentioning a viable, noninvasive alternative that resolves the same issues.

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with its own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This included population-level measurements in vision and in other sensory modalities, as well as single-neuron level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCC and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      The intention of the authors is to argue how some of the problems faced when studying NCCs are alleviated by the use of intracranial recordings in humans. But in some cases, the link between the problems related to the study of NCCs and the advantages of intracranial recordings over non-invasive methods is not clear.

      For example, the authors explain the difficulties in distinguishing true NCCs from their prerequisites and consequences. This constitutes a difficult conceptual problem that plagues all recording techniques. The authors don't provide a convincing explanation of how intracranial recordings offer advantages over EEG or MEG when dealing with this problem.

      For example, the authors explain how the use of non-report designs to rule out post-perceptual processing relies on null results, which, according to them, are harder to interpret given the low resolution of non-invasive methods. But the interpretation of null results is actually more complicated in the case of intracranial recordings. As the coverage achieved by the electrodes is sparse, if a null result is attested, it remains possible that a true effect was present in a nearby patch of cortex out of coverage.

      The authors argue that the spatial resolution of intracranial recordings is better than that of EEG and MEG. While this is technically true (especially compared to EEG), the true spatial scale of the NCCs is unknown. If NCCs' span is in the mm range, then the additional spatial resolution of intracranial recordings might not be an advantage.

      Another factor that should be taken into consideration when assessing the spatial resolution of intracranial recordings is that while the listening zone of individual intracranial contacts is small, coverage is sparse and defined by clinical criteria (something that the authors discuss). In practice, the activity recorded by contacts is usually attributed to anatomically defined ROIs with a scale in the cm range. Given the sparse and uneven (across regions and patients) coverage afforded by intracranial recordings, the advantage of intracranial recordings in terms of spatial resolution is overstated.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      What is less clear is how the use of intracranial recordings per se holds potential to overcome problems such as the distinction between true NCCs and the prerequisites and consequences of conscious processing.

      Discussion of the likely impact of the work on the field:

      This work has the potential of becoming a must-read for anyone working in the field of consciousness research.

    3. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradictory evidence. As such, the manuscript is important as it calls for a concerted and better exploration of NCCs using iEEG in the future.

      Weaknesses:

      The manuscript extensively discusses the use of "report" as a criterion for identifying conscious perception and its limitations for separating between correlates of consciousness and post-consciousness processes, yet the term is not defined at the outset. The authors should specify what they mean by "report" (e.g., verbal report, nonverbal self-report, or any meta-cognitive indication of experience). Importantly, this definition should be explicitly linked to the theoretical landscape: whether the authors adopt an access-consciousness perspective in which (self) reportability is central, or whether the review also aims to address phenomenal consciousness. Making this conceptual grounding explicit at the beginning will help readers interpret the empirical work surveyed throughout the review.

      In addition, the review would benefit from an earlier introduction of the distinction between states and contents of consciousness. This distinction becomes important in the later section on anaesthesia, sleep, and epileptic seizures, where the focus shifts from content-specific NCCs to alterations in global states. Presenting these definitions upfront and briefly explaining how states and contents interact would strengthen the coherence of the manuscript.

      Overall, this is an excellent and timely review. With clearer initial theoretical definitions of consciousness, the manuscript will offer an even stronger conceptual framework for interpreting intracerebral studies of consciousness.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Pathogen-Phage Geomapping to Overcome Resistance," Do et al. present an impressive demonstration of using geographical sampling and metagenomics to guide sample choice for enrichment in human-associated microbes and the pathogen of interest to increase the chances of success for isolating phages active against highly resistant bacterial strains. The authors document many notable successes (17!) with highly resistant bacterial isolates and share a thoughtfully structured phage discovery effort, potentially opening the door to similar geomapping efforts across the field. While the work is methodologically strong and valuable for the community, there are a few areas where additional clarification and analysis could better align the claims with the data presented.

      Strengths:

      (1) The manuscript describes a well-executed and transparent example of overcoming a major obstacle in therapeutic virus identification, providing a practical success story that will resonate with researchers in microbiology and medicine.

      (2) Many phage researchers have anecdotally experienced a similar phenomenon, that a particular wastewater treatment plant always seems to have the pathogens you need. Quantifying this with metagenomics modernizes and adds evidence to this phenomenon in a way that could help researchers reproduce this success in a methodical way.

      (3) The methodology of combining environmental sampling, viral screening, and host-range analysis is clearly articulated and reproducible, offering a valuable blueprint for others in the field.

      (4) The data are presented with appropriate analytical rigor, and the results include robust sequencing and metagenomic profiling that deepen understanding of local viral communities.

      (5) The 17 successes yielding 35 phages have a lot of phylogenetic novelty beyond what the Tailor labs have typically found with previous methods.

      (6) The work highlights a practical and innovative solution to an increasingly important clinical problem, supporting the development of personalized antiviral strategies.

      Weaknesses:

      (1) The central concept of geomapping as a broadly applicable strategy is wonderfully supported by the 17 successes documented in the paper. While this is actually, of course, a strength, the study does not include a comparative analysis across multiple sites with varying sampling outcomes for different bacterial types, which would be necessary to validate this claim more generally.

      (2) Some elements, such as beta diversity comparisons and the metagenomics analysis of viral dark matter, would benefit from additional statistical analysis and clearer context.

      (3) Claims about therapeutic cocktails would be better framed as speculative and/or moved to the discussion section.

      (4) The manuscript could be strengthened by elaborating on the scope and composition of the phage and bacterial isolate collections, which are important for interpreting the broader significance of the findings.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Do and colleagues aims to develop a workflow for isolating and identifying bacteriophages with potential applications in phage therapy against antibiotic-resistant pathogens. The workflow integrates geΦmapping as a strategy to identify potential phage sources, ΦHD as a device for phage concentration, and RΦ as a phage library constructed from the initial sampling, resulting in the discovery of 36 new phages. The paper is overall interesting, and the proposed method appears robust and effective.

      Strengths:

      The methods proposed combined state-of-the-art strategies to solve an ever-increasing problem of antibiotic resistance. The methods are robust, and the controls are appropriate. The integration of environmental sampling, concentration strategies, and downstream genomic characterization is a clear strength and provides a potentially scalable framework for identifying candidate therapeutic phages. The manuscript is clearly written overall, and the results support the main conclusions.

      Weaknesses:


      While the authors acknowledge several limitations, some aspects require clearer framing or additional clarification. The proposed workflow focuses exclusively on aquatic environments as sources of phages, which may limit the diversity of hosts and phage types recoverable using this approach. Some interpretations, particularly regarding taxonomic classification and sampling saturation, would benefit from more cautious wording given current limitations in viral taxonomy and the observed data.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors trained rats on a "figure 8" go/no-go odor discrimination task. Six odor cues (3 rewarded and 3 non-rewarded) were presented in a fixed temporal order and arranged into two alternating sequences that partially overlap (Sequence #1: 5⁺-0⁻-1⁻-2⁺; Sequence #2: 3⁺-0⁻-1⁻-4⁺), forming an abstract figure-8 structure of looping odor cues.

      This task is particularly well-suited for probing representations of hidden states, defined here as the animal's position within the task structure beyond superficial sensory features. Although the task can be solved without explicit sequence tracking, it affords the opportunity to generalize across functionally equivalent trials (or "positions") in different sequences, allowing the authors to examine how OFC representations collapse across latent task structure.

      Rats were first trained to criterion on the task and then underwent 15 days of self-administration of either intravenous cocaine (3 h/day) or sucrose. Following self-administration, electrodes were implanted in lateral OFC, and single-unit activity was recorded while rats performed the figure-8 task.

      Across a series of complementary analyses, the authors report several notable findings. In control animals, lOFC neurons exhibit representational compression across corresponding positions in the two sequences. This compression is observed not only in trials/positions involving overlapping odors (e.g., Position 3 = odor 1 in sequence 1 vs sequence 2), but also in trials/positions involving distinct, sequence-specific odors (e.g., Position 4: odor 2 vs odor 4), indicating generalization across functionally equivalent task states. Ensemble decoding confirms that sequence identity is weakly decodable at these positions, consistent with the idea that OFC representations collapse incidental differences in sensory information into a common latent or hidden state representation. In contrast, cocaine-experienced rats show persistently stronger differentiation between sequences, including at overlapping odor positions.

      Strengths:

      Elegant behavioral design that affords the detection of hidden-state representations.

      Sophisticated and complementary analytical approaches (single-unit activity, population decoding, and tensor component analysis).

      Weaknesses:

      The number of subjects is small, so idiosyncratic, animal-specific effects cannot be fully ruled out.

      Comments

      (1) Emergence of sequence-dependent OFC representations across learning.

      A conceptual point that would benefit from further discussion concerns the emergence of sequence-dependent OFC activity at overlapping positions (e.g., position P3, odor 1). This implies knowledge of the broader task structure. Such representations are presumably absent early in learning, before rats have learned the sequence structure. While recordings were conducted only after rats were well trained, it would be informative if the authors could comment on how they envision these representations developing over learning. For example, does sequence differentiation initially emerge as animals learn the overall task structure, followed by progressive compression once animals learn that certain states are functionally equivalent? Clarifying this learning-stage interpretation would strengthen the theoretical framing of the results.

      (2) Reference to the 24-odor position task

      The reference to the previously published 24-odor position task is not well integrated into the current manuscript. Given that this task has already been published and is not central to the main analyses presented here, the authors may wish to a) better motivate its relevance to the current study or b) consider removing this supplemental figure entirely to maintain focus.

      (3) Missing behavioral comparison

      Line 117: the authors state that absolute differences between sequences differ between cocaine and sucrose groups across all three behavioral measures. However, Figure 1 includes only two corresponding comparisons (Fig. 1I-J). Please add the third measure (% correct) to Figure 1, and arrange these panels in an order consistent with Figure 1F-H (% correct, reaction time, poke latency).

      (4) Description of the TCA component

      Line 220: the authors wrote that the first TCA component exhibits low amplitude at positions P1 and P4 and high amplitude at positions P2 and P3. However, Figure 3 appears to show the opposite pattern (higher magnitude at P1 and P4 and lower magnitude at P2 and P3). Please check and clarify this apparent discrepancy. Alternatively, a clearer explanation of how to interpret the temporal dynamics and scaling of this component in the figure would help readers correctly understand the result.

      (5) Sucrose control

      Sucrose self-administration is a reasonable control for instrumental experience and reward exposure, but it means that this group also acquired an additional task involving the same reinforcer. This experience may itself influence OFC representations and could contribute to the generalization observed in control animals. A brief discussion of this possibility would help contextualize the interpretation of cocaine-related effects.

      (6) Acknowledge low N

      The number of rats per group is relatively low. Although the effects appear consistent across animals within each group, this sample size does not fully rule out idiosyncratic, animal-specific effects. This limitation should be explicitly acknowledged in the manuscript.

      (7) Figure 3E-F: The task positions here are ordered differently (P1, P4, P2, P3) than elsewhere in the paper. Please reorder them to match the rest of the paper.

    2. Reviewer #2 (Public review):

      In the current study, the authors use an odor-guided sequence learning task described as a "figure 8" task to probe neuronal differences in latent state encoding within the orbitofrontal cortex after cocaine (n = 3) vs sucrose (n = 3) self-administration. The task uses six unique odors which are divided into two sequences that run in series. For both sequences, the 2nd and 3rd odors are the same and predict that reward is not available at the reward port. The 1st and 4th odors are unique, and are followed by reward. Animals are well-trained before undergoing electrode implant and catheterization, and then retrained for two weeks prior to recording. The hypothesis under test is that cocaine-experienced animals will be less able to use the latent task structure to perform the task, and instead encode information about each unique sequence that is largely irrelevant. Behaviorally, both cocaine and sucrose-experienced rats show high levels of accuracy on task, with some group differences noted. When comparing reaction times and poke latencies between sequences, more variability was observed in the cocaine-treated group, implying animals treated these sequences somewhat differently. Analyses done at the single-unit and ensemble levels suggest that cocaine self-administration had increased the encoding of sequence-specific information, but decreased generalization across sequences. For example, the ability to decode odor position and sequence from neuronal firing in cocaine-treated animals was greater than in controls. This pattern resembles that observed within the OFC of animals that had fewer training sessions. The authors then conducted tensor component analysis (TCA) to enable a more "hypothesis agnostic" evaluation of their data.

      Overall, the paper is well written and the authors do a good job of explaining quite complicated analyses so that the reader can follow their reasoning. I have the following comments.

      While well-written, the introduction mainly summarises the experimental design and results, rather than providing a summary of relevant literature that informed the experimental design. More details regarding the published effects of cocaine self-administration on OFC firing, and on tests of behavioral flexibility across species, would ground the paper more thoroughly in the literature and explain the need for the current experiment.

      For Fig 1F, it is hard to see the magnitude of the group difference with the graph showing 0-100%. Can the y axis be adjusted to make this difference more obvious? It looks like the cocaine-treated animals were more accurate at P3; is that right?

      The concluding section is quite brief. The authors suggest that the failure to generalize across sequences observed in the current study could explain why people who are addicted to cocaine do not use information learned (e.g., in classrooms or treatment programs) to curtail their drug use. They do not acknowledge the limitations of their study (e.g., the exclusive use of male rats), or discuss alternative explanations of their data.

      Is it a problem that neuronal encoding of the "positions" i.e. the specific odors was at or near chance throughout in controls? Could they be using a simpler strategy based on the fact that two successive trials are rewarded, then two successive trials are not rewarded, such that the odors are irrelevant?
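
      To make this concern concrete, the sketch below checks whether an odor-blind rule could in principle predict reward in the figure-8 loop. It assumes the sequence layout given in the reviews (Sequence #1: 5⁺-0⁻-1⁻-2⁺; Sequence #2: 3⁺-0⁻-1⁻-4⁺, strictly alternating); the function names and the specific two-trial rule are hypothetical illustrations, not anything the authors tested.

```python
from itertools import cycle, islice

# Assumed layout from the reviews: (odor id, rewarded?) for each position.
SEQ1 = [(5, True), (0, False), (1, False), (2, True)]
SEQ2 = [(3, True), (0, False), (1, False), (4, True)]


def trial_stream(n_trials):
    """Alternate the two sequences to form the figure-8 loop."""
    return list(islice(cycle(SEQ1 + SEQ2), n_trials))


def odor_blind_accuracy(n_trials=80):
    """Score a hypothetical rule that ignores odors entirely.

    Rule: if the last two trials had the same reward availability,
    predict a switch; otherwise predict a repeat of the last trial.
    """
    rewards = [rewarded for _, rewarded in trial_stream(n_trials)]
    hits = 0
    for t in range(2, n_trials):
        prev, prev2 = rewards[t - 1], rewards[t - 2]
        prediction = (not prev) if prev == prev2 else prev
        hits += prediction == rewards[t]
    return hits / (n_trials - 2)


print(odor_blind_accuracy())  # 1.0
```

      Under these assumptions the odor-blind rule is never wrong after the first two trials, which is why behavioral accuracy alone may not show that the animals attend to odor identity.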

      When looking at the RT and poke latency graphs, it seems the cocaine-experienced rats were faster to respond to rewarded odors, and also faster to poke after P3. Does this mean they were more motivated by the reward?

    1. Reviewer #1 (Public review):

      Summary:

      This study makes a significant and timely contribution to the field of attention research. By providing the first direct neuroimaging evidence for the integration-segregation theory of exogenous attention, it fills a critical gap in our understanding of the neural mechanisms underlying inhibition of return (IOR). The authors employ a carefully optimized cue-target paradigm combined with fMRI to elegantly dissociate the neural substrates of cue-target integration from those of segregation, thereby offering compelling support for the integration-segregation account. Beyond validating a key theoretical hypothesis, the study also uncovers an interaction between spatial orienting and cognitive conflict processing, suggesting that exogenous attention modulates conflict processing at both semantic and response levels. This finding sheds new light on the neural mechanisms that connect exogenous attentional orienting with cognitive control.

      Strengths:

      The experimental design is rigorous, the analyses are thorough, and the interpretation is well grounded in the literature. The manuscript is clearly written, logically structured, and addresses a theoretically important question. Overall, this is an excellent, high-impact study that advances both theoretical and neural models of attention.

      Weaknesses:

      While this study addresses an important theoretical question and presents compelling neuroimaging findings, a few additional details would help improve clarity and interpretation. Specifically, more information could be provided regarding the experimental conditions (SI and RI), the justification for the criteria used for excluding behavioral trials, and how the null condition was incorporated into the analyses. In addition, given the non-significant interaction effect in the behavioral results, the claim that the behavioral data "clearly isolated" distinct semantic and response conflict effects should be phrased more cautiously.

    2. Reviewer #2 (Public review):

      Summary:

      This study provides evidence for the integration-segregation theory of an attentional effect, widely cited as inhibition of return (IOR), from a neuroimaging perspective, and explores neural interactions between IOR and cognitive conflict, showing that conflict processing is potentially modulated by attentional orienting.

      Strengths:

      The integration-segregation theory was examined in a sophisticated experimental task that also accounted for cognitive conflict processing, which is phenomenologically related to IOR but "non-spatial" by nature. This study was carefully designed and executed. The behavioral and neuroimaging data were carefully analyzed and largely well presented.

      Weaknesses:

      The rationale for the experimental design was not clearly explained in the manuscript; more specifically, why the current ER-fMRI study would disentangle integration and segregation processes was not explained. The introduction of "cognitive conflict" into the present study was not well reasoned for a non-expert reader to follow.

      The presentation of the results can be further improved, especially the neuroimaging results. For instance, Figure 4 is challenging to interpret. If "deactivation" (or a reduction in activation) is regarded as a neural signature of IOR, this should be clearly stated in the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      This study aims to provide the first direct neuroimaging evidence relevant to the integration-segregation theory of exogenous attention - a framework that has shaped behavioral research for more than two decades but has lacked clear neural validation. By combining an inhibition-of-return (IOR) paradigm with a modified Stroop task in an optimized event-related fMRI design, the authors examine how attentional integration and segregation processes are implemented at the neural level and how these processes interact with semantic and response conflicts. The central goal is to map the distinct neural substrates associated with integration and segregation and to clarify how IOR influences conflict processing in the brain.

      Strengths:

      The study is well-motivated, addressing a theoretically important gap in the attention literature by directly testing a long-standing behavioral framework with neuroimaging methods. The experimental approach is creative: integrating IOR with a Stroop manipulation expands the theoretical relevance of the paradigm, and the use of a genetic-algorithm-optimized fMRI design ensures high efficiency. Methodologically, the study is sound, with rigorous preprocessing, appropriate modeling, and analyses that converge across multiple contrasts. The results are theoretically coherent, demonstrating plausible dissociations between integration-related activity in the fronto-parietal attention network (FEF, IPS, TPJ, dACC) and segregation-related activity in medial temporal regions (PHG, STG). The findings advance the field by supplying much-needed neural evidence for the integration-segregation framework and by clarifying how IOR modulates conflict processing.

      Weaknesses:

      Some interpretive aspects would benefit from clarification, particularly regarding the dual roles ascribed to dACC activation and the circumstances under which PHG and STG are treated as a single versus separate functional clusters. Reporting conventions are occasionally inconsistent (e.g., statistical formatting, abbreviation definitions), which may hinder readability. More detailed reporting of sample characteristics, exclusion criteria, and data-quality metrics, especially regarding the global-variance threshold, would improve transparency and reproducibility. Finally, some limitations of the study, including potential constraints on generalization, are not explicitly acknowledged and should be articulated to provide a more balanced interpretation.

    1. Reviewer #1 (Public review):

      Summary:

      Fahdan et al. present a study investigating the molecular programs underlying axon initial growth and regrowth in Drosophila mushroom body (MB) neurons. The authors leverage the fact that different Kenyon cell (KC) subtypes undergo distinct axonal events on the same developmental timeline: γ KCs prune and then regrow their axons during early pupation, whereas α/β KCs extend their axons for the first time during the same pupal period. Using bulk Smart-seq2 RNA sequencing across six developmental time points, the authors identify genes enriched during γ KC regrowth and α/β KC initial outgrowth, and subsequently perform an RNAi screen to determine which candidates are functionally required for these processes.

      Among these, they focus on Pmvk, a key enzyme in the mevalonate pathway. Both RNAi knockdown and a CRISPR-generated mutant produce strong γ KC regrowth defects. Knockdown of other mevalonate pathway components (Hmgcr, Mvk) partially recapitulates this phenotype. The authors propose that Pmvk promotes axonal regrowth through effects on the TOR pathway.

      Overall, this work identifies new molecular players in developmental axon remodeling and provides intriguing evidence connecting Pmvk to γ KC regrowth.

      While the Pmvk knockdown and loss-of-function data are compelling, the evidence that the mevalonate pathway broadly regulates γ KC axon regrowth is less clear. RNAi knockdown of enzymes upstream of Pmvk (Hmgcr, Mvk) produces only mild phenotypes, and knockdown of several downstream enzymes produces no phenotype. The authors attribute this discrepancy to the possibility of weak RNAi constructs, which is plausible but not fully demonstrated. It would be helpful for the authors to discuss alternative explanations, including non-canonical roles for Pmvk that may not require the full pathway, and clarify the extent to which the current data support the conclusion that the mevalonate pathway, rather than Pmvk specifically, is a core regulator of regrowth.

      It is not clear from the Methods whether γ KCs and α/β KCs were sorted from the same brains using orthogonal binary expression systems (e.g., Gal4 > reporter 1 and LexA > reporter 2), or isolated separately from different fly lines. If the latter, differences in genetic background, staging, or batch effects could influence transcriptional comparisons. This should be explicitly clarified in the Methods, and any associated limitations discussed in the manuscript.

      The authors have made important findings that contribute to our understanding of axon growth and regrowth. As written, some major claims are only partially supported, but these issues can be addressed through reframing and clarification. In particular, the manuscript would benefit from (1) a more cautious interpretation of the mevalonate pathway's role, potentially considering Pmvk non-canonical functions, and (2) addressing methodological ambiguities in the transcriptomic analysis.

    2. Reviewer #2 (Public review):

      Fahdan et al. set out to build upon their previous work outlining the genes involved in axon growth, targeting two axon growth states: initial growth and regrowth. They outline a debate in the field that axon regrowth (for instance, after injury or in the peripheral nervous system) is different from initial axon growth, for which the authors have previously demonstrated distinct mechanisms. The authors set out to directly compare the transcriptomes of initial axon growth and regrowth, specifically within the same neuronal environment and developmental time point. To this end, the authors used the well-characterized genetic tools available in Drosophila melanogaster (the fruit fly) to build a valuable dataset of genes involved at different time points in axon growth (alpha/beta Mushroom Body Kenyon cells) and regrowth (gamma Mushroom Body Kenyon cells). The authors then focus on genes that are upregulated during both initial axon growth and axon regrowth. Then, using this subset of genes, they screen for axonal growth and regrowth deficits by knocking down 300 of these genes. 12 genes are found to be phenotypically involved in both axon growth and regrowth based on RNAi gene-targeted knockdown in the Mushroom Body. Of these 12 genes, the authors focus on one gene, Pmvk, which is part of the mevalonate pathway. They then highlight other genes in this pathway, but these genes primarily affect axon regrowth, not initial axon growth, implicating metabolic pathways in axon regrowth. This comprehensive RNA-seq dataset will be a valuable resource for the field of axon growth and regrowth, as well as for other researchers studying the Mushroom Body.

      Strengths:

      This paper contains many strengths, including the in-depth sequencing of overlapping developmental time points during the alpha/beta KCs' initial axon growth and gamma KCs' regrowth. This produces a rich dataset of differentially expressed genes across different time points in either cell population during development. In addition, the authors characterized expression patterns at developmental time points for 30 Gal4 lines previously identified as alpha/beta KC-expressing. This is very helpful for Drosophila Mushroom Body researchers because the authors not only characterized alpha/beta expression but also alpha'/beta' expression, gamma expression, and non-MB expression. The authors comprehensively walked through identifying differentially expressed genes during alpha/beta axon growth, identifying a subset of overlapping upregulated genes between cell types, then systematically characterized whether knockdown of a subset of these genes produced an axonal growth defect, and finally selected 1 of 3 cell-autonomous genes important for gamma KCs regrowth to further study.

      The authors utilized the developing Mushroom Body in Drosophila melanogaster, which happens to have new neurons developing axons and neurons that have undergone pruning and are regrowing axons at the same developmental time. They are also in the same part of the brain (the Mushroom Body) and, in theory, since the authors implicate a metabolic pathway, they will have similar metabolic growth conditions.

      Identifying Pmvk and two other components of the mevalonate pathway in axon regrowth opens up novel avenues for future studies on the role this metabolic pathway may have in axon growth. The authors of this paper are also very upfront about their negative results, allowing researchers to avoid running redundant experiments and truly build on this work.

      Weaknesses:

      While the dataset produced in this study is a strength, certain aspects make it more challenging to interpret. For instance, the authors state that roughly equal numbers of males and females are used for sequencing, and this vagueness, coupled with only taking a subset of the GFP-labeled neurons during FACS sorting, can introduce confounds into the dataset. This may hold true in imaging studies as well, in which males and females were used interchangeably.

      Additionally, a rationale is needed to explain why random numbers of 1-7 were assigned to zero-expressing genes in the DESeq analysis. This does not conform to the way this analysis is normally performed. It can alter how genes across the dataset are normalized and requires further explanation.
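
      As a rough illustration of why this matters, the sketch below implements a median-of-ratios size-factor calculation of the kind used by DESeq-style normalization, in which genes with a zero count in any sample drop out of the calculation. The toy counts, the function names, and the seed are hypothetical; this is not the authors' pipeline, only a way to see that replacing zeros with random values of 1-7 brings otherwise excluded genes into the median.

```python
import math
import random
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of per-sample counts.
    Genes with a zero in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if min(gene) <= 0:
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def impute_zeros(counts, low=1, high=7, seed=0):
    """Replace each zero with a random integer in [low, high]."""
    rng = random.Random(seed)
    return [[c if c > 0 else rng.randint(low, high) for c in gene]
            for gene in counts]


# Toy data: three expressed genes at a 1:2 depth ratio, many silent genes.
toy = [[100, 200], [50, 100], [10, 20]] + [[0, 0] for _ in range(8)]

print("raw counts:    ", size_factors(toy))
print("zeros imputed: ", size_factors(impute_zeros(toy)))
```

      With the raw counts, only the three expressed genes define the size factors. After imputation, the eight formerly silent genes contribute random ratios to the median, so the factors may shift depending on the draw.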

      The display and discussion of the data set do not always align with the authors' stated goal of having a comprehensive description of the genes that dynamically change during axon growth and regrowth. Displaying more information about genes differentially expressed in the alpha/beta KCs, or any information about the genes differentially expressed in the gamma KCs when using the same criteria as the alpha/beta KCs, or the 676 overlapping upregulated genes, would significantly add to this paper. The authors previously performed a similar study across developmental time points for gamma KCs, and it is not clear whether any overlapping genes were identified. Also, more information on the genes consisting of PC1 and PC3 when showing the PCA analysis would be helpful. Within the text, there is a discussion of why certain genes or gene groups were omitted or selected, such as clusters 1 and 2, and then some of their subgroups based on expected genes. There is also some discussion of omitted gene groups, but this is not complete across the different clusters, nor is there a discussion of why PC2 was not selected or of which genes might exhibit greater variability than cell type. The authors would make a stronger case for the genes they pursued if they showed that groups of genes already known to be involved in axon growth clustered within the selected groups. Since we do not see the gene lists, this is unclear and adds to the sometimes arbitrary nature of the authors' choices about what to pursue in this paper. A larger set of descriptors, such as gene lists and Gene Ontology analysis beyond what is shown, would be very helpful in putting the results in context and determining whether this is a resource beneficial to others.

      While the Pmvk story is interesting, the authors appear to make some arbitrary decisions in what is shown or pursued in this paper. Visually, CadN and Twr appear to be more severe axon regrowth phenotypes, where the peduncle appears intact, and axons are not regrowing in Figures 3 N and O. In contrast, Pmvk visually appears to lose neurons in Figure 3 M. With a change of the Gal4 driver (Figure 4), Pmvk now produces a gamma axon regrowth phenotype similar to CadN and Twr in Figure 3. This difference in the use of Gal4 for characterizing axonal phenotypes is not discussed, making some interpretations more challenging due to differences in Gal4 expression strength. For instance, the sequencing work was done with a different Gal4 MB expressing line than the characterization of gene knockdowns. Further characterization of the Pmvk was performed in the same Gal4 lines as the sequencing (Figure 4), suggesting a potential difference in Gal4 strength that may play a role in their rescue experiments if they are using a slightly weaker Gal4 for gamma lobe expression. A broader discussion of this may make the selection of Pmvk less arbitrary if the phenotype is similar to those of CadN and Twr. Along the lines of the sometimes arbitrary nature of the genes chosen to pursue further, the authors state that they selected genes that showed differential expression at any time point. As they refine their list of genes to pursue further, they seem to prioritize genes that change at 18-21 APF. This appears to be the early period for axon growth in alpha/beta KCs and gamma KCs, based on Figure 1. A stronger case might be made at longer time points when the axon is growing or regrowing.

      The paper would benefit from scaling back the claim that the mevalonate pathway is involved. The authors identified only a subset of genes from the mevalonate pathway, all immediately upstream of Pmvk, with no effect on downstream genes. Along these lines, the paper would benefit from a discussion of non-canonical Pmvk signaling.

      While the ability to take neurons at the same developmental time and from the same brain region is a strength, they are still 2 different types of neurons. Although gamma neuron axon growth occurs very early in development, it would be interesting to know whether the same genes are involved in their initial growth. A caveat to the authors' conclusion is that these are 2 different cell types, and they might use different genetic programs or use overlapping ones at other times. The authors did not show that gamma KCs use these genes in their initial axon growth.

    1. Reviewer #1 (Public review):

      Summary:

      Here, the authors have addressed the recruitment and firing patterns of motor units (MUs) from the long and lateral heads of triceps in the mouse. They used their newly developed Myomatrix arrays to record from these muscles during treadmill locomotion at different speeds, and they used template-based spike sorting (Kilosort) to extract units. Between MUs from the two heads, the authors observe differences in their firing rates, recruitment probability, phase of activation within the locomotor cycle and interspike interval patterning. Examining different walking speeds, the authors find increases in both recruitment probability and firing rates as speed increases. The authors also observed differences in the relation between recruitment and the angle of elbow extension between motor units from each head. These differences indicate meaningful variation between motor units within and across motor pools, and may reflect the somewhat distinct joint actions of the two heads of triceps.

      Strengths:

      The extraction of MU spike timing for many individual units is an exciting new method that has great promise for exposing the fine detail in muscle activation and its control by the motor system. In particular, the methods developed by the authors for this purpose seem to be the only way to reliably resolve single MUs in the mouse, as the methods used previously in humans and in monkeys (e.g. Marshall et al. Nature Neuroscience, 2022) do not seem readily adaptable for use in rodents.

      The paper provides a number of interesting observations. There are signs of interesting differences in MU activation profiles for individual muscles here, consistent with those shown by Marshall et al. It is also nice to see fine scale differences in the activation of different muscle heads, which could relate to their partially distinct functions. The mouse offers greater opportunities for understanding the control of these distinct functions, compared to the other organisms in which functional differences between heads have previously been described.

      The Discussion is very thorough, providing a very nice recounting of a great deal of relevant previous results.

      Weaknesses:

      The findings are limited to one pair of muscle heads. While the findings are important in their own right, the lack of confirmation from analysis of other muscles acting at other joints leaves the generalization of these findings unclear.

      While differences between muscle heads with somewhat distinct functions are interesting and relevant to joint control, differences between MUs for individual muscles, like those in Marshall et al., are more striking because they cannot be attributed potentially to differences in each head's function. The present manuscript does show some signs of differences for MUs within individual heads (e.g. Figure 2C), but the manuscript falls short of providing a statistical basis for the existence of distinct subpopulations.

    2. Reviewer #2 (Public review):

      The present study, led by Thomas and collaborators, aims to characterise the firing activity of individual motor units in mice during locomotion. To achieve this, the team implanted small arrays of eight electrodes into two heads of the triceps and performed spike sorting using a custom implementation of Kilosort. Concurrently, they tracked the positions of the shoulder, elbow, and wrist using a single camera and a markerless motion capture algorithm (DeepLabCut). Repeated one-minute recordings were conducted in six mice across five speeds, ranging from 10 to 27.5 cm/s.

      From these data, the authors demonstrate that:

      - Their recording method and adapted spike-sorting algorithm enable robust decoding of motor unit activity during rapid movements.
      - Identified motor units tend to be recruited during a subset of strides, with recruitment probability increasing with speed.
      - Motor units within individual heads of the triceps likely receive common synaptic inputs that correlate their activity, whereas motor units from different heads exhibit distinct behaviour.

      The authors conclude that these differences arise from the distinct functional roles of the muscles and the task constraints (i.e., speed).

      Strengths:

      - The novel combination of electrode arrays for recording intramuscular electromyographic signals from a larger muscle volume, paired with an advanced spike-sorting pipeline capable of identifying motor unit populations.
      - The robustness of motor unit decoding during fast movements.

      Weaknesses:

      - The data do not clearly indicate which motor units were sampled from each pool, leaving uncertainty as to whether the sample is biased towards high-threshold motor units or representative of the entire pool.
      - The results largely confirm the classic physiological framework of motor unit recruitment and rate coding, offering limited new insights into motor unit physiology.

      I would like to thank the authors for their thorough and insightful revisions. I am particularly pleased with the inclusion of the new analyses demonstrating the robustness of motor unit decoding, as well as the improved transparency regarding spike-sorting yield for each muscle and animal. Additionally, the new analyses illustrating that recruitment within muscle heads is consistent with the presence of common synaptic inputs and orderly recruitment significantly strengthen the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      Using the approach of Myomatrix recording, the authors report that (1) motor units are recruited differently in the two types of muscles, (2) individual units are probabilistically recruited during the locomotion strides, whereas the population bulk EMG has a more reliable representation of the muscle, and (3) the recruitment of units was proportional to walking speed.

      Strengths:

      The new technique provides a unique dataset, and the data analysis is convincing and well-executed.

      Weaknesses:

      After the revision, I no longer see any apparent weaknesses in the study.

    1. Reviewer #1 (Public review):

      Summary:

      Mazer & Yovel 2025 dissect the inverse problem of how echolocators in groups manage to navigate their surroundings despite intense jamming using computational simulations.

      The authors show that despite the 'noisy' sensory environments that echolocating groups present, agents can still access some amount of echo-related information and use it to navigate their local environment. It is known that echolocating bats have strong small and large-scale spatial memory that plays an important role for individuals. The results from this paper also point to the potential importance of an even lower-level, short-term role of memory in the form of echo 'integration' across multiple calls, despite the unpredictability of echo detection in groups. The paper generates a useful basis to think about the mechanisms in echolocating groups for experimental investigations too.

      Strengths:

      The paper builds on a biologically well-motivated and parametrised 2D acoustics and sensory simulation setup to investigate the various key parameters of interest.

      The 'null-model' of echolocators not being able to tell apart objects & conspecifics while echolocating still shows agents successfully emerge from groups - even though the probability of emergence drops severely in comparison to cognitively more 'capable' agents. This is nonetheless an important result showing the direction-of-arrival of a sound itself is the 'minimum' set of ingredients needed for echolocators navigating their environment.

      The results generate an important basis in unraveling how agents may navigate in sensorially noisy environments with a lot of irrelevant and very few relevant cues.

      The 2D simulation framework is simple and computationally tractable enough to perform multiple runs to investigate many variables - while also remaining true to the aim of the investigation.

    2. Reviewer #2 (Public review):

      This manuscript describes a detailed model for bats flying together through a fixed geometry. The model considers elements which are faithful to both bat biosonar production and reception and the acoustics governing how sound moves in air and interacts with obstacles. The model also incorporates behavioral patterns observed in bats, like one-dimensional feature following and temporal integration of cognitive maps. From a simulation study of the model and comparison of the results with the literature, the authors gain insight into how often bats may experience destructive interference of their acoustic signals and those of their peers, and how much such interference may actually negatively affect the group's ability to navigate effectively. The authors use generalized linear models to test the significance of the effects they observe.

      The work relies on a thoughtful and detailed model which faithfully incorporates salient features, such as acoustic elements like the filter for a biological receiver and temporal aggregation as a kind of memory in the system. At the same time, the authors abstract features that are complicating without being expected to give additional insights, as can be seen in the choice of a two-dimensional rather than three-dimensional system. I thought that the level of abstraction in the model was perfect, enough to demonstrate their results without needless details. The results are compelling and interesting, and the authors do a great job discussing them in the context of the biological literature.

      With respect to the first version of the manuscript, the authors have remedied all my outstanding questions or concerns in the current version. The new supplementary figure 5 is especially helpful in understanding the geometry.

    1. Reviewer #1 (Public review):

      The key discovery of the manuscript is that genetically wild type females descended from Khdc3 mutants show abnormal gene expression relating to hepatic metabolism, which persists over multiple generations and passes through both female and male lineages. The authors also find dysregulation of hepatically-metabolized molecules in the blood of these wild type mice with Khdc3 mutant ancestry. These data provide solid evidence that phenotypes can be transmitted across multiple generations without altering DNA sequence, supporting the involvement of epigenetic mechanisms. The authors further performed exploratory studies on the small RNA profiles in the oocytes of Khdc3-null females and their wild type descendants, suggesting that altered small RNA expression could be a contributor to the observed phenotype transmission, although this has not been functionally validated.

      Comments on revisions:

      My previous comments are addressed.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript aimed to investigate the non-genetic impact of Khdc3 mutation on liver metabolism. To do so, the authors analyzed the female liver transcriptome of genetically wild type mice descended from female ancestors with a mutation in the Khdc3 gene. They found that genetically wild type females descended from Khdc3 mutants have hepatic transcriptional dysregulation that persists over multiple generations in the progeny descended from female ancestors with a mutation in the Khdc3 gene. This transcriptomic deregulation was associated with dysregulation of hepatically-metabolized molecules in the blood of these wild type mice with female mutational ancestry. Furthermore, to determine whether small non-coding RNA could be involved in the maternal non-genetic transmission of the hepatic transcriptomic deregulation, they performed small RNA-seq of oocytes from Khdc3-/- mice and genetically wild type female mice descended from female ancestors with a Khdc3 mutation, and claimed that oocytes of wild type female offspring from Khdc3-null females have dysregulation of multiple small RNAs.

      Finally, they claimed that their data demonstrates that ancestral mutation in Khdc3 can produce transgenerational inherited phenotypes.

      Comments on revisions:

      I thank the authors for their detailed response to my comments. I have nothing to add.

    1. Reviewer #1 (Public review):

      Summary:

      This paper describes a number of alterations in pulmonary surfactant recovered from bottlenose dolphins. Although the sample consists of only seven diseased and two control animals, due to the difficulty in obtaining these animals, this is considered adequate. However, conclusions must be considered in view of this small sample size. The authors employ a number of sophisticated techniques to show differences in the composition and in the structure of bilayers formed by these two surfactant samples.

      Strengths:

      The availability of these samples makes this study quite original. The authors apply mass spectrometry to observe an increase of an acidic phospholipid and in the level of plasmalogens in the diseased (i.e. pneumonia) aquatic animals. They suggest these increases contribute to hampered function in vivo. They show alterations in lipid bilayers formed from lipid extracts of these surfactants by electron microscopy, by atomic force microscopy, and by small- and wide-angle X-ray scattering (SAXS/WAXS). They have previously shown that adding small amounts of cardiolipin to the clinical surfactant BLES results in altered bilayer structure, consistent with the current study.

      Weaknesses:

      It seems surprising to me that the small changes in cardiolipin can alter surfactant function, i.e., reducing surface tension to near zero. As it happens, no surfactant function tests monitoring the reduction in surface tension were conducted. This would add a great deal to the paper. Further, the paper would benefit greatly from the inclusion of a table listing the lipid composition of surfactant recovered from diseased and normal animals and comparing this to the composition of BLES, a clinical surfactant. Finally, there is a possibility that the minor lipid identified by mass spectrometry is the lysosomal marker, bis(monoacylglycero)phosphate, rather than the mitochondrial marker, cardiolipin.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Porras-Gómez et al. analyse the lipid composition and biophysical properties of pulmonary surfactant obtained by bronchoalveolar lavage (BAL) from a group of bottlenose dolphins (Tursiops truncatus), including two healthy individuals and five affected by pneumonia. Through lipidomic analysis, the authors report an exacerbated presence of cardiolipin species in the BAL lipid extracts from diseased dolphins compared to healthy ones. Structural analyses using electron microscopy, atomic force microscopy, and X-ray scattering on rehydrated membrane samples reveal that lipids from diseased animals form membranes with a more pronounced Lβ phase and reduced fluidity. Moreover, the membranes from affected lungs appear more interconnected and less hydrated, as indicated by the X-ray scattering data. These findings provide valuable and convincing insights into how pulmonary disease alters the lipid composition and structural properties of surfactant in diving mammals, and may have broader implications for understanding surfactant dysfunction in marine mammals.

      Strengths:

      The study is well designed, and the experimental techniques were applied in a logical and coherent manner. The results are thoroughly analysed and discussed, and the manuscript is clearly written and well organized, making it both easy to follow and scientifically robust. Although the number of samples is limited, the rarity and logistical challenges of obtaining bronchoalveolar lavage material, particularly from animals affected by respiratory disease, make this study especially valuable and relevant.

      Weaknesses:

      In my opinion, the main issue lies in the treatment of the samples. Pulmonary surfactant is a lipoprotein complex produced by type II pneumocytes of the alveolar epithelium in the form of compact and highly dehydrated structures known as lamellar bodies. Once secreted, these structures unfold and, upon contact with the air-liquid interface, form an interfacial monolayer connected to surfactant membranes in the subphase, thereby facilitating respiratory dynamics throughout the breathing cycle.

      When bronchoalveolar lavages are treated using the Bligh and Dyer method to extract the hydrophobic fraction of these samples, the structural complexity of the surfactant is disrupted, and this organization cannot be completely restored once the lipids are rehydrated. Although these extracts contain the hydrophobic proteins SP-B and SP-C, the hydrophilic protein SP-A may play an essential role in the formation of pulmonary surfactant structures. It is well established that SP-A is crucial for the formation of tubular myelin, an intermediate structure between the lamellar bodies newly secreted by type II cells and the interfacial surfactant layers.

      Moreover, and more importantly, bronchoalveolar lavage fluid may contain cells, tissue debris, and even bacteria that can alter the lipid composition of the samples used in the study after extraction by the Bligh and Dyer method. For this reason, most studies include a density gradient centrifugation step to isolate the surfactant membranes. Consequently, the samples used may be contaminated with phospholipids originating from other cells, such as macrophages, pneumocytes, or bacterial cells, particularly in lavages obtained from diseased animals.

      Although the techniques employed provide valuable information about the behaviour of surfactant membranes and allow certain inferences regarding their functionality, no functional studies of these samples have been conducted using methods such as the constrained drop surfactometer or the captive bubble surfactometer. The observed alterations do not necessarily demonstrate that surfactant modulates its properties, as claimed by the authors, but rather indicate that it is altered by the presence of other lipids.

      The spin-coating technique used to form lipid films for analysis by atomic force microscopy is not the most suitable approach to reproduce the structures generated by pulmonary surfactant. However, the results obtained may still provide valuable insights into the biophysical behaviour of its components. The analysis of lung tissue shown in Supplementary Figure S3 presents the same limitation, as the samples were embedded in a cutting compound, and the measurements may have been taken from different regions of the tissue. Therefore, it cannot be ensured that the analysed structures correspond to those generated by pulmonary surfactant.

      The finding that the structures formed in samples obtained from diseased animals are more tightly packed and dehydrated than those derived from the surfactant of healthy animals contrasts with the notion that the high efficiency of lamellar bodies in generating interfacial structures is related to their high degree of packing and dehydration. The formation of these structures involves the participation of the ABCA3 protein, which pumps phospholipids into the interior of lamellar bodies, and SP-B, which facilitates the formation of close membrane contacts.

      While the results are interesting from a comparative perspective, the implications for surfactant performance and respiratory dynamics should be interpreted with caution.

    3. Reviewer #3 (Public review):

      In this manuscript, the authors present data on the supposed composition of pulmonary surfactant obtained from bronchoalveolar lavages (BALs) of a small cohort of dolphins, a group of them suffering from pneumonia. The lipid compositional differences of the sample group are consistent with the different pathological situations of the specimens, suggesting that differences in surfactant composition are somehow associated (as a cause or as a consequence) with the particular pathophysiological contexts. It is particularly remarkable that an increase in cardiolipins and plasmalogens appears as an abnormal composition in pathological surfactants. The study is completed by analyzing the differences in membrane properties (order, packing, phase) of abnormal versus "control" membranes, concluding that pneumonia in dolphins is associated with a significant alteration of surfactant membranes that become more rigid, packed and thicker than those in surfactant from animals with no lung disease.

      In general terms, the data provided are of interest as they somehow offer a framework of effects that may extend what is known about alterations of composition, biophysical properties and functional performance of pulmonary surfactant as a consequence of respiratory pathologies. A collection of pertinent biophysical methodologies (fluorescence, X-ray scattering, AFM) have been applied to complete a full characterization of membrane properties in the different samples.

      However, the way the samples have been processed, i.e. by making organic extracts of hydrophobic (lipid and protein) components before surfactant membranes have been purified or, at least, separated from bulk lavage, opens the question of how much of the altered composition actually occurs in surfactant and how much comes from other membranes (from cells, bacteria) that have been completely intermixed as a consequence of the organic extraction. Without an appropriate isolation of surfactant membranes, the results of the study should be taken with caution and await confirmation. Specific questions that need to be considered include:

      (1) As said, the direct organic extract of BAL samples ends in a full mix of lipid and protein components that in origin could be part of different membranes, either from different surfactant assemblies, or even from pulmonary cells or membrane debris, or microorganisms, collected within the lavage. Obtaining conclusions about the structure and properties of membranes artefactually reconstituted from such lipid and protein mixtures is far from correct.

      It is mentioned that "subsequentially" to the organic extraction, the samples were subjected to ultracentrifugation to separate debris and membrane cells. I do not see what the ultracentrifugation is going to change if it is done after the organic extraction. It should have been done before the extraction, for the organic solvents to solubilize exclusively the large, and relatively light, surfactant membrane complexes.

      On the other hand, the subsequent reconstitution of the obtained full lipid mixture surely ends in membrane assemblies whose compositional distribution and organization may differ significantly from those in the original membranes.

      Taking all this into account, statements such as "These aggregate forms reproduce the expected membrane microstructures observed in native alveolar hypophase" or "pulmonary membranes can be successfully extracted and reconstituted from BALs of Navy dolphins" are simply not true and should be rephrased.

      One can understand that the limitation of material may make it difficult to obtain first the purified surfactant membranes and then their organic extract. However, the limitation should be acknowledged to make clear to readers that the actual compositional effects caused in surfactant by pneumonia need confirmation.

      (2) In some of the experiments, i.e. in the AFM characterization, supported membranes were prepared by the spray-dry method applied to organic solutions. Again, the spray-dry of organic lipid solutions ends in a lipid dispersion that may be very far from the real organization of the lipids in actual surfactant membranes.

      (3) When it is stated that phospholipid concentrations are greater in BAL from pinnipeds than in humans, how has the actual concentration been determined? BAL volumes are typically subject to large variations depending on the conditions used to obtain the lavage (including volume of saline instilled, level of atelectasis in the lung tissue, presence of inflammation and edema, etc). If total amounts of phospholipids in BAL are to be compared, certain normalization procedures should be applied, such as, for instance, with respect to the urea concentration in serum.

      (4) All the differences regarding membrane phase and lipid order/packing have been interpreted in terms of the potential coexistence of Lbeta (gel)/Lalpha (liquid crystalline) phases. However, it has been well established that in lipid systems containing cholesterol, such as pulmonary surfactant, phase coexistence can actually be of the type liquid-ordered (Lo)/liquid-disordered (Ld), very different in terms of mobility and true molecular order. Why do the authors consider that Lbeta is the phase observed in the surfactant membranes they have reconstituted? The presence of round-shaped domains seems to indicate that a liquid/liquid phase segregation is actually occurring.

      (5) Along the same lines as the previous comment, the authors state that SAXS shows that bovine-extracted pulmonary membranes exhibit a coexistence of two lamellar phases, one rich in unsaturated lipids and one in saturated lipids. SAXS and WAXS cannot provide compositional information, only structural parameters such as membrane thickness or molecular order. This should be clarified.

      (6) It is mentioned that the surfactant monolayer at the air-liquid interface is interconnected to tubular membranous structures (tubular myelin, TM). It is true that TM, when present, appears interconnected with the interface. However, it is widely recognized that there are many other structures connected with the interfacial film, including multilamellar membrane arrays or reservoirs that have not been mentioned here. Furthermore, TM is not required for surfactant function, because it is absent, for instance, in mice lacking expression of surfactant protein SP-A, which can breathe perfectly.

      (7) In the Discussion, the authors mention that "...after squeeze-out, the excluded multilayers remain closely associated with the interfacial monolayer rather than escaping into the subphase". The authors may like to complete this discussion by specifying that the stable association of excluded assemblies with the interfacial film is actually possible thanks to the surfactant proteins.
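
      The normalization raised in comment (3) can be illustrated with the urea dilution approach. The following is a minimal sketch, assuming that urea equilibrates freely between serum and epithelial lining fluid, so that the ratio of serum urea to lavage urea estimates how strongly the lavage diluted the lining fluid. The function name, units, and all numbers are hypothetical and are not taken from the manuscript.

```python
def elf_concentration(c_bal: float, urea_serum: float, urea_bal: float) -> float:
    """Estimate an analyte concentration in undiluted epithelial lining fluid.

    Assumption (labelled, not from the manuscript): urea in the lining fluid
    matches urea in serum, so urea_serum / urea_bal is the dilution factor
    introduced by the instilled saline.
    """
    if urea_bal <= 0:
        raise ValueError("urea_bal must be positive")
    dilution_factor = urea_serum / urea_bal
    return c_bal * dilution_factor


# Hypothetical lavages: phospholipid in mg/mL, urea in mmol/L.
# Lavage A looks poorer in phospholipid than B before correction...
lavage_a = elf_concentration(c_bal=0.25, urea_serum=6.0, urea_bal=0.125)
lavage_b = elf_concentration(c_bal=0.5, urea_serum=6.0, urea_bal=0.5)

# ...but A was diluted four times more, so the ranking reverses after correction.
print(lavage_a, lavage_b)
```

      In this toy case the raw lavage values would suggest that B contains twice as much phospholipid as A, whereas the dilution-corrected estimates point the other way, which is why raw BAL concentrations are difficult to compare across animals or species.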

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents an ambitious attempt to examine whether episodic memory traces ("engrams") of forgotten associations persist in the human brain and whether these traces continue to influence behavior implicitly. Using 7T fMRI, the authors track 96 one-shot face-object associations across learning, 30-minute retrieval, and 24-hour retrieval, complemented by a recognition test. Participants classify each memory as sure, unsure, or guess, enabling an operational dissociation between consciously accessible and inaccessible memories.

      Strengths:

      The study addresses a timely and theoretically important question arising from rodent engram research, i.e., whether forgotten human memories leave detectable neural signatures. The use of high-resolution 7T fMRI, representational similarity analysis (RSA), and gPPI connectivity analyses aims at a detailed systems-level perspective. The results suggest that correct guess responses (i.e., when participants believe they are guessing) are accompanied by hippocampal activity and connectivity patterns that correlate with behavioral performance, potentially pointing to residual memory traces. The study also presents evidence for divergent consolidation trajectories: consciously accessible memories become more neocortically distributed after sleep, whereas inaccessible memories exhibit strengthened hippocampal signatures.

      Weaknesses:

      Despite the methodological rigor, some interpretational issues merit caution. First, the reliance on participants' subjective "guess" reports to categorize trials as forgotten is problematic. Guess responses at the 30-minute retrieval were at chance level, whereas guess responses during recognition were above chance; interpreting both as "implicit episodic memory" may conflate different mechanisms (episodic retrieval, familiarity, associative priming).

      Second, several analyses raise concerns about circularity or insufficient independence, for example, when contrasting correct vs. incorrect guess trials to locate "engram" activity and then correlating that activity with guessing accuracy. Similarly, the behavioral analyses are fragmented (multiple t-tests across conditions) rather than using a factorial model that accounts for dependencies among confidence levels and timepoints.

      Third, the choice to include only "sure" and "guess" responses discards a substantial portion of trials ("unsure"), reducing power and complicating interpretation, especially given that unsure responses show above-chance performance.

      Finally, the study's two-scanner-sequence design (small-FOV vs. whole-brain) is challenging as it complicates comparisons across analyses, especially when some critical results (e.g., hippocampal reinstatement patterns) do not consistently replicate across sequences.

      Conclusion:

      Overall, the manuscript provides preliminary evidence that neural traces of forgotten episodic memories might persist in humans and could guide behavior in the absence of conscious awareness. While interpretational caution is warranted, especially regarding the nature of "guess"-based retrieval and the independence of neural contrasts, the study makes a valuable contribution to debates on engram persistence, systems consolidation, and the role of consciousness in episodic memory.

    2. Reviewer #2 (Public review):

      Summary:

      The goal of the experiment was to identify the fMRI neural correlates of persistence and recovery of forgotten memories. A forgotten memory was defined behaviorally as successful learning, followed by failure in a recall format task, followed by next-day success in a recognition format task. The comparison is to memories that were not forgotten at any stage of the task. Various univariate, connectivity, and multivariate analyses were used to identify neural correlates of forgotten memories that were recovered, that remained forgotten, and successful memory. Some claims are made about how activity of the "episodic memory network" predicts the persistence of forgotten memories.

      Strengths:

      Studies on the persistence of forgotten memories in rodent models have been used to make some novel claims about the potential properties of engrams. Attempting similar research in humans is a laudable goal.

      Patterns of behavioral responses are consistent across subjects.

      Weaknesses:

      I do not find that the fMRI results fit the narrative provided.

      A major issue is that primary results do not replicate across the two fMRI datasets that were collected using the same task. For example, hippocampal activity associated with correct responses (confident and guess) was identified in the group receiving the fMRI scan that used a small FOV, but not in the group that received an fMRI scan of the whole brain, for both 30-min and 24-hr delays (lines 202-217). This suggests that the main findings are not even replicable internally within the same experiment. There is no reasonable justification for this.

      Next, most of the reported fMRI findings do not meet reasonable thresholds for statistical significance. In many places, the authors acknowledge this in the text by saying that a difference in the fMRI metric "tended towards significant correlation" or that comparisons "revealed non-significant mean value comparisons". It is not clear why these non-significant findings are interpreted as though they are positive findings. In addition, many of the reported findings do not meet the threshold (e.g., p=0.058), without any acknowledgement that they are marginal. Furthermore, the majority of comparisons that are interpreted in the main text are not significant based on the companion information provided in the supplementary tables. That is, they are totally non-significant when using FWE or FDR correction at either the cluster or peak levels.

      Beyond this, the supplementary tables indicate that "clusters identified solely within white matter regions have been excluded." The fact that there are any findings in white matter to ignore indicates that the statistical thresholds are inappropriate. It's tantamount to seeing activation in the brain of a dead fish.

      The overall picture based on these factors is that the statistical tests did not use sufficiently stringent safeguards against false positives given the multiple comparison problem that plagues fMRI. So, there are tons of false positives, which are being selectively interpreted to tell a particular story. That is, each comparison yields lots of findings in many brain areas, and those that do not fit the particular narrative are being ignored (including those in white matter). What's more, when the small FOV fMRI scan is done, the imaging volume is centered on the hippocampus and its close network, so all false positives appear to be exactly in those brain regions about which the authors want to make conclusions. When throwing darts, you will always hit a bullseye if that is all that exists. The fact that the same comparisons done in the companion whole-brain dataset do not yield the same results is telling: the analysis plan is not sufficiently rigorous to yield findings that are replicable.
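
      The scale of the multiple comparison problem can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of the study: it draws independent p-values under a global null (no true effect anywhere), and the number of tests and both thresholds are hypothetical choices. Real fMRI voxels are spatially correlated, so the exact counts would differ.

```python
import random


def count_null_detections(n_tests: int, alpha_uncorrected: float,
                          alpha_fwe: float, seed: int = 0) -> tuple:
    """Count 'significant' tests when every null hypothesis is true.

    Under the null, p-values are uniform on [0, 1), so every detection
    counted here is a false positive by construction.
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    # Voxel-wise threshold with no correction for the number of tests.
    n_uncorrected = sum(p < alpha_uncorrected for p in p_values)
    # Bonferroni control of the family-wise error rate.
    bonferroni_threshold = alpha_fwe / n_tests
    n_bonferroni = sum(p < bonferroni_threshold for p in p_values)
    return n_uncorrected, n_bonferroni


# Hypothetical setting: 100,000 voxels, p < 0.001 uncorrected vs. FWE 0.05.
n_uncorrected, n_bonferroni = count_null_detections(100_000, 0.001, 0.05)
print(n_uncorrected, n_bonferroni)
```

      With 100,000 tests, an uncorrected threshold of 0.001 is expected to pass roughly 100 null voxels, whereas the Bonferroni threshold is expected to pass almost none.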

      Further, I think that it is highly debatable whether the task measures the recovery of forgotten memories at all. Forgotten memories are defined as those that fail when tested using a recollection format but succeed when tested using a recognition format. The well-characterized distinction between recollection and recognition is thus being construed as telling us something about the fate of engrams. I think the much more likely alternative is that "forgotten" memories are just relatively weak memories that don't meet whatever criteria subjects typically use when making recollection judgments, and not some special category of memory. In terms of brain activation, they seem for the most part to follow the pattern of stronger memory, but weaker.

      Finally, many hypotheses are used as though they are proven. For instance, fMRI activity patterns are called "engrams" even though there are no tests to determine whether they meet reasonable criteria that have been adopted in the engram literature (e.g., necessity, sufficiency). Whatever happens over the 24-hour delay is called "consolidation" even if there is no test that consolidation has occurred. Etc. It becomes hard to differentiate what is an assumption, versus a hypothesis, versus an inference/conclusion.

    1. Reviewer #1 (Public review):

      Summary:

      This study extends the short-term synaptic plasticity (STP)-based theory of activity-silent working memory (WM) by introducing a physiological mechanism for chunking that relies on synaptic augmentation (SA) and specialized chunking clusters. The model consists of a recurrent neural network comprising excitatory clusters representing individual items and a global inhibitory pool. The self-connections within each cluster dynamically evolve through the combined effects of STP and SA. When a chunking cue, such as a brief pause in a stimulus sequence, is presented, the chunking cluster transiently suppresses the activity of the item clusters, enabling the grouped items to be maintained as a coherent unit and subsequently reactivated in sequence. This mechanism allows the network to enhance its effective memory capacity without exceeding the number of simultaneously active clusters, which defines the basic capacity. They further derive a new upper limit of WM capacity, the new magic number. When the basic capacity is four, the upper bound for complete recall becomes eight, and the optimal hierarchical structure corresponds to a binary tree of two-item pairs forming four chunks that combine into two meta-chunks. Reanalysis of linguistic data and single-neuron recordings from human epilepsy patients (identifying boundary neurons) provides qualitative support for the model's predictions.
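
      The relationship between a basic capacity of four and an upper bound of eight can be illustrated with a toy bookkeeping exercise. This is a minimal sketch, not the authors' network model: it assumes that recall proceeds depth-first, that unpacking a chunk replaces it with its children, and that units not yet recalled stay held in the meantime. The function names are hypothetical.

```python
def binary_hierarchy(items):
    """Pair neighbouring units repeatedly until two top-level units remain."""
    level = list(items)
    while len(level) > 2:
        level = [level[i:i + 2] for i in range(0, len(level), 2)]
    return level


def peak_load(units):
    """Largest number of units held at once during depth-first recall.

    A list stands for a chunk, anything else for a single item.
    """
    held = list(units)
    peak = len(held)
    while held:
        first = held.pop(0)
        if isinstance(first, list):
            # Unpack the chunk: its children replace it at the front.
            held = first + held
        # Otherwise the item is recalled and simply dropped.
        peak = max(peak, len(held))
    return peak


eight_chunked = peak_load(binary_hierarchy(range(8)))    # pairs -> 4 chunks -> 2 meta-chunks
eight_flat = peak_load(list(range(8)))                   # no chunking at all
sixteen_chunked = peak_load(binary_hierarchy(range(16)))
print(eight_chunked, eight_flat, sixteen_chunked)
```

      Under these assumptions, eight items arranged as a binary tree never require more than four units to be held at once, eight unchunked items require eight, and sixteen items in a binary tree require five, which would exceed a basic capacity of four.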

      Strengths:

      This study makes an important contribution to theoretical and computational neuroscience by proposing a physiologically grounded mechanism for chunking based on STP and SA. By embedding these processes in a recurrent neural network, the authors provide a unified account of how chunks can be formed, maintained, and sequentially retrieved through local circuit dynamics, rather than through top-down cognitive strategies. The work is conceptually original, analytically rigorous, and clearly presented, deriving a simple yet powerful capacity law that extends the classical magic number framework from four to eight items under hierarchical chunking. The modeling results are further supported by preliminary empirical evidence from linguistic data and single-neuron recordings in the human medial temporal lobe, lending credibility to the proposed mechanism. Overall, this is a well-designed and well-written study that offers novel insights into the neural basis of working-memory capacity and establishes a solid bridge between theoretical modeling and experimental findings.

      Weaknesses:

      This study is conceptually strong and provides an elegant theoretical framework, but several aspects limit its biological and empirical grounding.

      First, the control mechanism that triggers and suppresses chunking clusters remains only schematically defined. The model assumes that chunking events are initiated by pauses, prosodic cues, or internal control signals, but does not specify the underlying neural circuits (e.g., prefrontal-basal ganglia loops) that could mediate this gating in the brain. Clarifying where, when, and how the chunking clusters are turned on and off will be critical for establishing biological plausibility.

      Second, the network representation is simplified: item clusters are treated as non-overlapping and homogeneous, whereas real cortical circuits exhibit overlapping representations, distinct excitatory/inhibitory populations, and multiscale local and long-range connectivity. It remains unclear how robust the proposed dynamics and derived capacity limit would be under such biologically realistic conditions.

      Third, the model relies heavily on SA operating over a timescale of several seconds, yet in vivo, the time constants and prevalence of SA can vary widely across cortical regions and neuromodulatory states. The stability of the predicted "new magic number" under realistic noise levels and modulatory influences therefore needs to be systematically evaluated.

    2. Reviewer #2 (Public review):

      Summary:

      This work extends a previous recurrent neural network model of activity-silent working memory to account for well-established findings from psychology and neuroscience suggesting that working memory capacity constraints can be partially overcome when stimuli can be organized into chunks. This is accomplished via the introduction of specialized chunking clusters of neurons to the original model. When these chunking clusters are activated by a cue (such as a longer delay between stimuli), they rapidly suppress recently active stimulus clusters. This makes these stimulus clusters available for later retrieval via a synaptic augmentation mechanism, thereby expanding the network's overall effective capacity. Furthermore, these chunking clusters can be arranged in a hierarchical fashion, where chunking clusters are themselves chunked by higher-level chunking clusters, further expanding the network's overall effective capacity to a new "magic number", 2^{C-1} (where C is the basic capacity without chunking). In addition to illustrating the basic dynamics of the model with detailed simulations (Figures 1 and 2), the paper also utilizes qualitative predictions from the model to (re-)analyze data collected in previous experiments, including single-unit recordings from human medial temporal lobe as well as behavioral findings from a classic study of human memory.
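
      As a quick arithmetic check on the capacity law quoted above, the following minimal sketch evaluates 2^{C-1} and builds the binary pairing that corresponds to a basic capacity of C = 4 (eight items grouped into four chunks, then into two meta-chunks). The function names `chunked_capacity` and `pair_up` and the letter labels are hypothetical and do not come from the manuscript; the sketch only illustrates the counting and does not implement the network dynamics.

```python
def chunked_capacity(basic_capacity: int) -> int:
    """Upper bound 2^(C-1) quoted in the review for basic capacity C."""
    if basic_capacity < 1:
        raise ValueError("basic_capacity must be a positive integer")
    return 2 ** (basic_capacity - 1)


def pair_up(units):
    """Group adjacent units into pairs (one level of binary chunking)."""
    return [tuple(units[i:i + 2]) for i in range(0, len(units), 2)]


# Illustrative item labels (an assumption; any eight distinct items would do).
items = list("ABCDEFGH")
chunks = pair_up(items)        # two-item pairs
meta_chunks = pair_up(chunks)  # pairs of pairs

print(chunked_capacity(4))     # bound for C = 4
print(len(chunks), len(meta_chunks))
```

      Under these assumptions, the number of items equals the bound for C = 4, and the two levels of pairing reproduce the four-chunk, two-meta-chunk structure described for the optimal hierarchy.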

      Strengths:

      The writing and figures are very clear, and the general topic is relevant to a broad interdisciplinary audience. The work is strongly theory-driven, but also makes some effort to engage with existing data from two empirical studies. The basic results showcasing how chunking can be achieved in an activity-silent working memory model via suppression and synaptic augmentation dynamics are interesting. Furthermore, we agree with the authors that the derivation of their new "magic number" is relatively general and could apply to other models, so those findings in particular may be of interest even to researchers using different modeling frameworks.

      Weaknesses:

      (1) Very important aspects of the model are assumed / hard-coded, raising the concern that it relies too much on an external controller, and that it would therefore be difficult to implement the same principles in a fully behaving model responsible for producing its own outputs from a sequence of stimuli (i.e., without a priori knowledge of the structure of incoming sequences).

      (i) One such aspect is the use of external chunking cues provided to the model at critical times to activate the chunking clusters. The simulations reported in the paper were conducted in a setting where signals to chunk are conveniently indicated by longer delays between stimuli. In this case, it is not difficult to imagine how an external component could detect the presence of such a delay and activate a chunking cluster in response. However, in order for the model to be more broadly applicable to different memory tasks that elicit chunking-related phenomena, a more general-purpose detector would be required (see further comments below and alternative models).

      (ii) Relatedly, and as the authors acknowledge in the discussion, the network relies on a pretty sophisticated external controller that decides when the individual chunking clusters are activated or deactivated during readout/retrieval. This seems especially complex in the hierarchical case. How might a network decide which chunking/meta-chunking clusters are activated/deactivated in which order? This was hard-coded in their simulations, but we imagine that it would be difficult to implement a general solution to this problem, especially in cases where there is ambiguity about which stimuli should be chunked, or where the structure of the incoming sequence is not known in advance.

      (iii) One of the central mechanisms of the model is the rapid synaptic plasticity in the inhibitory connections responsible for binding chunking clusters to their corresponding stimulus clusters. This mechanism again appears to have been hard-coded in the main simulations. Although we appreciate that the authors worked on one possible way that this could be implemented (Methods section D, Supplementary Figure S2), in the end, their solution seems to rely on precisely fine-tuning the timing with which stimuli are presented - a factor that seems unlikely to matter very much in humans/animals. This stands in contrast with models of working memory that rely on persistent activity, which are more robust to changes in timing. Note that we do not discount the possibility of activity-silent WM, and indeed it should be studied in its own right, but it is then even more important to highlight which of its features are dependent on the time constants, etc.

      (2) Another key shortcoming of this work is its limited direct engagement with empirical evidence and alternative computational accounts of chunking in WM. Although the efforts to re-analyze existing empirical results in light of the new predictions made by the model are commendable, in the end, we think they fall short of being convincing. As noted above, the model doesn't actually perform the same two tasks used in the human experiments, so direct quantitative comparisons between the model and human behavior or neural data are not possible. Instead, the authors rely on isolating two qualitative predictions of the model - the "dip" and "ramp" phenomena observed after a chunking cluster is activated (Figure 3), and the new magic number for effective capacity derived from the model in the case where stimuli are chunkable, which approximately converges with human recall performance in a memory study (Figure 4). Below, we highlight some specific issues related to these two sets of analyses, but the larger point is that if the model is making a commitment about how these neural mechanisms relate to behavioral phenomena, it would be important to test if the model can produce the behavioral patterns of data in experimental paradigms that have been extensively used to characterize those phenomena. For example, modern paradigms characterizing capacity limits have been more careful to isolate the contributions of WM per se (whereas the original magic number 7 is now thought to reflect a combination of episodic and working memory; see Cowan 2010). There are several existing models that more directly engage with this literature (e.g., Edin et al., 2009; Matthey et al., 2015; Nassar et al., 2018; Soni & Frank, 2025; Swan & Wyble, 2014; van den Berg et al., 2014; Wei et al., 2012), some of which also account for chunking-related phenomena (e.g., Wei et al., 2012; Nassar et al., 2018; Panichello et al., 2019; Soni & Frank, 2025).
      A number of related proposals suggest that WM capacity limits emerge from fundamentally different mechanisms than the one considered here - for example, content-related interference (Bays, 2014; Ma et al., 2014; Schurgin et al., 2020), or limitations in the number of content-independent pointers that can be deployed at a given time (Awh & Vogel, 2025), and/or the inherent difficulty of learning this binding problem (Soni & Frank, 2025). We think it would be worth discussing how these ideas could be considered complementary or alternatives to the ones presented here.

      (i) Single unit recordings. We found it odd that the authors chose to focus on evidence from single-unit recordings in the medial temporal lobe from a study focused on episodic memory. It was unclear how exactly these data are supposed to relate to their proposal. Is the suggestion that a mechanism similar to the boundary neurons might be operative in the case of working memory over shorter timescales in WM-related areas such as the prefrontal cortex, or that their chunking mechanism may relate not only to working memory but also to episodic memory in the medial temporal lobe?

      (ii) N-gram memory experiment. Our main complaint about the analysis of the behavioral data from the human memory study (Figure 4) is that the model clearly does not account for the main effect observed in that study - namely, the better recall observed for higher-order n-gram approximations to English. We acknowledge that this was perhaps not the main point of the analysis (which related more to the prediction about the absolute capacity limit M*), but it relates to a more general criticism that the model cannot account for chunking behavior associated with statistical learning or semantic similarity. Most of the examples used in the introduction and discussion are of this kind (e.g., expressions such as "Oh my God" or "Easier said than done", etc.). However, the chunking mechanism of the model should not have any preference for segmenting based on statistical regularities or semantic similarity - it should work just as well if statistical anomalies or semantic dissimilarity were used as external chunking cues. In our view, these kinds of effects are likely to relate to the brain's use of distributed representations that can capture semantic similarity and learn statistical regularities in the environment. Although these kinds of effects may be beyond the scope of this model, some effort could be made to highlight this in the discussion. But again, more generally, the paper would be more compelling if the model were challenged to simulate more modern experimental paradigms aimed at testing the nature of capacity limits in WM, or chunking, etc.

      (iii) There are a number of other empirical phenomena that we're not sure the model can explain. In particular, one of the hallmarks of WM capacity limits is a recency bias, where people are more likely to remember the most recent items at the expense of items presented prior to that (Oberauer et al., 2012). [There are also studies showing primacy effects in addition to recency effects, but the primacy effects are generally attributed to episodic rather than working memory - for example, introducing a distractor task abolishes the recency but not primacy effect]. But the current model seems to make the opposite prediction: when the stimuli exceed its base capacity, it appears to forget the most recent stimuli rather than the earliest ones (Figure 1d). This seems to result from the number of representations that can be reactivated within a cycle and thus seems inherent to the dynamics of the model, but the authors can clarify if, instead, it depends on the particular values of certain parameters. (In contrast, this recency effect is captured in other models with chunking capabilities based on attractive dynamics and/or gating mechanisms - e.g., Boboeva et al., 2023; Soni & Frank, 2025). Relatedly, we're not sure if the model could account for the more recent finding that recall is specifically enhanced when chunks occur in early serial positions compared to later ones (Thalmann, Souza, & Oberauer, 2019).

    3. Reviewer #3 (Public review):

      The paper presents a synaptic mechanism for chunking in working memory, extending previous work of the last author by introducing specialized "chunking clusters", neural populations that can dynamically segment incoming items into chunks. The idea is that this enables hierarchical representations that increase the effective capacity of working memory. They also derive a theoretical bound for working memory capacity based on this idea, suggesting that hierarchical chunking expands the number of retrievable items beyond the basic WM capacity. Finally, they present neural and behavioral data related to their hypothesis.

      Strengths

      A major strength of the paper is its clear theoretical ambition of developing a mechanistic model of working memory chunking.

      Weaknesses

      Despite being inspired by biophysical mechanisms (short-term synaptic plasticity with different time constants), the model is "cartoonish". It is unclear whether the proposed mechanism would work reliably in the presence of noise and non-zero background activity or in a more realistic implementation (e.g., a spiking network).

      As far as I know, there is no evidence for cyclic neural activation patterns, which are supposed to limit WM capacity (such as in Figure 1d). In fact, I believe there is no evidence for population bursts in WM, which are a crucial ingredient of the model. For example, Panichello et al. (2024) found evidence for periods during which working memory decoding accuracy decreases, but no population bursts were observed in their data. In brief, my critique is that including some biophysical mechanism in an abstract model does not make the model plausible per se.

      It is claimed that "our proposed chunking mechanism applies to both the persistent-activity and periodic-activity regimes, with chunking clusters serving the same function in each", but this is not shown. If the results and model predictions are the same, irrespective of whether WM is activity-silent or persistent, I suggest highlighting this more and including the corresponding simulations.

      The empirical validations of the model are weak. The single-unit analysis is purely descriptive, without any statistical quantification of the apparent dip-ramp pattern. I agree that the dip-ramp pattern may be consistent with the proposed model, but I don't believe that this pattern is a specific prediction of the proposed model. It seems just to be an interesting observation that may be compatible with several network mechanisms involving some inhibition and a rebound.

      Moreover, the reanalyses of n-gram behavioral data do not constitute a mechanistic test of the model. The "new magic number" depends strongly on structural assumptions about how chunking operates, and it is unclear whether human working memory uses the specific hierarchical scheme required to achieve the predicted limit.

      The presentation of the modeling results is highly compressed into two figures and is rather hard to follow. Plotting the activity of different neural clusters in separate subplots or as heatmaps (x-axis time, y-axis neural population, color = firing rate) would help to clarify the dynamics (Figure 1d). Also, the control signals that activate the chunking clusters should be shown.
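
      To make the suggested heatmap layout concrete, here is a minimal sketch using NumPy and Matplotlib. The firing rates are synthetic toy data (each cluster bursts once per cycle at its own phase) and are not output of the reviewed model; the function names, the number of clusters, the cycle period, and the rate values are all assumptions chosen only to illustrate the axes.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def synthetic_cluster_rates(n_clusters=4, n_steps=400, period=100, seed=0):
    """Toy firing rates: one Gaussian burst per cluster per cycle (illustrative only)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)
    rates = np.zeros((n_clusters, n_steps))
    for k in range(n_clusters):
        phase = k * period / n_clusters
        # Signed cyclic distance from the burst centre of cluster k.
        d = ((t - phase + period / 2) % period) - period / 2
        rates[k] = 20.0 * np.exp(-0.5 * (d / 5.0) ** 2)
    rates += rng.normal(0.0, 0.5, size=rates.shape)  # small additive noise
    return np.clip(rates, 0.0, None)                 # rates cannot be negative


def plot_rate_heatmap(rates, dt_ms=1.0):
    """Heatmap with time on x, neural population on y, and firing rate as color."""
    fig, ax = plt.subplots(figsize=(8, 3))
    image = ax.imshow(
        rates,
        aspect="auto",
        origin="lower",
        interpolation="nearest",
        extent=[0, rates.shape[1] * dt_ms, -0.5, rates.shape[0] - 0.5],
    )
    ax.set_xlabel("Time (ms)")
    ax.set_ylabel("Neural population")
    ax.set_yticks(range(rates.shape[0]))
    fig.colorbar(image, ax=ax, label="Firing rate (Hz)")
    return fig, ax


rates = synthetic_cluster_rates()
fig, ax = plot_rate_heatmap(rates)
fig.savefig("cluster_rates_heatmap.png", dpi=150)
```

      In a real figure, the chunking clusters and their control signals could be added as extra rows of the same matrix, or as a separate panel sharing the time axis, so that their timing relative to the item clusters is visible.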

      Overall, the theoretical proposal is interesting, but its empirical grounding and biological plausibility need to be substantially reinforced.