  1. Dec 2024
    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this revision, the authors significantly improved the manuscript. They now address some of my concerns. Specifically, they show the contribution of end-effects on spreading the inputs between dendrites. This analysis reveals greater applicability of their findings to cortical cells, with long, unbranching dendrites, than to other neuronal types, such as Purkinje cells in the cerebellum.

      They now explain better the interactions between calcium and voltage signals, which I believe improves the take-away message of their manuscript. They modified and added new figures that helped to provide more information about their simulations.

      However, some of my points remain valid. Figure 6 shows depolarization of ~5 mV from -75 mV. This weak depolarization would not effectively recruit nonlinear activation of NMDARs. In their paper, Branco and Häusser (2010) showed depolarizations of ~10-15 mV.

      More importantly, the signature of NMDAR activation is the prolonged plateau potential and activation at more depolarized resting membrane potentials (their Figure 4). Thus, despite including NMDARs in the simulation, the authors do not model functional recruitment of these channels. Their simulation is thus equivalent to AMPA-only drive, which can indeed summate somewhat nonlinearly.

      In the current study, we used short sequences of 5 inputs, since the convergence of longer sequences is extremely unlikely in the network configurations we have examined. This resulted in smaller EPSP amplitudes of ~5 mV (Figure 6 - Supplement 2A, B). Longer sequences containing 9 inputs resulted in larger somatic depolarizations of ~10 mV (Figure 6 - Supplement 2E, F). Although we had modified the Branco, Clark, and Häusser (2010) model to remove the jitter in the arrival times of inputs and had slightly changed the locations of stimulus delivery on the dendrite, we saw similar amplitudes when we tested a 9-input sequence using the code published by Branco, Clark, and Häusser (2010) (Figure 6 - Supplement 2I, J). In all the cases we tested (5-input sequence, 9-input sequence, and 9-input sequence with the Branco, Clark, and Häusser (2010) code repository), removal of NMDA synapses lowered both the somatic EPSPs (Figure 6 - Supplement 2C, D, G, H, K, L) and the selectivity, measured as the difference between the EPSPs generated for inward and outward stimulus delivery (Figure 6 - Supplement 2M, N, O). Further, monitoring the voltage along the dendrite for a sequence of 5 inputs showed dendritic EPSPs in the range of 20-45 mV (Figure 6 - Supplement 2P, Q), which dropped notably, to 10-25 mV, when NMDA synapses were abolished (Figure 6 - Supplement 2R, S). Thus, even sequences containing as few as 5 inputs were capable of engaging the NMDA-mediated nonlinearity to show sequence selectivity, although the selectivity was not as strong as in the case of 9 inputs.

      Reviewer #1 (Recommendations for the authors):

      Minor points:

      Figure 8, what does the scale in A represent? I assume it is voltage, but there are no units. Figure 8, C, E, G, these are unconventional units for synaptic weights, usually, these are given in nS / per input.

      We have corrected these. The scalebar in 8A represents membrane potential in mV. The units of 8C,E,G are now in nS.

      Reviewer #2 (Public Review):

      Summary:

      If synaptic input is functionally clustered on dendrites, nonlinear integration could increase the computational power of neural networks. But this requires the right synapses to be located in the right places. This paper aims to address the question of whether such synaptic arrangements could arise by chance (i.e. without special rules for axon guidance or structural plasticity), and could therefore be exploited even in randomly connected networks. This is important, particularly for the dendrites and biological computation communities, where there is a pressing need to integrate decades of work at the single-neuron level with contemporary ideas about network function.

      Using an abstract model where ensembles of neurons project randomly to a postsynaptic population, back-of-envelope calculations are presented that predict the probability of finding clustered synapses and spatiotemporal sequences. Using data-constrained parameters, the authors conclude that clustering and sequences are indeed likely to occur by chance (for large enough ensembles), but require strong dendritic nonlinearities and low background noise to be useful.

      Strengths:

      (1) The back-of-envelope reasoning presented can provide fast and valuable intuition. The authors have also made the effort to connect the model parameters with measured values. Even an approximate understanding of cluster probability can direct theory and experiments towards promising directions, or away from lost causes.

      (2) I found the general approach to be refreshingly transparent and objective. Assumptions are stated clearly about the model and statistics of different circuits. Along with some positive results, many of the computed cluster probabilities are vanishingly small, and noise is found to be quite detrimental in several cases. This is important to know, and I was happy to see the authors take a balanced look at conditions that help/hinder clustering, rather than to just focus on a particular regime that works.

      (3) This paper is also a timely reminder that synaptic clusters and sequences can exist on multiple spatial and temporal scales. The authors present results pertaining to the standard `electrical' regime (~50-100 µm, <50 ms), as well as two modes of chemical signaling (~10 µm, 100-1000 ms). The senior author is indeed an authority on the latter, and the simulations in Figure 5, extending those from Bhalla (2017), are unique in this area. In my view, the role of chemical signaling in neural computation is understudied theoretically, but research will be increasingly important as experimental technologies continue to develop.

      Weaknesses:

      (1) The paper is mostly let down by the presentation. In the current form, some patience is needed to grasp the main questions and results, and it is hard to keep track of the many abbreviations and definitions. A paper like this can be impactful, but the writing needs to be crisp, and the logic of the derivation accessible to non-experts. See, for instance, Stepanyants, Hof & Chklovskii (2002) for a relevant example.

      It would be good to see a restructure that communicates the main points clearly and concisely, perhaps leaving other observations to an optional appendix. For the interested but time-pressed reader, I recommend starting with the last paragraph of the introduction, working through the main derivation on page 7, and writing out the full expression with key parameters exposed. Next, look at Table 1 and Figure 2J to see where different circuits and mechanisms fit in this scheme. Beyond this, the sequence derivation on page 15 and biophysical simulations in Figures 5 and 6 are also highlights.

      We appreciate the reviewer's suggestions. We have tightened the flow of the introduction. We understand that the abbreviations and definitions are challenging and have therefore provided intuitions and summaries of the equations discussed in the main text.

      Cluster calculations

      Our approach is to ask how likely it is that a given set of inputs lands on a short segment of dendrite, and then scale it up to all segments on the entire dendritic length of the cell.

      Thus, the probability of occurrence of groups that receive connections from each of the M ensembles (PcFMG) is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative zone-length with respect to the total dendritic arbor (Z/L) and the number of ensembles (M).
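
      The dependence on p, N, Z/L and M can be illustrated with a minimal Python sketch. It assumes that synapses from each ensemble land on the dendrite as a Poisson process and that the dendrite divides into L/Z independent zones. The function names and the parameter values in the usage lines are illustrative placeholders, not the expression or values used in the manuscript.

```python
import math


def p_zone_full_group(p, n_ensemble, zone_frac, m_ensembles):
    """Probability that one zone gets at least one synapse from each ensemble.

    Assumption: the synapse count per ensemble in a zone is Poisson with
    mean p * N * (Z / L).
    """
    mean_per_zone = p * n_ensemble * zone_frac
    p_at_least_one = -math.expm1(-mean_per_zone)  # 1 - exp(-mean)
    return p_at_least_one ** m_ensembles


def p_group_on_neuron(p, n_ensemble, zone_frac, m_ensembles):
    """Probability that at least one of the L/Z zones holds a full group.

    Assumption: zones are treated as independent, non-overlapping segments.
    """
    p_zone = p_zone_full_group(p, n_ensemble, zone_frac, m_ensembles)
    if p_zone >= 1.0:
        return 1.0
    n_zones = 1.0 / zone_frac
    # 1 - (1 - p_zone) ** n_zones, written to stay accurate for tiny p_zone
    return -math.expm1(n_zones * math.log1p(-p_zone))


# Hypothetical parameters: p = 0.2, N = 100, Z/L = 0.001
for m in (2, 3, 5):
    print(m, p_group_on_neuron(0.2, 100, 0.001, m))
```

      Under these assumptions the probability falls steeply as M grows, and rises with larger ensembles or denser connectivity.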

      Sequence calculations

      Here we estimate the likelihood of the first ensemble input arriving anywhere on the dendrite, and ask how likely it is that succeeding inputs of the sequence would arrive within a set spacing.

      Thus, the probability of occurrence of sequences that receive sequential connections (PcPOSS) from each of the M ensembles is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative window size with respect to the total dendritic arbor (Δ/L) and the number of ensembles (M).
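
      A comparable sketch for sequences follows the reasoning above: count the expected first-ensemble synapses anywhere on the dendrite, then require each of the remaining M - 1 ensembles to place at least one synapse in the next window of relative size Δ/L. The Poisson assumption, the treatment of candidate sequences as independent, and all names and values are illustrative choices rather than the manuscript's derivation.

```python
import math


def p_sequence_on_neuron(p, n_ensemble, window_frac, m_ensembles):
    """Approximate probability of at least one ordered sequence on a neuron.

    Assumptions (illustrative): Poisson synapse placement, independent
    candidate sequences, one window of relative size window_frac per step.
    """
    # Expected number of first-ensemble synapses on the whole dendrite
    expected_starts = p * n_ensemble
    # Probability that a later ensemble has >= 1 synapse in a given window
    p_next = -math.expm1(-p * n_ensemble * window_frac)
    # Expected number of complete ordered sequences on the neuron
    expected_sequences = expected_starts * p_next ** (m_ensembles - 1)
    # Probability of at least one sequence (Poisson approximation)
    return -math.expm1(-expected_sequences)


# Hypothetical parameters: p = 0.2, N = 1000, window/L = 0.001
for m in (3, 5):
    print(m, p_sequence_on_neuron(0.2, 1000, 0.001, m))
```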

      (2) I wonder if the authors are being overly conservative at times. The result highlighted in the abstract is that 10/100000 postsynaptic neurons are expected to exhibit synaptic clustering. This seems like a very small number, especially if circuits are to rely on such a mechanism. However, this figure assumes the convergence of 3-5 distinct ensembles. Convergence of inputs from just 2 ensembles would be much more prevalent, but still advantageous computationally. There has been excitement in the field about experiments showing the clustering of synapses encoding even a single feature.

      We agree that short clusters of two inputs would be far more likely. We focused our analysis on clusters with three or more ensembles for the following reasons:

      (1) The signal-to-noise ratio in two-ensemble clusters was very poor, as the likelihood of noise clusters is high.

      (2) It is difficult to trigger nonlinearities with very few synaptic inputs.

      (3) At the ensemble sizes we considered (100 for clusters, 1000 for sequences), clusters arising from just two ensembles would occur with high probability on all neurons in a network (~50% in cortex; see PcFMG in figures below). Such dense neural representations are difficult for downstream networks to decode (Foldiak 2003).

      However, in the presence of ensembles containing fewer neurons or when the connection probability between the layers is low, short clusters can result in sparse representations (Figure 2 - Supplement 2). Arguments 1 and 2 hold for short sequences as well.

      (3) The analysis supporting the claim that strong nonlinearities are needed for cluster/sequence detection is unconvincing. In the analysis, different synapse distributions on a single long dendrite are convolved with a sigmoid function and then the sum is taken to reflect the somatic response. In reality, dendritic nonlinearities influence the soma in a complex and dynamic manner. It may be that the abstract approach the authors use captures some of this, but it needs to be validated with simulations to be trusted (in line with previous work, e.g. Poirazi, Brannon & Mel, (2003)).

      We agree that multiple factors might affect the influence of nonlinearities on the soma. The key goal of our study was to understand the role played by random connectivity in giving rise to clustered computation. Since simulating a wide range of connectivity and activity patterns in a detailed biophysical model was computationally expensive, we analyzed the exemplar detailed models for nonlinearity separately (Figures 5, 6, and new Figure 8), and then used our abstract models as a proxy for understanding population dynamics. A complete analysis of the role played by morphology, channel kinetics and the effect of branching requires an in-depth study of its own, and some of these questions have already been tackled in earlier work (Poirazi, Brannon, and Mel 2003; Branco, Clark, and Häusser 2010; Bhalla 2017). However, in the revision, we have implemented a single model which incorporates the range of ion-channel, synaptic and biochemical signaling nonlinearities that we discuss in the paper (Figure 8 and Figure 8 - Supplement 1, 2, 3). We use this model to demonstrate all three forms of sequence and grouped computation considered in the study, where the only difference is in the stimulus pattern and the separation of time-scales inherent in the stimuli.
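
      For readers unfamiliar with this kind of abstract proxy, the following Python sketch gives one loose reading of it: count the active synapses in a sliding window along a single dendrite, pass each local count through a sigmoid, and sum the result as a stand-in for the somatic response. The window size, threshold, steepness and synapse positions are hypothetical, and the code is not the model used in the manuscript.

```python
import math


def sigmoid(x, threshold, steepness):
    """Saturating nonlinearity applied to the local synapse count."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))


def somatic_proxy(positions_um, length_um=100.0, window_um=10.0,
                  step_um=1.0, threshold=3.0, steepness=4.0):
    """Sum of sigmoid-transformed local input counts along the dendrite."""
    total = 0.0
    n_steps = int(length_um / step_um)
    for i in range(n_steps):
        start = i * step_um
        # Number of active synapses inside the window [start, start + window)
        local = sum(1 for x in positions_um if start <= x < start + window_um)
        total += sigmoid(local, threshold, steepness)
    return total


# Same number of inputs, different spatial arrangement (hypothetical values)
clustered = [50.0, 52.0, 54.0, 56.0]
dispersed = [10.0, 35.0, 60.0, 85.0]
print(somatic_proxy(clustered), somatic_proxy(dispersed))
```

      With a steep sigmoid the clustered arrangement gives the larger summed output. Lowering the steepness or threshold in this toy shrinks that gap, which is the sense in which detection depends on the strength of the nonlinearity.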

      (4) It is unclear whether some of the conclusions would hold in the presence of learning. In the signal-to-noise analysis, all synaptic strengths are assumed equal. But if synapses involved in salient clusters or sequences were potentiated, presumably detection would become easier? Similarly, if presynaptic tuning and/or timing were reorganized through learning, the conditions for synaptic arrangements to be useful could be relaxed. Answering these questions is beyond the scope of the study, but there is a caveat there nonetheless.

      We agree with the reviewer. If synapses receiving connections from ensembles had stronger weights, detection would become easier. Dendritic spikes arising from clustered inputs have been implicated in local cooperative plasticity (Golding, Staff, and Spruston 2002; Losonczy, Makara, and Magee 2008). Further, plasticity-related proteins synthesized at a synapse undergoing L-LTP can diffuse to neighboring weakly co-active synapses and thereby mediate cooperative plasticity (Harvey et al. 2008; Govindarajan, Kelleher, and Tonegawa 2006; Govindarajan et al. 2011). Thus, if clusters of synapses were likely to be co-active, they could further engage these local plasticity mechanisms, which could potentiate them without potentiating synapses that are activated by background activity. This would depend on the activity correlation between synapses receiving ensemble inputs within a cluster and those activated by background activity. We have mentioned some of these ideas in a published opinion paper (Pulikkottil, Somashekar, and Bhalla 2021). In the current study, we wanted to understand whether interesting computations could still emerge even in the absence of specialized connection rules. Thus, we focused on asking whether clustered or sequential convergence could arise in a purely randomly connected network, with the most basic set of assumptions. We agree that an analysis of how selectivity evolves with learning would be an interesting topic for further work.

      References

      • Bhalla, Upinder S. 2017. “Synaptic Input Sequence Discrimination on Behavioral Timescales Mediated by Reaction-Diffusion Chemistry in Dendrites.” Edited by Frances K Skinner. eLife 6 (April):e25827. https://doi.org/10.7554/eLife.25827.

      • Branco, Tiago, Beverley A. Clark, and Michael Häusser. 2010. “Dendritic Discrimination of Temporal Input Sequences in Cortical Neurons.” Science (New York, N.Y.) 329 (5999): 1671–75. https://doi.org/10.1126/science.1189664.

      • Foldiak, Peter. 2003. “Sparse Coding in the Primate Cortex.” The Handbook of Brain Theory and Neural Networks. https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/2994/FoldiakSparseHBTNN2e02.pdf?sequence=1.

      • Golding, Nace L., Nathan P. Staff, and Nelson Spruston. 2002. “Dendritic Spikes as a Mechanism for Cooperative Long-Term Potentiation.” Nature 418 (6895): 326–31. https://doi.org/10.1038/nature00854.

      • Govindarajan, Arvind, Inbal Israely, Shu-Ying Huang, and Susumu Tonegawa. 2011. “The Dendritic Branch Is the Preferred Integrative Unit for Protein Synthesis-Dependent LTP.” Neuron 69 (1): 132–46. https://doi.org/10.1016/j.neuron.2010.12.008.

      • Govindarajan, Arvind, Raymond J. Kelleher, and Susumu Tonegawa. 2006. “A Clustered Plasticity Model of Long-Term Memory Engrams.” Nature Reviews Neuroscience 7 (7): 575–83. https://doi.org/10.1038/nrn1937.

      • Harvey, Christopher D., Ryohei Yasuda, Haining Zhong, and Karel Svoboda. 2008. “The Spread of Ras Activity Triggered by Activation of a Single Dendritic Spine.” Science (New York, N.Y.) 321 (5885): 136–40. https://doi.org/10.1126/science.1159675.

      • Losonczy, Attila, Judit K. Makara, and Jeffrey C. Magee. 2008. “Compartmentalized Dendritic Plasticity and Input Feature Storage in Neurons.” Nature 452 (7186): 436–41. https://doi.org/10.1038/nature06725.

      • Poirazi, Panayiota, Terrence Brannon, and Bartlett W. Mel. 2003. “Pyramidal Neuron as Two-Layer Neural Network.” Neuron 37 (6): 989–99. https://doi.org/10.1016/S0896-6273(03)00149-1.

      • Pulikkottil, Vinu Varghese, Bhanu Priya Somashekar, and Upinder S. Bhalla. 2021. “Computation, Wiring, and Plasticity in Synaptic Clusters.” Current Opinion in Neurobiology, Computational Neuroscience, 70 (October):101–12. https://doi.org/10.1016/j.conb.2021.08.001.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Reviewer #1 was very appreciative of our results and commented “This is a novel result in ferredoxin and a significant contribution to the field”. We are very honored and pleased.

      Reviewer #2:

      (1) Changing the nomenclature of the models investigated to include the oxidation state being discussed. As they are now (CM, CMNA, etc), multiple re-reads were required to ascertain which redox state was being discussed for a particular model in a given section of the text. Appending "Ox" or "Red" for oxidized or reduced would be sufficient. 

      As you indicated there are several nomenclatures to distinguish the model systems in the text. On the other hand, the main issue discussed in the text is the ionization potential (IP), which is calculated by the difference in energies between oxidized and reduced states for each model. In other words, a discussion of the IP value on each model includes both the “Ox” and “Red” energies. In order to clarify the relationship between the nomenclature of models and redox states, we added sentences below.

      “Note that the IP value is obtained for each model by calculating both the Ox and Red state energies of the model.” (lines 195-196).

      On the other hand, we must specify the charge state when the geometry optimization is performed for CM and CMH models. Therefore, we revised the sentence as follows.

      “The decrease in |IP| value indicates that the relative stability of the Red state is suppressed compared with the CMH but is significantly larger than the CM, suggesting the importance of the protonation of Asp64 (Fig. S2B). 

      To consider the effect of the structural change caused by the redox on the IP, geometrical optimization of the 4Fe-4S core was performed for the CM (Red) and CMH (Red) models using the same level of theory to the single-point calculations. The optimized Cartesian coordinates are summarized in Table S3. As illustrated in Fig. S2A, the IP values of CM and CMH change from –3.27 to –2.38 eV (|ΔIP| = 0.89 eV), and from –1.06 to –0.19 eV (|ΔIP| = 0.87 eV), respectively, before and after the geometrical optimization.” (lines 224-232)
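
      As a quick arithmetic check on the shifts quoted above, the following minimal sketch recomputes the absolute change in IP for each model from the before/after values given in the revised sentence. The helper name `ip_shift` is hypothetical and used only for illustration; no assumption is made here about the sign convention of IP itself.

```python
def ip_shift(ip_before_ev, ip_after_ev):
    """Absolute change in IP (eV) between two geometries, rounded to 2 decimals."""
    return round(abs(ip_after_ev - ip_before_ev), 2)

# IP values (eV) before and after geometry optimization, as quoted in the text
cm_shift = ip_shift(-3.27, -2.38)    # CM model
cmh_shift = ip_shift(-1.06, -0.19)   # CMH model

print(f"CM:  {cm_shift:.2f} eV")
print(f"CMH: {cmh_shift:.2f} eV")
```

      Both models shift by nearly the same amount, which matches the quoted values of 0.89 and 0.87 eV.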

      (2) In addition to the very thorough DFT investigation of the different spin and charge combinations, did the authors try a broken-symmetry calculation to obtain the ground state description of the FeS cluster? Given the ubiquity of this approach in other FeS cluster studies, it was surprising that this approach was not taken here. Granted, the DFT investigation of each possible combination is sufficiently thorough and need not be redone. 

      Thank you for your comments. The term “spin-unrestricted method” used in the manuscript is synonymous with “broken-symmetry method”. To emphasize this, we revised the manuscript as follows.

      “All calculations were performed by using the spin-unrestricted (broken-symmetry) hybrid DFT method with the B3LYP functional set. As the basis set, 6-31G* and 6-31+G* were used for [Fe, C, N, O, H] and [S] atoms, respectively, for the IP calculations.” (Line 451)

      (3) Line 161 "an" to "a" 

      We corrected the mistake. Thank you so much. (Line 161)

      (4) Figure 4A seems a bit odd. Why do the traces eclipse the y-axis? And the traces between 330 and 370 nm are much noisier and appear thicker than the rest of the plot. Is this an issue with the monochromator grating used in wavelength selection? Reducing the thickness of the individual traces may help the data presentation in this figure. Also, the arrows on the plot have an opaque white background. Can this be removed so that the arrows do not eclipse the traces in the plot? 

      The spectrum in Fig. 4A did look odd, and the figure has been revised to improve its appearance. (We have also corrected E53A in Figure 5B.) The reviewer also pointed out that “the traces between 330 and 370 nm are much noisier”. As you suggested, we are struggling with noise caused by the grating (or a motor malfunction) of the monochromator. Once the monochromator is repaired and a smooth spectrum is obtained, we will upload further revisions.

      (5) Figure S9 is a very nice schematic illustrating the general findings of the study. Can this be moved to the main text?

      Thank you for your helpful comment. Accordingly, Fig. S9 and its legend have been moved to the main text. (Lines 675-680)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This manuscript by Bai et al concerns the expression of Scleraxis (Scx) by muscle satellite cells (SCs) and the role of that gene in regenerative myogenesis. The authors report the expression of this gene associated with tendon development in satellite cells. Genetic deletion of Scx in SCs impairs muscle regeneration, and the authors provide evidence that SCs deficient in Scx are impaired in terms of population growth and cellular differentiation. Overall, this report provides evidence of the role of this gene, unexpectedly, in SC function and adult regenerative myogenesis.

      We appreciate the comments and thank her/him for the support.

      There are a few minor points of concern.

      (1) From the data in Figure 1, it appears that all of the SCs, assessed both in vitro and in vivo, express Scx. The authors refer to a scRNA-seq dataset from their lab and one report from mdx mouse muscle that also reveals this unexpected gene expression pattern. Has this been observed in many other scRNA-seq datasets? If not, it would be important to discuss potential explanations as to why this has not been reported previously.

      Thanks for this question regarding data in Fig.1. We did initially use immunofluorescence staining of Pax7 and GFP on muscle sections and primary myoblast cultures prepared from Tg-ScxGFP mice to conclude that Scx was expressed in satellite cells (SCs). In addition to the cited mdx RNA-seq data, we have included a re-analysis of a published scRNA-seq data set in Fig.2E (Dell'Orso et al., Development, 2019), and our own scRNA-seq data (Fig.S5D, F). We have now re-examined an additional scRNA-seq data set of TA muscles at various regeneration time points (De Micheli et al., Cell Rep. 2020), in which Scx expression was detected in MuSC progenitors and mature muscle cells. We have added the De Micheli et al. reference and the re-analysis of that scRNA-seq data set for Scx expression as an additional panel in Fig. 2E, with accompanying text (p. 7, ln. 4-6). Thus, our immunostaining results are consistent with scRNA-seq data from our and two other independent scRNA-seq data sets.

      We think that Scx expression in the adult myogenic lineage was not previously reported mainly because its expression level was low, and might be dismissed as spurious detection. Additionally, detecting such low expression levels requires sophisticated detection methods with high capture efficiency. Previous studies have noted limitations in transcript capture or transcription factor dropout in 10x Genomics-based datasets (Lambert et al., Cell, 2018; Pokhilko et al., Genome Res., 2021). The most likely and straightforward reason is that Scx was simply not a focus in prior studies amid so many other genes of interest. We have now added this last explanation in the text (p.7, ln. 8-9), following the re-analyses of Scx expression in published scRNA-seq data sets.

      (2) A major point of the paper, as illustrated in Fig. 3, is that Scx-neg SCs fail to produce normal myofibers and renewed SCs following injury/regeneration. They mention in the text that there was no increased PCD by Caspase staining at 5 DPI. A failure of cell survival during the process of SC activation, proliferation, and cell fate determination (differentiation versus self-renewal) would explain most of the in vivo data. As such, this conclusion would seem to warrant a more detailed analysis in terms of at least one or two other time points and an independent method for detecting dead/dying cells (the in vitro data in Fig. 4F is also based on an assessment of activated Caspase to assess cell death). The in vitro data presented later in Fig. S4G, H do suggest an increase in cell loss during proliferative expansion of Scx-neg SCs. To what extent does cell loss (by whatever mechanism of cell death) explain both the in vivo findings of impaired regeneration and even the in vitro studies showing slower population expansion in the absence of Scx?

      We appreciate these constructive suggestions. Based on the number of available control and cKO animals, we were limited to one additional time point at 3 dpi to assess PCD by TUNEL in vivo. We were disappointed again to find no appreciable levels of PCD at 3 dpi by TUNEL (new Fig.S4I); thus, no quantifications were included. We also re-did the in vitro experiment using purified SCs and monitored PCD by staining for cleaved Caspase-3 using a validated tube of antibodies (positive staining after 6 h of treatment by 1 mM staurosporine of control and ScxcKO cells; included as new Fig. S4J and legend). We were pleased to find an increase in cleaved Caspase-3-stained cells, i.e. PCD, among Scx-cKO SCs at day 4 in culture, compared to the control. We have now replaced the old Fig. 4F with new Fig.4F and 4G to document PCD. We also provided new text/legend for these new data (p. 10, ln. 2-10; new legend for Fig. 4F and 4G).

      (3) I'm not sure I understand the description of the data or the conclusions in the section titled "Basement membrane-myofiber interaction in control and Scx cKO mice". Is there something specific to the regeneration from Scx-neg myogenic progenitors, or would these findings be expected in any experimental condition in which myogenesis was significantly delayed, with much smaller fibers in the experimental group at 5 DPI?

      We very much appreciate this comment. We agree that it is unlikely that there is anything specific about the regeneration from Scx-negative myogenic progenitors. Unfilled or empty ghost fibers (basement membrane remnants) are expected because of the small fibers and poor regeneration in the ScxcKO mice at 5 dpi. We have removed the subtitle and changed the content to describe an expected consequence rather than something special (p. 8, ln. 19-22).

      (4) The data presented in Fig. 4B showing differences in the purity of SC populations isolated by FACS depending on the reporter used are interesting and important for the field. The authors offer the explanation of exosomal transfer of Tdt from SCs to non-SCs. The data are consistent with this explanation, but no data are presented to support this. Are there any other explanations that the authors have considered and that could be readily tested?

      Thanks for highlighting this phenomenon. We struggled with the SC purity issue for a long time. The project started with the R26RtdT reporter, chosen because tdT’s strong, paraformaldehyde-resistant fluorescence (after fixation) aids visualization in vivo. Later, when we used the tdT signal to purify SCs by FACS, we found that only 80% of sorted tdT+ cells were Pax7+. We then switched to the R26RYFP reporter, with which we achieved much higher purity (95%) of SCs (Pax7+) by FACS. As such, we also repeated and confirmed many in vivo experimental results using the R26RYFP reporter (included in the manuscript). Due to the low purity of tdT+ SCs by FACS, we discontinued that mouse colony after we confirmed the superior utility of the R26RYFP reporter for SC isolation.

      We sincerely apologize for not being able to conduct further experiments to test this intriguing phenomenon. However, this issue has since been addressed and published by Murach et al., iScience, (2021). As in our experience, they found non-satellite mononuclear cells with tdT fluorescence after TMX treatment when SCs were isolated via FACS. To rule out off-target recombination or a technical artifact from tissue processing, they conducted extensive analyses. They found that the tdT+ mononuclear cells included fibrogenic cells (fibroblasts and FAPs), immune cells/macrophages, and endothelial cells. Additionally, they confirmed the significant potential of extracellular vesicle (EV)-mediated cargo transfer, which facilitates the transfer of full-length tdT transcript from lineage-marked Pax7+ cells to those mononuclear cells. We have modified the text to emphasize and acknowledge their contribution to this important point, and explained the difference between YFP and tdT reporter alleles in more detail (p.9, ln. 11-17).

      (5) The Cut&Run data of Fig. 6 certainly provide evidence of direct Scx targets, especially since the authors used a novel knock-in strain for analyses. The enrichment of E-box motifs provides support for the 207 intersecting genes (scRNA-seq and Cut&Run) being direct targets. However, the rationale elaborated in the final paragraph of the Results section proposing how 4 of these genes account for the phenotypes on the Scx-neg cells and tissues is just speculation, however reasonable. These are not data, and these considerations would be more appropriate in the Discussion in the absence of any validation studies.

      We agree with this comment and have moved speculations into the Discussion (p. 15, ln. 4-15, and from p. 18, ln. 4 to p. 19, ln. 4).

      Reviewer #2 (Public Review):

      Summary:

      Scx is a well-established marker for tenocytes, but the expression in myogenic-lineage cells was unexplored. In this study, the authors performed lineage-trace and scRNA-seq analyses and demonstrated that Scx is expressed in activated SCs. Further, the authors showed that Scx is essential for muscle regeneration using conditional KO mice and identified the target genes of Scx in myogenic cells, which differ from those of tendons.

      Strengths:

      Sometimes, lineage-trace experiments cause mis-expression and do not reflect the endogenous expression of the target gene. In this study, the authors carefully analyzed the unexpected expression of Scx in myogenic cells using some mouse lines and scRNA-seq data.

      We appreciate the comments and thank her/him for noting the strengths of our manuscript.

      Weaknesses:

      Scx protein expression has not been verified.

      We are aware of this weakness. We had previously used Western blotting (WB) using cultured SCs from control and ScxcKO mice, but did not detect endogenous Scx protein even in the control. In response to this comment, we have re-done several WB experiments using new lysates from control and ScxcKO SCs and two commercial antibodies: anti-Scx antibody 1 from Abcam (ab58655) and anti-Scx antibody 2 from Invitrogen (PA5-23943). These antibodies have been reported to detect endogenous Scx protein in tendon cells in Spang et al., BMC Musculoskelet Disord (2016) and  Bochon et al., Int J Stem Cells (2021). Despite our best efforts, we were not able to detect a reliable Scx band. We have also conducted immunofluorescence using these two antibodies. Still, we failed to detect a difference of staining signals between control and cKO SCs using these antibodies. Lastly, we conducted immunofluorescence using the ScxTy1 myoblasts and we did not find the staining signal coinciding with the Ty1 signal (by double staining). We have been very frustrated by not knowing what caused this technical difficulty in our hands. Given that these were negative data, we did not include them. However, we do hope that the combined data from scRNA-seq, ScxCreERT2 lineage-tracing, Tg-ScxGFP expression, and ScxTy1 knock-in together are deemed sufficient to make up for the deficiency of data for endogenous Scx protein in regenerative myogenic cells.

      Response to Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      p. 8: The text refers to Fig. 3I, but this should be Fig. 3H.

      We apologize for the confusion. Please note that by keeping all 14 dpi data in the same row, we placed Fig.3I at an unconventional/unexpected position, i.e., next to 3D &3E, and above 3F-H. We were aware that this unconventional placement could cause confusion, and it did. With that said, we have now re-arranged the subfigures (same data content) so that the updated Fig.3 contains subfigures in the expected and proper spatial order. We double-checked the figure referral in the text (p. 8, ln. 16-17) and the text is correct – just that the original Fig.3I should have been at the original Fig.3H position and that is now corrected.

      Reviewer #2 (Recommendations For The Authors):

      (1) Given that Scx binds to the E-box and regulates gene expression, it is of interest to know the relevance between MyoD and Scx. If possible, the reviewer recommends to include some discussions.

      Thanks for the comment. MyoD1 is a well-known transcription factor regulating myogenesis, whereas Scx is primarily studied in tenocytes and other connective tissues. We agree that our new findings deserve a discussion regarding the relevance between MyoD1 and Scx. We have added a description of their differences in the discussion and two new references (p.19, ln. 7-17).

      (2) Considering that Scx is a transcriptional factor, it is interesting that Scx-GFP was not detected in the nuclei of regenerated myofibers. Could the subcellular localization of Scx-GFP provide some insights into the function of Scx as a transcription factor during muscle regeneration?

      Tg-ScxGFP is a transgenic line generated by random insertion into the genome (Pryce et al., 2007; cited). The plasmid used for transgenesis was constructed by replacing most of Scx’s first exon with GFP, and including ~ 9Kb flanking regulatory sequences. As such, ScxGFP is not a fusion gene; rather, GFP expression is regulated by the Scx promoter and enhancer(s). This GFP reporter lacks a nuclear localization signal (NLS), hence it is mainly detected in the cytoplasm; some nuclear signal is detected, presumably due to GFP’s small size permitting passive diffusion into the nucleus. Thus, the GFP signal is used as a reporter for Scx expression, but GFP subcellular localization does not provide insight into Scx function per se. Conversely, ScxTy1/Ty1 is a knock-in allele created by fusing a triple-Ty1 tag (3XTy1) to the C-terminus of Scx, and we observed by immunofluorescent staining that Ty1 is located in the nucleus. We used the Ty1 epitope to carry out CUT&RUN experiments to gain insight into the function of Scx as a transcription factor.

      (3) Fig1D The number of arrows in the Merge image is not matched with others. In addition, the star mark in the Pax7 image is likely an error.

      Apologies. We have now corrected these errors in the revised Fig.1D.

      (4) FigS1A Is there only one myofiber shown in the dashed line in this image? It is unclear why only this myofiber is surrounded by the dashed line.

      The dashed line encircles a single fiber because it was not visible in the provided image. However, there are 3 fibers in this image. Because we did not immuno-stain for myofibers here, we circled one fiber for illustration. For clarity, we brightened the background (of the entire original images) so the background signals from myofiber boundaries are discernable without outlines.

      (5) FigS1B There was no overlapped DAPI staining in the Myogenin+ cell. DAPI-staining should be present in Myogenin+ cells because myogenin is located in the nucleus.

      Fig.S1B is immuno-staining for MyoD, and we marked one MyoD+DAPI+GFP+ cell/nucleus. Fig.S1C is immuno-staining for Myogenin, and we also marked one (cell/nucleus) that is triple positive.

      (6) The position of the asterisk for the ScxGFP in FigS1D is misaligned. In addition, the position is not matched with Fig1C. Because all myofibers are Scx-positive, it is strange that only one myofiber has an asterisk. The reviewer suggests removing the mark.

      Thank you for pointing out these errors. We have now corrected the misalignment and removed the unnecessary asterisk.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment 

      This study presents valuable experimental and numerical results on the motility of a magnetotactic bacterium living in sedimentary environments, particularly in environments of varying magnetic field strengths. The evidence supporting the claims of the authors is solid, although the statistical significance comparing experiments with the numerical work is weak. The study will be of interest to biophysicists interested in bacterial motility. 

      We thank the reviewers and editors for their careful reading and the constructive comments. With respect to the statement about weak statistical significance, we think that this statement mixes two separate issues, the significance of the difference between experiments at 0 and 50µT and the comparison of experiments with simulations. We have amended our manuscript to address both points as described below. The difference between the experiments at 0 and 50µT is indeed significant, and the discrepancy between experiments and simulations can be explained by unavoidable differences in the way we quantify bacterial throughput.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors present experimental and numerical results on the motility of Magnetospirillum gryphiswaldense MSR-1, a magnetotactic bacterium living in sedimentary environments. The authors manufactured microfluidic chips containing three-dimensional obstacles of irregular shape, that match the statistical features of the grains observed in the sediment via microcomputer tomography. The bacteria are furthermore subject to an external magnetic field, whose intensity can be varied. The key quantity measured in the experiments is the throughput ratio, defined as the ratio between the number of bacteria that reach the end of the microfluidic channel and the number of bacteria entering it. The main result is that the throughput ratio is non-monotonic and exhibits a maximum at magnetic field strength comparable with Earth's magnetic field. The authors rationalize the throughput suppression at large magnetic fields by quantifying the number of bacteria trapped in corners between grains. 

      Strengths: 

      While magnetotactic bacteria's general motility in bulk has been characterized, we know much less about their dynamics in a realistic setting, such as a disordered porous material. The micro-computer tomography of sediments and their artificial reconstruction in a microfluidic channel is a powerful method that establishes the rigorous methodology of this work. This technique can give access to further characterization of microbial motility. The coupling of experiments and computer simulations lends considerable strength to the claims of the authors, because the model parameters (with one exception) are directly measured in the experiments. 

      Weaknesses: 

      The main weakness of the manuscript pertains to the discussion of the statistical significance of the experimental throughput ratio, especially when comparing results at zero and 50 microtesla. The simulations seem to predict a stronger effect than seen in the experiments. The authors do not address this discrepancy. 

      We thank the reviewer for their positive assessment and the detailed constructive remarks. 

      The increase in bacterial throughput between 0 and 50 µT is indeed more pronounced in the simulations than in the experiments, partly because there is considerably more variability in the experimental data. We did two things to address this issue: (1) We performed an additional statistical test addressing the difference between the experimental results at 0 and 50 µT. Indeed, the difference is only weakly significant (in contrast to the difference of either to 500 µT). The increase is, however, consistent with the observation in the absence of obstacles in the channel, where we see a monotonic increase from 0 to 500 µT (Supp. Figure S5). We have added the test results in the caption of Fig. 3. (2) To address the difference between simulations and experiments, we added a section in Methods on how we determine the throughput and a short discussion in the Results section. The key points are that the initial condition is different in simulations and experiments and that the throughput is therefore quantified differently. This difference is due to experimental limitations: we cannot track bacteria through the whole channel and we wanted to avoid pushing them into the channel with fluid flow to avoid effects of flow on the results. As a consequence, bacteria continue to enter the IN region of the channel from the inlet during the experiment, while in the simulation, they all start at the beginning of the channel simultaneously. We expect this to mostly affect the case with diffusive transport (B=0).
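
      The response does not state which statistical test was used. As one generic illustration of how two small sets of throughput measurements might be compared, the sketch below implements a two-sided permutation test on the difference of means. The function name `permutation_p_value` and the numbers in the usage example are hypothetical placeholders, not data from the study.

```python
import random

def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the absolute difference of group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the measurements
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / n_a - sum(perm_b) / len(perm_b))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical placeholder values, NOT measurements from the manuscript
throughput_low_field = [0.10, 0.20, 0.15, 0.12, 0.18]
throughput_high_field = [0.90, 0.95, 0.85, 0.92, 0.88]
p_separated = permutation_p_value(throughput_low_field, throughput_high_field)
p_identical = permutation_p_value([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
print(p_separated, p_identical)
```

      A permutation test makes no distributional assumption, which can be convenient when only a handful of replicate channels are available; with highly variable data and few replicates, any such test will have limited power.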

      Reviewer #2 (Public Review): 

      Summary: 

      simulation study of magnetotactic bacteria in microfluidic channels containing sediment-mimicking obstacles. The obstacles were produced based on micro-computer tomography reconstructions of bacteria-rich sediment samples. The swimming of bacteria through these channels is found experimentally to display the highest throughput for physiological magnetic fields. Computer simulations of active Brownian particles, parameterized based on experimental trajectories are used to quantify the swimming throughput in detail. Similar behavior as in experiments is obtained, but also considerable variability between different channel geometries. Swimming at strong field is impeded by the trapping of bacteria in corners, while at weak fields the direction of motion is almost random. The trapping effect is confirmed in the experiments, as well as the escape of bacteria with reducing field strength. 
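
      To make the trade-off between alignment and random reorientation concrete, here is a minimal sketch of the orientation dynamics of an active Brownian particle with a magnetic aligning torque. It is not the authors' simulation: it is 2D, omits obstacles, translation and wall interactions, and uses arbitrary dimensionless parameters. The names `align_rate` (standing in for field strength times magnetic moment divided by rotational friction) and `d_rot` (rotational diffusion coefficient) are assumptions introduced for illustration.

```python
import math
import random

def mean_alignment(align_rate, d_rot=1.0, dt=0.01, steps=2000,
                   n_particles=200, seed=0):
    """Average cos(theta) after relaxation, with the field along theta = 0.

    Orientation update (Euler-Maruyama):
        d(theta) = -align_rate * sin(theta) * dt + sqrt(2 * d_rot * dt) * N(0, 1)
    """
    rng = random.Random(seed)
    noise_amp = math.sqrt(2.0 * d_rot * dt)
    total = 0.0
    for _ in range(n_particles):
        theta = rng.uniform(-math.pi, math.pi)  # random initial heading
        for _ in range(steps):
            theta += -align_rate * math.sin(theta) * dt + noise_amp * rng.gauss(0.0, 1.0)
        total += math.cos(theta)
    return total / n_particles

weak_field = mean_alignment(align_rate=0.0)     # no field: headings stay random
strong_field = mean_alignment(align_rate=10.0)  # strong field: headings lock to the field
print(f"no field: {weak_field:.2f}, strong field: {strong_field:.2f}")
```

      In this toy picture, swimmers at zero field explore all directions, while strongly aligned swimmers have little freedom to turn away from the field direction, which is the ingredient that makes corner trapping plausible once obstacles are added.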

      Strengths: 

      This is a very careful and detailed study, which draws its main strength from the fruitful combination of the construction of novel microfluidic devices, their use in motility experiments, and simulations of active Brownian particles adapted to the experiment. Based on their results, the authors hypothesize that magnetotactic bacteria may have evolved to produce magnetic properties that are adapted to the geomagnetic field in order to balance movement and orientation in such crowded environments. They provide strong arguments in favor of such a hypothesis. 

      Weaknesses: 

      Some of the issues touched upon here have been studied also in other articles. It would be good to extend the list of references accordingly and discuss the relation briefly in the text. 

      We thank the reviewer for the constructive comments. We address the point concerning previous literature in the response to the recommendations below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Here follows a list of points the authors should address. 

      (1) Are additional experiments feasible to decrease the statistical noise present in Fig. 3c? At the very least, the authors should discuss the statistical significance of the results at 50 muT vis-a-vis 0 T. 

      See our response to Strengths/Weaknesses above

      (2) The experimental setup is not immediately clear. I think that adding a panel from Fig. S1 (or a sketch thereof) would help clarify, especially in relation to the entry zone and end zone. 

      We are not sure what you mean. Fig. 3A already contains exactly such a panel. We have, however, added another supplementary figure that shows an additional detailed view of the setup (Fig. S3). In addition, we revised several figures: we have replaced Fig. S1 with a better version and exchanged the schematic view of the obstacle channel in Fig. 1, removing the additional inlets that were not used in this study (also in Fig. 3A). Instead, we added a comment in Methods explaining their presence. Hopefully this makes the setup clear.

      (3) It should be also stated that there is no external flow imposed on the channel. 

      We have added such a statement in the description of the experiment (in section 2.2, Swimming of magnetotactic bacteria through sediment-mimicking obstacle channels).

      (4) Fig. 3c and Fig. 6c are seemingly showing the same quantity (or closely related ones). The authors should use the same symbol and give an explicit mathematical definition. 

      The two quantities are not exactly the same, as we cannot directly quantify the flux of bacteria through the channel in our experiments. On the one hand, we cannot track bacteria through the whole channel, on the other hand, the initial conditions are not exactly the same as in the simulations. In the simulations all bacteria start at the same time at the entrance to the channel. In the experiments, they enter from the inlet and do so at different times (pushing them in with fluid flow would be possible, but carries the risk of perturbing the results due to induced flow through the channel). We have added a new section in the Methods section that explains this difference and describes the procedure used to obtain the throughput from the experiments in detail. We have also added a corresponding comment in the Result section, where the simulations are compared with the experiments. 

      Minor issues: 

      - Figures have different styles that should be unified. For example, the panel labels sometimes have round brackets and sometimes they don't.

      See above

      - Page 6, (muCT) should have the Greek letter mu 

      Thanks, corrected.

      - Fig. 3a is not very clear; see my point 2 above. 

      See above

      Reviewer #2 (Recommendations For The Authors): 

      I have only a few comments and questions, which the authors should address: 

      (1) The observed exponential dependence of decay time on the "well" depth could be related to the exponential density distribution of active particles in a gravitational field, which has been derived previously. Might be interesting to discuss such a possible connection. 

      Thank you for the suggestion, the two cases are indeed somewhat analogous with behaviors reminiscent of thermal processes with an effective temperature. Such a description is however not generally possible (even for sedimentation, only some features are described). We plan to address in future work whether it can be made more quantitative in our case of escape from the corner traps. We have included a short discussion of the analogy in the section on trapping and escape. 

      (2) The authors should consider the following relevant references, and discuss them briefly in their manuscript:

      - Sedimentation, trapping, and rectification of dilute bacteria J Tailleur, ME Cates EPL 86, 60002 (2009) 

      - Human spermatozoa migration in microchannels reveals boundary-following navigation P Denissenko, V Kantsler, DJ Smith, J Kirkman-Brown Proc. Natl. Acad. Sci. USA 109, 8007-8010 (2012) 

      - Wall accumulation of self-propelled spheres J Elgeti, G Gompper Europhysics Letters 101, 48003 (2013) 

      - Wall entrapment of peritrichous bacteria: a mesoscale hydrodynamics simulation study SM Mousavi, G Gompper, RG Winkler Soft Matter 16 (20), 4866-4875 (2020) 

      - A Geometric Criterion for the Optimal Spreading of Active Polymers in Porous Media C Kurzthaler, S Mandal, T Bhattacharjee, H Löwen, SS Datta, HA Stone Nat. Commun. 12, 7088 (2021) 

      - Run-to-Tumble Variability Controls the Surface Residence Times of E. coli Bacteria G Junot, T Darnige, A Lindner, VA Martinez, J Arlt, A Dawson, WCK Poon, H Auradou, E Clement Phys. Rev. Lett. 128, 248101 (2022) 

      - Dynamics and phase separation of active Brownian particles on curved surfaces and in porous media P Iyer, RG Winkler, DA Fedosov, G Gompper Phys. Rev. Research 5, 033054 (2023) 

      We agree that there is a lot of literature on these aspects, specifically interaction of self-propelled objects with walls and motion of swimmers through porous media. We have slightly extended our overview of previous literature in the introduction and included most of these references.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1: 

      (1) Their results with human macrophages suggest that there are differences between murine and human macrophages in inflammasome-mediated restriction of STm growth. For example, Thurston et al. showed that in murine macrophages inflammasome activation controls the replication of mutant STm that aberrantly invades the cytosol, but only slightly limits replication of WT STm. In contrast, here the authors found that primed human macrophages rely on caspase-1, gasdermin D and ninjurin-1 to restrict WT STm. I wonder if the priming of the human macrophages in this study could account for the differences in these studies. Along those lines, do the authors see the same results presented in this study in the absence of priming the macrophages with Pam3CSK4? I think that determining whether the control of intracellular STm replication is dependent on priming is very important.

      We thank the Reviewer for their careful attention to our manuscript and for their thoughtful comments. We have addressed this question about the impact of priming by repeating the bacterial intracellular burden assays in unprimed WT and CASP1-/- THP-1 cells. We have added additional figures to the manuscript to address this: Figure 1 – Figure Supplement 3. Under unprimed conditions, CASP1-/- cells still harbored significantly higher bacterial burdens at 6 hpi and a significant fold-increase in bacterial CFUs compared to WT cells. These results suggest that the caspase-1-mediated restriction of intracellular Salmonella replication in human macrophages is independent of priming. 

      (2) Another difference with the Thurston et al. paper is the way that the STm inoculum was prepared - stationary phase bacteria that were opsonized. Could this also account for differences between the two studies rather than differences between murine and human macrophages in inflammasome-dependent control of STm?

      We thank the Reviewer for this excellent suggestion. To address this possibility, we repeated the bacterial intracellular burden assays in WT and CASP1-/- THP-1 cells using stationary phase bacteria. We infected WT and CASP1-/- THP-1 cells with stationary phase Salmonella, and we subsequently assayed for intracellular bacterial burdens. These data have now been added to the manuscript in Figure 1 – Figure Supplement 4. Interestingly, we did not observe any fold-change in the bacterial colony forming units in either the WT or the CASP1-/- THP-1 cells for the stationary phase Salmonella. These data indicate that by 6 hours post-infection, Salmonella do not replicate efficiently in human macrophages unless grown under SPI-1-inducing conditions. Furthermore, these results suggest that differences in how the Salmonella inoculum is prepared may contribute to the discrepancies between our study and previous studies, as noted by the Reviewer. 

      (3) The authors show that the pore-forming proteins GSDMD and Ninj1 contribute to control of STm replication in human macrophages. Is it possible that leakage of gentamicin from the media contributes to this control?

      Response: We thank the Reviewer for their insightful comment. We have addressed this question on the impact of gentamicin by repeating the bacterial intracellular burden assays using a lower concentration of gentamicin in combination with extensively washing the cells with RPMI media to remove the gentamicin. WT and CASP1-/- THP-1 cells were infected with WT Salmonella. Then, at 30 minutes post-infection, cells were treated with 25 μg/ml of gentamicin to kill any extracellular bacteria. At 1 hour post-infection (hpi), the cells were washed a total of five times with fresh RPMI to remove the gentamicin, and then the media was replaced with fresh media containing no gentamicin. In parallel, we also treated cells with 100 μg/ml of gentamicin at 30 minutes post-infection, washed the cells five times with fresh RPMI at 1 hpi to remove the gentamicin, and then replaced the media with fresh media containing 10 μg/ml of gentamicin. These data have now been included in the manuscript as Figure 1 – Figure Supplement 5. We observed similar levels of intracellular bacterial burdens at 1 hpi and 6 hpi and a fold-increase in bacterial colony forming units in CASP1-/- cells compared to WT cells across both gentamicin conditions, suggesting that gentamicin does not contribute to the intracellular control of Salmonella replication in human macrophages. Of note, we also tried repeating the bacterial intracellular burden assays without gentamicin, using only washes to remove extracellular bacteria at 1 hpi; however, under these experimental conditions, we observed high levels of extracellular Salmonella. Therefore, we relied on using a lower concentration of gentamicin to kill extracellular Salmonella in conjunction with extensive washing to remove the gentamicin for the remainder of the infection. 

      (4) One major question that remains to be answered is whether casp-1 plays a direct role in the intracellular localization of STm. If the authors quantify the percentage of vacuolar vs. cytosolic bacteria at early time points in WT and casp-1 KO macrophages, would that be the same in the presence and absence of casp-1? If so, then this would suggest that there is a basal level of bacterial-dependent lysis of the SCV and in WT macrophages the presence of cytosolic PAMPS trigger cell death and bacteria can't replicate in the cytosol. However, in the inflammasome KO macrophages, the host cell remains alive and bacteria can replicate in the cytosol.

      We thank this Reviewer for raising this important point. We have addressed this experimentally by quantifying the percentage of vacuolar vs. cytosolic Salmonella at 2 hpi in WT, NAIP-/-, and CASP1-/- THP-1 cells using a chloroquine (CHQ) resistance assay. These data have now been included in the manuscript in the new Figure 5A. The original subfigures of Figure 5 have consequently been rearranged. We did not observe any significant differences in vacuolar and cytosolic bacterial burdens at this early time point in WT, NAIP-/-, and CASP1-/- THP-1 cells. As noted by the Reviewer, these results suggest that the basal level of bacterial-dependent lysis of the SCV in human macrophages is not dependent on caspase-1 or NAIP. 
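      The burden comparisons in the responses above rest on two simple ratios, which can be sketched as follows. This is an illustrative sketch only, not the authors' analysis code: it assumes that the fold-change is the CFU count at the late time point divided by the count at the early time point, and that the cytosolic fraction in a CHQ resistance assay is the CFU count recovered with CHQ plus gentamicin divided by the count recovered with gentamicin alone. The function names and all numbers are hypothetical.

```python
def fold_change(cfu_late, cfu_early):
    """Fold-change in intracellular CFU between two time points (assumed: late / early)."""
    if cfu_early <= 0:
        raise ValueError("early CFU count must be positive")
    return cfu_late / cfu_early


def percent_cytosolic(cfu_chq_resistant, cfu_total):
    """Percentage of intracellular bacteria that are cytosolic.

    Assumption: CHQ accumulates in the vacuole and kills vacuolar bacteria,
    so CFU that survive CHQ treatment are counted as cytosolic.
    """
    if cfu_total <= 0:
        raise ValueError("total CFU count must be positive")
    return 100.0 * cfu_chq_resistant / cfu_total


if __name__ == "__main__":
    # Hypothetical CFU counts, chosen only to illustrate the arithmetic.
    print(fold_change(cfu_late=8e5, cfu_early=2e5))
    print(percent_cytosolic(cfu_chq_resistant=2.5e4, cfu_total=1e5))
```

      Expressing the burden as a ratio to the early time point normalizes for differences in initial uptake between cell lines, which is why a fold-change and a raw CFU count can lead to different conclusions.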

      Reviewer #3: 

      (1) The main weaknesses of the study are the inherent limitations of tissue culture models. For example, to study interaction of Salmonella with host cells in vitro, it is necessary to kill extracellular bacteria using gentamicin. However, since Salmonella-induced macrophage cell death damages the cytosolic membrane, gentamicin can reach intracellular bacteria and contribute to changes in CFU observed in tissue culture models (major point 1). This can result in tissue culture "artefacts" (i.e., observations/conclusions that cannot be recapitulated in vivo). For example, intracellular replication of Salmonella in murine macrophages requires T3SS-2 in vitro, but T3SS-2 is dispensable for replication in macrophages of the spleen in vivo (Grant et al., 2012).  

      We thank the Reviewer for their helpful comments and insightful suggestions. We have addressed some of the concerns about gentamicin in our response to Reviewer #1 above. To address the Reviewer’s concerns further, we have included language to acknowledge the limitations of our study based on the artefacts of tissue culture models in our Discussion section: “In this study, we utilized tissue culture models to examine intracellular Salmonella replication in human macrophages. These in vitro systems allow for precise control of experimental conditions and, therefore, serve as powerful tools to interrogate the molecular mechanisms underlying inflammasome responses and Salmonella replication in both immortalized and primary human cells. Still, there are limitations of tissue culture models, as they lack the inherent complexity of tissues and organs in vivo. To assess whether our findings reflect Salmonella dynamics in the mammalian host, it will be important to complement our studies and extend the implications of our work using approaches that model more complex systems, such as organoids or organ explant models co-cultured with immune cells, and in vivo techniques, such as humanized mouse models.”

      (2) In Figure 1: are increased CFU in WT vs CASP1-deficient THP-1 cells due to Caspase 1 restricting intracellular replication or due to Caspase-1 causing pore formation to allow gentamicin to enter the cytosol thereby restricting bacterial replication? The same question arises about Caspase-4 in Figure 2, where differences in CFU are observed only at 24h when differences in cell death also become apparent. The idea that gentamicin entering the cytosol through pores is responsible for controlling intracellular Salmonella replication is also consistent with the finding that GSDMD-mediated pore formation is required for restricting intracellular Salmonella replication (Figure 3). Similarly, the finding that inflammasome responses primarily control Salmonella replication in the cytosol could be explained by an intact SCV membrane protecting Salmonella from gentamicin (Figure 5). 

      We thank the Reviewer for highlighting this important point regarding gentamicin.

      We have addressed this question in our response above to Reviewer #1 and in Figure 1 – Figure Supplement 5. We observed caspase-1-mediated restriction of Salmonella in human macrophages even when cells were treated with a lower concentration of gentamicin (25 μg/ml) for 30 minutes and then extensively washed with RPMI media to remove any gentamicin for the remainder of the infection. These data suggest that gentamicin is likely not responsible for controlling intracellular Salmonella in human macrophages.

    1. Author response:

      We thank all three reviewers and the editors for their detailed comments on our manuscript.  The two main themes of this feedback concern the paper’s generality and its presentation.  Reviewers #2 and #3 raise questions about how the discrepancies in fitness statistics we report will be realized across organisms, environments, and in models with interactions beyond resource competition (e.g., toxicity or cross-feeding).  All reviewers and the editors have also expressed the need for the presentation to be improved, including a broader introduction to the concept of fitness (Reviewer #1), a clearer explanation of our model (Reviewer #1), better explanations of how quantifying fitness answers key biological questions (Reviewer #3), and improvements to the most technical sections to ensure accessibility to experimentalists (Reviewer #3).

      In light of these comments, we wish to clarify that the goal of this paper is to provide a proof-of-principle for how different choices in quantifying fitness can lead to different analysis outcomes.  Since the focus of this paper is on the theoretical concepts, we focus on a few example data sets and a simple model to demonstrate the existence of these discrepancies.  While other organisms and environments, especially with more complex growth dynamics and interactions, could certainly have additional or different discrepancies in fitness statistics, we believe the simplicity of our approach is valuable because it demonstrates that even basic features of microbial growth (common across systems) with realistic parameter values are sufficient to cause significant differences in fitness depending on these quantification choices.  We agree with the reviewers that a systematic documentation of how these fitness discrepancies are empirically realized is important, but we believe that question is best explored in separate future works that can focus fully on this empirical rather than theoretical question.

      We plan to revise the manuscript in several ways, following the suggestions of the three reviewers and the editor.  First, we will better articulate the main goal and conclusions of this manuscript, especially its generality and limitations.  Second, we will work to streamline and clarify several points in the main text identified by the reviewers to make it more accessible and useful to a broader audience, especially experimentalists who routinely measure fitness in their work.  We are grateful to the reviewers and the editor for their time and effort in assessing the manuscript, and we look forward to providing an updated version that addresses these concerns.

    1. Author response:

      Reviewer #1 (Public review):

      Li et al. investigate Ca2+ signaling in T. gondii and argue that Ca2+ tunnels through the ER to other organelles to fuel multiple aspects of T. gondii biology. They focus in particular on TgSERCA as the presumed primary mechanism for ER Ca2+ filling. However, when TgSERCA was knocked out, a Ca2+ release in response to TG was still present.

      Note that we did not knock out SERCA: it is an essential gene, so it would not be possible to isolate parasites that do not express SERCA. We created conditional mutants that downregulate the expression of SERCA, and some activity is still present in the mutant after 24 h of ATc treatment.

      Overall, the Ca2+ signaling data do not support the conclusion of Ca2+ tunneling through the ER to other organelles; in fact, they argue for direct Ca2+ uptake from the cytosol.

      The authors show EM membrane contact sites between the ER and other organelles, so Ca2+ released by the ER could presumably be taken up by other organelles but that is not ER Ca2+ tunneling.

      They clearly show that SERCA is required for T. gondii function.

      Overall, the data presented do not fully support the conclusions reached.

      We agree that the data does not support Ca2+ tunneling as defined and characterized in mammalian cells. In response to this comment, we modified the title and the text accordingly.

      However, we think that the study shows far more than just the role of SERCA in T. gondii functions. We argue that the study shows that the ER (through the activity of the SERCA pump) sequesters and re-distributes calcium to other organelles following influx through the PM. The experiments show that the ER is able to take up calcium from the cytosol as it enters the parasite, through SERCA activity, and that this activity is important for the transition of the parasite between various extracellular calcium exposures. We believe that the role of the ER in redistributing calcium following exposure to physiological levels of extracellular calcium is demonstrated in the experiments shown in Figs 1H-I, 4G-H and 5G-K. There are no previous T. gondii studies that address the question of how intracellular stores are filled with calcium; these stores are essential for the continuation of the lytic cycle, meaning they are essential for the parasitism of T. gondii.

      Data argue for direct Ca2+ uptake from the cytosol

      The ER most likely takes up calcium from the cytosol following its entry through the PM and redistributes it to the other organelles. We will delete the word “tunneling” and replace it with transfer and re-distribution as they represent our results.

      What we think is re-distribution is shown in Figure 1H and I, in which the calcium released after GPN and nigericin is enhanced after TG addition. Of note, there is no experimental evidence that supports the regulation of calcium entry by store depletion (PMID: 24867952), and we do not think that the enhanced response is due to calcium entry.

      Figure 4G and H show that knocking down SERCA reduces significantly the response to GPN. Fig 5I shows that the mitochondrial calcium uptake is reduced after the addition of GPN in the knockdown mutant. Fig 2B shows that SERCA can take up calcium at 55 nM calcium while mitochondrial uptake needs higher concentrations (Fig 5B-C). However, higher calcium concentrations could be reached at the microdomains formed around MCS between the ER and mitochondrion. Figure 5E shows that the mitochondrion is not responsive to an increase of cytosolic calcium. This is also shown for the apicoplast in Fig. 7 E and F of the Li et al, Nat Commun 2021 paper.

      Reviewer #2 (Public review):

      The role of the endoplasmic reticulum (ER) calcium pump TgSERCA in sequestering and redistributing calcium to other intracellular organelles following influx at the plasma membrane.

      The fact that T. gondii transitions through life cycle stages within and exterior to the host cells, with very different exposures to calcium, adds significance to the current investigation of the role of the ER in redistributing calcium following exposure to physiological levels of extracellular calcium.

      They also use a conditional knockout of TgSERCA to investigate its role in ER calcium store-filling and the ability of other subcellular organelles to sequester and release calcium. These knockout experiments provide important evidence that ER calcium uptake plays a significant role in maintaining the filling state of other intracellular compartments.

      We thank the reviewer.

      While it is clearly demonstrated, and not surprising, that the addition of 1.8 mM extracellular CaCl2 to intact T. gondii parasites preincubated with EGTA leads to an increase in cytosolic calcium and subsequent enhanced loading of the ER and other intracellular compartments, there is a caveat to the quantitation of these increases in calcium loading. The authors rely on the amplitude of cytosolic free calcium increases in response to thapsigargin, GPN, nigericin, and CCCP, all measured with fura2. This likely overestimates the changes in calcium pool sizes because the buffering of free calcium in the cytosol is nonlinear, and fura2 (with a Kd of 100-200 nM) is a substantial, if not predominant, cytosolic calcium buffer. Indeed, the increases in signal noise at higher cytosolic calcium levels (e.g. peak calcium in Figure 1C) are indicative of fura2 ratio calculations approaching saturation of the indicator dye.

      We agree about the limitations of using Fura2 but according to the literature (PMID:3838314, fig. 3) Fura2 is suitable for measurements between 100 nM and 1 mM calcium. The responses in our experiments were within its linear range and the experiments with the SERCA mutant and mitochondrial GCaMPs support the conclusions of our work.

      We agree that the experiment shown in Fig 1C shows a response close to the limit of the linear range of Fura2 and we can provide a more representative trace in the final article. We can include new quantifications and comparisons.
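      The saturation concern discussed above can be made concrete with the standard ratiometric calibration equation for fura-2 (Grynkiewicz, Poenie and Tsien, 1985), [Ca2+] = Kd * beta * (R - Rmin) / (Rmax - R). The sketch below is a minimal illustration and not the calibration used in this study: the Kd, Rmin, Rmax and beta values are hypothetical placeholders, and the function names are ours. It shows why the same amount of noise in the measured ratio R translates into a much larger error in the estimated calcium concentration as R approaches Rmax.

```python
def fura2_calcium_nM(ratio, kd_nM=200.0, r_min=0.3, r_max=8.0, beta=5.0):
    """Free [Ca2+] in nM from a 340/380 ratio using the standard ratiometric equation.

    All calibration constants are hypothetical defaults for illustration.
    """
    if not (r_min <= ratio < r_max):
        raise ValueError("ratio must lie in [r_min, r_max)")
    return kd_nM * beta * (ratio - r_min) / (r_max - ratio)


def calcium_sensitivity(ratio, kd_nM=200.0, r_min=0.3, r_max=8.0, beta=5.0):
    """d[Ca2+]/dR in nM per ratio unit: how strongly ratio noise is amplified."""
    if not (r_min <= ratio < r_max):
        raise ValueError("ratio must lie in [r_min, r_max)")
    return kd_nM * beta * (r_max - r_min) / (r_max - ratio) ** 2


if __name__ == "__main__":
    # Compare a ratio near the bottom of the range with ratios near saturation.
    for r in (1.0, 4.0, 7.0, 7.8):
        print(r, fura2_calcium_nM(r), calcium_sensitivity(r))
```

      Because the sensitivity grows as 1 / (Rmax - R)^2, traces that look noisy at peak calcium are consistent with the indicator approaching saturation, which is the point raised by the reviewer.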

      Another caveat, not addressed, is that loading of fura2/AM can result in compartmentalized fura2, which might modify free calcium levels and calcium storage capacity in intracellular organelles.

      We are aware of this issue and have therefore modified our protocol to minimize compartmentalization. We load cells for 26 min at room temperature, keep them on ice, and do not use them for longer than 2-3 hours, because beyond that we do see evidence of compartmentalization. One sign of compartmentalization is an increase in the resting calcium concentration.

      The finding that the SERCA inhibitor cyclopiazonic acid (CPA) only mobilizes a fraction of the thapsigargin-sensitive calcium stores in T. gondii coincides with previously published work in another apicomplexan parasite, P. falciparum, showing that thapsigargin mobilizes calcium from both CPA-sensitive and CPA-insensitive calcium pools (Borges-Pereira et al., 2020, DOI: 10.1074/jbc.RA120.014906). It would be valuable to determine whether this reflects the off-target effects of thapsigargin or the differential sensitivity of TgSERCA to the two inhibitors.

      This is an interesting observation, and we will discuss the result considering the Plasmodium study and include the citation. We will add inhibition curves using the MagFluo protocol and compare CPA and TG.

      Figure S1 suggests differential sensitivity, and it shows that thapsigargin mobilizes calcium from both CPA-sensitive and CPA-insensitive calcium pools in T. gondii. Also important is that we used 1 µM TG as we are aware that TG has shown off-target effects at higher concentrations. 

      The authors interpret the residual calcium mobilization response to Zaprinast observed after ATc knockdown of TgSERCA (Figures 4E, 4F) as indicative of a target calcium pool in addition to the ER. While this may well be correct, it appears from the description of this experiment that it was carried out using the same conditions as Figure 4A where TgSERCA activity was only reduced by about 50%.

      We partially agree. As pointed out by the reviewer, knockdown of TgSERCA by only 50% means that the ER could still be targeted by Zaprinast, so the residual response is not in itself evidence of another target calcium pool. The MagFluo4 experiment (although we are aware that the fluorescence of MagFluo4 is not linear with calcium) shows that there is SERCA activity after 24 h of ATc treatment. However, when adding Zaprinast after TG we see a significant release of calcium, which is true for both the wild type and the conditional knockdowns. Because of this result we proposed that there could be another large neutral calcium pool in addition to the one mobilized by TG. We will address these possibilities in the discussion and interpretation of the result.

      The data in Figures 4A vs 4G and Figures 4B vs 4H indicate that the size of the response to GPN is similar to that with thapsigargin in both the presence and absence of extracellular calcium. This raises the question of whether GPN is only releasing calcium from acidic compartments or whether it acts on the ER calcium stores, as previously suggested by Atakpa et al. 2019 DOI: 10.1242/jcs.223883. Nonetheless, Figure 1H shows that there is a robust calcium response to GPN after the addition of thapsigargin.

      The results of the experiments do not exclude the possibility that GPN can also mobilize some calcium from the ER in addition to acidic organelles. However, we do not have any evidence that GPN mobilizes calcium from the ER either. Based on our unpublished work, we think GPN mainly releases calcium from the PLVAC. We will include the mentioned citation and discuss the result considering the possibility that GPN may be acting on the ER.

      An important advance in the current work is the use of state-of-the-art approaches with targeted genetically encoded calcium indicators (GECIs) to monitor calcium in important subcellular compartments. The authors have previously done this with the apicoplast, but now add the mitochondria to their repertoire. Despite the absence of a canonical mitochondrial calcium uniporter (MCU) in the Toxoplasma genome, the authors demonstrate the ability of T. gondii mitochondria to accumulate calcium, albeit at high calcium concentrations. Although the calcium concentrations here are higher than needed for mammalian mitochondrial calcium uptake, there too calcium uptake requires calcium levels higher than those typically attained in the bulk cytosolic compartment. And just like in mammalian mitochondria, the current work shows that ER calcium release can elicit mitochondrial calcium loading even when other sources of elevated cytosolic calcium are ineffective, suggesting a role for ER-mitochondrial membrane contact sites. With these new tools in hand, it will be of great value to elucidate the bioenergetics and transport pathways associated with mitochondrial calcium accumulation in T. gondii.

      We thank this reviewer for his/her positive comment. Studies of the bioenergetics and transport pathways associated with mitochondrial calcium accumulation are part of our future plans.

      The current studies of calcium pools and their interactions with the ER and dependence on SERCA activity in T. gondii are complemented by super-resolution microscopy and electron microscopy that do indeed demonstrate the presence of close appositions between the ER and other organelles (see also videos). Thus, the work presented provides good evidence for the ER acting as the orchestrating organelle delivering calcium to other subcellular compartments through contact sites in T. gondii, as has become increasingly clear from work in other organisms.

      Thank you

      Reviewer #3 (Public review):

      This manuscript describes an investigation of how intracellular calcium stores are regulated and provides evidence that is in line with the role of the SERCA-Ca2+-ATPase in this important homeostasis pathway. Calcium uptake by mitochondria is further investigated and the authors suggest that ER-mitochondria membrane contact sites may be involved in mediating this, as demonstrated in other organisms.

      The significance of the findings is in shedding light on key elements within the mechanism of calcium storage and regulation/homeostasis in the medically important parasite Toxoplasma gondii whose ability to infect and cause disease critically relies on calcium signalling. An important strength is that despite its importance, calcium homeostasis in Toxoplasma is understudied and not well understood.

      We agree with the reviewer. Thank you

      A difficulty in the field, and a weakness of the work, is that following calcium in the cell is technically challenging and thus requires reliance on artificial conditions. In this context, the main weakness of the manuscript is the extrapolation of data. The language used could be more careful, especially considering that the way to measure the ER calcium is highly artificial - for example utilising permeabilization and over-loading the experiment with calcium. Measures are also indirect - for example, when the response to ionomycin treatment was not fully in line with the suggested model the authors hypothesise that the result is likely affected by other storage, but there is no direct support for that.

      The MagFluo protocol has been amply used in mammalian cells, DT40 cells and other cells for the characterization of the IP3 receptor response to IP3. We will include and discuss more citations in the revised article. The scheme at the top of the figure shows the protocol used. There is no overloading with calcium because the cells are permeabilized and the concentrations of calcium used are physiological: all experiments were performed at 220 nM calcium, which is within the cytosolic levels tolerated by cells. The experiment was done with permeabilized cells because permeabilization allows the indicator to become diluted, allows the substrate MgATP to reach the membrane of the ER, and in addition allows for the exposure to precise concentrations of calcium. MagFluo4 loading is intended for its compartmentalization to all intracellular compartments, and the uptake stimulated by MgATP occurs exclusively in the compartment occupied by SERCA. IO is an ionophore that causes calcium release from other stores in addition to the ER, and it is expected that this will result in a larger release. We must clarify that the experiment shown in Fig. 2 was done to characterize the activity of SERCA and was not aimed at the characterization of the role of SERCA in the parasite. We will explain this result better in the revised version of the article.

      Below we provide some suggestions to improve controls, however, even with those included, we would still be in favour of revising the language and trying to avoid making strong and definitive conclusions. For example, in the discussion perhaps replace "showed" with "provide evidence that are consistent with..."; replace or remove words like "efficiently" and "impressive"; revise the definitive language used in the last few lines of the abstract (lines 13-17); etc. Importantly we recommend reconsidering whether the data is sufficiently direct and unambiguous to justify the model proposed in Figure 7 (we are in favour of removing this figure at this early point of our understanding of the calcium dynamic between organelles in Toxoplasma).

      We thank the reviewer for the suggestions and will modify the language as suggested.

      Fig 7 is only a model and, like all models, could be incorrect. However, considering this reviewer’s criticism, we will replace the model with a simpler one that is less speculative.

      Another important weakness is poor referencing of previous work in the field. Lines 248-250 read almost as if the authors originally hypothesised the idea that calcium is shuttled between ER and mitochondria via membrane contact sites (MCS) - but there is extensive literature on other eukaryotes which should be first cited and discussed in this context. Likewise, the discussion of MCS in Toxoplasma does not include the body of work already published on this parasite by several groups. It is informative to discuss observations in light of what is already known.

      We added a citation following the sentence mentioned by the reviewer in lines 248-250 (corrected preprint) and will include more in the revised article. We cite several pertinent articles that describe MCS in Toxoplasma (lines 378-380, very few actually). We will make sure not to miss any new articles that could have been recently published. Note that our work is not about describing the presence of MCSs. We are showing transfer of calcium between the ER and mitochondria and we present evidence that supports that it happens through MCSs.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      - Summary: 

      Recordings were made from the dentate nucleus of two monkeys during a decision-making task. Correlates of stimulus position and stimulus information were found to varying degrees in the neuronal activities. 

      We agree with this summary.

      - Strengths: 

      A difficult decision-making task was examined in two monkeys.

      We agree with this statement.

      - Weaknesses: 

      One of the monkeys did not fully learn the task. The manuscript lacked a coherent hypothesis to be tested, and no attempt was made to consider the possibility that this part of the brain may have little to do with the task that was being studied. 

      We understand the reviewer's concern. It is correct that one of the monkeys (Mi) did not perform at a high level, but it should be noted that both monkeys performed significantly above chance level. Therefore, we would argue that both monkeys in fact did learn the task but Mi’s performance was suboptimal. This difference in the performance levels gave us a rare opportunity to dive deeper into the reasons why some animals perform better than others, and we show that Mi (the lower performing monkey) paid more attention to the outcome of the previous trial – this is evident from our behavioural and decoding models.

      We tested the overall hypothesis that neurons of the dentate nucleus can dynamically modulate their activity during a visual attention task, comprising not only sensorimotor but also cognitive attentional components. Many neurons in the dentate are multimodal (Figure 3C-D), as had previously been theorized. One of the specific hypotheses that we tested is that dentate cells can be direction-selective for both the sensorimotor and the cognitive components. Given that many of the recorded cells showed direction-selectivity in their firing rate modulation for gap directions and/or stimulus directions, we provide strong evidence that this hypothesis is correct. We have now spelled out this hypothesis more explicitly in the introduction of the revised version. We now also explain better why we tested this specific hypothesis. Indeed, earlier studies in primates such as those by Herzfeld and colleagues (2018, Nat. Neuro.) and van Es and colleagues (2019, Current Biol) have indicated that direction-selectivity of cerebellar activity may occur in various sensorimotor domains.

      We also appreciate the Reviewer's comment that our original submission did not show any attempt to consider the possibility that this part of the brain may have little to do with the task that was being studied. We did in fact consider this possibility, in that we successfully injected 3 ml of muscimol (5 μg/ml, Sigma Aldrich) into the dentate nucleus in vivo in one of the monkeys (Mo). This application resulted in a reduction of more than 10% in correct responses in the covert attention task after 45 minutes, whereas performance remained the same following saline injections. Unfortunately, due to the timing of the experiments and COVID-19-related laboratory restrictions, we were unable to perform these experiments in the other monkey or repeat them in Mo. We aim to replicate this in future experiments and publish it when we have full datasets of at least two monkeys available. For this paper we have prioritized our tracing experiments, highlighting the connections of the dentate nucleus with attention-related areas in brainstem and cortex in both monkeys, following perfusion.

      - Perhaps the large differences in performance between the two subjects can be used as a way to interpret the neural data's relationship to behavior, as it provided a source of variance. This is what we would hypothesize if we believed that this area of the brain is playing a significant role in the task. If one animal learns much more poorly, and this region of the brain is important for that behavior, then shouldn't there be clear, interpretable differences in the neural data? 

      We thank the Reviewer for this comment. We have added a new Supplementary Figure 2 to the revised manuscript, in which we present the data for both monkeys separately. Comparing the two datasets, however, we see more commonalities related to the significant learning in both monkeys than differences that might be related to their different levels of learning. We have therefore decided to show the different datasets transparently in the new Supplementary Figure 2, but to stay on the conservative side in our interpretations.

      - How should we look for these differences? A number of recent papers in mice have uncovered a large body of data showing that during the deliberation period, when the animal is interpreting a sensory stimulus (often using the whisker system), there is ramping activity in a principal component space among neurons that contribute to the decision. This ramping activity is present (in the PCA space) in the motor areas of the cortex, as well as in the medial and lateral cerebellar nuclei. Perhaps a similar computational approach would benefit the current manuscript. 

      We also appreciate this point. We have done the principal component analysis accordingly, and we indeed do find the ramping activity in several components of the dentate activity of both monkeys (Mi and Mo). We have now added a new Supplementary Figure 3 with the first three components of both correct and incorrect trials for Mi and Mo, highlighting their potential contribution.

      - What is the hypothesis that is being tested? That is, what do you think might be the function of this region of the cerebellum in this task? It seems to me that we are not entirely in the dark, as previous literature on mice decision-making tasks has produced a reasonable framework: the deliberation period coincides with ramping activity in many regions of the frontal lobe and the cerebellum. Indeed, the ramp in the cerebellum appears to be a necessary condition for the ramp to be present in the frontal lobe. Thus, we should see such ramping activity in this task in the dentate. When the monkey makes the wrong choice, the ramp should predict it. If you don't see the ramping activity, then it is possible that the hypothesis is wrong, or that you are not recording from the right place. 

      It is indeed one of our specific hypotheses that the dentate cells can be direction-selective for the preparatory cognitive component and/or the sensorimotor response. We provide evidence that this hypothesis may be correct when we analyze the regular time response curves (see Figure 2 and the new Supplementary Figure 2, where the data of both monkeys are now presented separately). Moreover, we have now verified this by analysing the ramping curves in PCA space (new Supplementary Figure 3) and the firing frequency of DN neurons that modulated upon presentation of the C-stimulus (new Supplementary Figure 4). These figures and findings are now referred to in the main text.

      - As this is a difficult task that depends on the ability of the animals to understand the meaning of the cues, it is quite concerning that one of the monkeys performed poorly, particularly in the early sessions. Notably, the disparity between the two subjects is rather large: one monkey at the start of the recordings achieved a performance that was much better than the second monkey did at the end of the recording sessions. You highlighted the differences in performance in Figure 1D and mentioned that you started recording once the animals reached 60% performance. However, this did not make sense to me as the performance of Mi even after the final day of recording did not reach the performance of Mo on the first day of recording. Thus, in contrast to Mo, Mi appeared to be not ready for the task when the recording began.

      We understand this point. However, please note that the learning performance of the monkeys concerned retraining sessions after they had had several weeks of vacation. So, even though it is correct that one of the two monkeys had a very good consolidation and started already at a relatively high level on the first retraining session, the other one also started and ended at a level above chance level (the y-axis starts at 0.5). We now highlight this point better in the Results section.

      - One objective of having two monkeys is to illustrate that what is true in one animal is also true in the other. In some figures, you show that the neural data are significantly different, while in others you combine them into one. Thus, are you confident that the neural data across the animals should be combined, as you have done in Figure 2? Perhaps you can use the large differences in performance as a source of variance to find meaning in the neural data. 

      This is a valid question; as highlighted above, we have now addressed this point in the new Supplementary Figure 2, where the data for both monkeys are presented separately. Given the sample sizes and level of variance, it is in general difficult to draw conclusions about the potential differences and contributions, but the data are sufficiently transparent to observe common trends. With regard to linking differences in the neural data to the differences in performance level, please also consider Figure 4, the new Supplementary Figure 3 (with the ramping PCA component) and the new Supplementary Figure 4 (with the additional analysis of the ramping activity of DN neurons that modulated upon presentation of the C-stimulus), which suggest that the ramping stage of Mo starts before that of Mi. This difference highlights the possibility that injecting accelerations of the simple spike modulations of Purkinje cells in the cerebellar hemispheres into the complex of cerebellar nuclei may be instrumental in improving the performance of responses to covert attention, akin to what has been shown for the impact of Purkinje cells of the vestibulocerebellum on eye movement responses to vestibular stimulation (De Zeeuw et al. 1995, J Neurophysiol). This possibility is now also raised in the Discussion.

      - How do we know that these neurons, or even this region of the brain, contribute to this task? When a new task is introduced, the contributions of the region of the brain that is being studied are usually established via some form of manipulation. This question is particularly relevant here because the two subjects differed markedly in their performance, yet in Figure 3 you find that a similar percentage of neurons are responding to the various elements of the task.

      We appreciate this question. As highlighted above, we are refraining from showing our muscimol manipulation (3 ml of 5 μg/ml muscimol, Sigma Aldrich), as it only concerns 1 successful dataset and 1 control experiment. We hope to replicate this reversible lesion experiment in the future and publish it when we have full new datasets of at least two monkeys available. As explained above, for this paper we have sacrificed both monkeys following a timed perfusion, so as to have similar survival times for the transport of the neuro-anatomical tracer involved.  

      - Behavior in both animals was better when the gap direction was up/down vs. left/right. Is this difference in behavior encoded during the time that the animal is making a decision? Are the dentate neurons better at differentiating the direction of the cue when the gap direction is up/right vs. left/right? 

      These data have now been included in the new Supplementary Figure 2; we did not observe any significant differences in this respect.

      Reviewer #2:

      - The authors trained monkeys to discriminate peripheral visual cues and associate them with planning future saccades of an indicated direction. At the same time, the authors recorded single-unit neural activity in the cerebellar dentate nucleus. They demonstrated that substantial fractions of DN cells exhibited sustained modulation of spike rates spanning task epochs and carrying information about stimulus, response, and trial outcome. Finally, tracer injections demonstrated this region of the DN projects to a large number of targets including several known to interconnect the visual attention network. The data compellingly demonstrate the authors' central claims, and the analyses are well-suited to support the conclusions. Importantly, the study demonstrates that DN cells convey many motor and nonmotor variables related to task execution, event sequencing, visual attention, and arguably decision-making/working memory. 

      We thank the Reviewer for this positive and constructive feedback.

      - The study is solid and I do not have major concerns, but only points for possible improvement. 

      We thank the Reviewer for this positive feedback.

      - A key feature of this data is the extended changes/ramps in DN output across epochs (Figure 2). Crudely, this presents a challenge for the view that DN output mainly drives motor effectors, as the saccade itself lasts only a tiny fraction of the overall task. Some discussion of this dichotomy in thinking about the function(s) of the cerebellum, vis a vis the multifarious DN targets the authors demonstrate here, etc., would be helpful. 

      We agree with the Reviewer and we have expanded our Discussion on this point, also now highlighting the outcome of the new PCA analysis recommended by Reviewer 1 (see the new Supplementary Figure 3).

      - A high-level suggestion on the data: the presentation of the data focuses (sensibly) on the representation of the stimulus and response epochs (Figures 2-3). Yet, the authors then show that from decoding, it is, in fact, a trial outcome that is best represented in the population (Figure 4). While there is nothing 'wrong' with this, it reads slightly incongruously, and the reader does a bit of a "double take" back to the previous figures to see if they missed examples of the trial-outcome signals, but the previous presentations only show correct trials. Consider adding somewhere in the first 3 main figures some neural data showing comparisons with incorrect trials. This way, the reader develops prior expectations for the outcome decoding result and frame of reference for interpreting it. On a related note, the text contains an earlier introduction of this issue (p24 last sentence) and p25 paragraph 1 cites Figure 3D and 3E for signals "related to the absence of reward" - but the caption says this includes only correct trials? 

      We thank the Reviewer for bringing up these points. We have addressed the textual suggestions. Moreover, we have done the PCA analysis suggested by Reviewer 1 for both the correct and incorrect trials (see Supplementary material).

      - P29: The discrepancy in retrograde labeling between monkeys (2 orders of magnitude): I realize the authors can't really do anything about this, but the difference is large enough to warrant concerns in the interpretation (how did the tracer spread over the drastically larger area? Isotropically? Could it cross more "hard boundaries" and incorporate qualitatively different inputs/outputs?). A small discussion of possible caveats in interpreting the outcomes would be helpful. 

      We fully agree with this comment. As highlighted in the text, in both monkeys we first identified the optimal points for injection in the dentate nucleus electrophysiologically and we used the same pump with the same settings to carry out the injections, but even so the differences are substantial. We suspect that the larger injection might have been caused by an air bubble trapped in the syringe or a deviation in the stock solution, but we can never be sure of that. We have added a potential explanation for the caveat that might have played a role.

      - And a list of quick points: 

      We have addressed all points listed below; we want to thank the Reviewer for bringing them up.

      P3 paragraph 2 needs comma "in daily life,". 

      P4 paragraph 2 "C-gap" terminology not previously defined. 

      P4 paragraph 2 "animals employed different behavioral strategies". Grammatically, you should probably say "each animal employed a different behavioral strategy," but also scientifically the paragraph doesn't connect this claim to anything about the DN (whereas, e.g., the abstract does make this connection clear). 

      P5 paragraph 1 "theca" should be "the". 

      P6 paragraph 1 problem with ignashenkova citation insert. 

      P10 paragraph 1 I think the spike rate "difference between highest and lowest" is not exactly the same as "variance," you might want to change the terminology. 

      P10 paragraph 1 should probably say "To determine if a cell preferentially modulated". 

      P10 paragraph 1 last sentence the last clause could be clearer. 

      P17 paragraph 2 should be something like "as well as those by Carpenter and..."? 

      P20 caption: consider "...directionality in the task: only one C-stim...". 

      P20 caption: consider "to the left and right in the [L/R] task...to the top/bottom in the [U/D] task". 

      Fig1E and S1 - is there a physical meaning of the "weight" unit, and if none, can this be transformed into a more meaningful unit? 

      P21 paragraph 1 consider "activity was recorded for 304 DN neurons...". 

      P21 paragraph 1 "correlations with the temporal windows" it's not clear how activity can "correlate" with a time window, consider rephrasing (activity levels changed during these time epochs, depending on stimulus identity). 

      P21 paragraph 1 should be "by comparing the number of spikes in a bin...". 

      P22 paragraph 2 "when we aligned the neurons to the time of maximum change" needs clarification. The maximum change of what? And per neuron? Across the population? 

      P22 paragraph 2 "than that of the facilitating" should be "than did the facilitating units". 

      P24 paragraph 1 needs a comma and rewording "Within each direction, trials are sorted by the time of saccade onset". 

      P24 paragraph 1 should probably say "Same as in G, but for suppressed cells". 

      P24 paragraph 2 should say "more than one task event" not "events". 

      P24 paragraph 2 needs a comma "To fully characterize the neural responses, we fitted". 

      P25 paragraph 1 should probably say "we sampled from similar populations of DN". 

      P34 paragraph 3 consider rephrasing the sentence that contains both "dissociation" and "dissociate". 

      P37 last line: consider "coordination of cerebellum and cerebral cortex *in* higher order mental..."? 

      P38 paragraph 1 citation needed for "kinematics of goal-directed hand actions of others"? 

      P38 paragraph 1 commas probably not needed "map visual input, from high-level visual regions, onto..." 

      References

      - Herzfeld D.J., Kojima Y, Soetedjo R, Shadmehr R (2018) Encoding of error and learning to correct that error by the Purkinje cells of the cerebellum. Nat Neurosci 21:736–743.

      - van Es DM, van der Zwaag W, Knapen T (2019) Topographic Maps of Visual Space in the Human Cerebellum. Curr Biol 29(10):1689–1694.e3.

      - De Zeeuw CI, Wylie DR, Stahl JS, Simpson JI (1995) Phase relations of Purkinje cells in the rabbit flocculus during compensatory eye movements. J Neurophysiol 74(5):2051–2064. doi: 10.1152/jn.1995.74.5.2051.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The pituitary gonadotropins, FSH and LH, are critical regulators of reproduction. In mammals, synthesis and secretion of FSH and LH by gonadotrope cells are controlled by the hypothalamic peptide, GnRH. As FSH and LH are made in the same cells in mammals, variation in the nature of GnRH secretion is thought to contribute to the differential regulation of the two hormones. In contrast, in fish, FSH and LH are produced in distinct gonadotrope populations and may be less (or differently) dependent on GnRH than in mammals. In the present manuscript, the authors endeavored to determine whether FSH may be independently controlled by a distinct peptide, cholecystokinin (CCK), in zebrafish.

      Strengths:

      The authors demonstrated that the CCK receptor is enriched in FSH-producing relative to LH-producing gonadotropes, and that genetic deletion of the receptor leads to dramatic decreases in gonadotropin production and gonadal development in zebrafish. Also, using innovative in vivo and ex vivo calcium imaging approaches, they show that LH- and FSH-producing gonadotropes preferentially respond to GnRH and CCK, respectively. Exogenous CCK also preferentially stimulated FSH secretion ex vivo and in vivo.

      Weaknesses:

      The concept that there may be a distinct FSH-releasing hormone (FSHRH) has been debated for decades. As the authors suggest that CCK is the long-sought FSHRH (at least in fish), they must provide data that convincingly leads to such a conclusion. In my estimation, they have not yet met this burden. In particular, they show that CCK is sufficient to activate FSH-producing cells, but have not yet demonstrated its necessity. Their one attempt to do so was using fish in which they inactivated the CCK receptor using CRISPR-Cas9. While this manipulation led to a reduction in FSH, LH was affected to a similar extent. As a result, they have not shown that CCK is a selective regulator of FSH.

      Our conclusion regarding the necessity of CCK signaling for FSH secretion is based on the following evidence:

      (1) CCK-like receptors are expressed in the pituitary gland predominantly on FSH cells.

      (2) Application of CCK to pituitaries elicits FSH cell activation and, to a much lesser degree, activation of LH cells (calcium imaging assays).

      (3) Application of CCK to pituitaries, as well as injection in vivo, significantly increased only FSH release.

      (4) Mutating the FSH-specific CCK receptor in a different species of fish (medaka) also causes a complete shutdown of FSH production and phenocopies a fsh-mutant phenotype (Uehara, Nishiike et al. 2023).

      Taken together, we believe that these data strongly support the conclusion that CCK is necessary for FSH production and release from the fish pituitary. Admittedly, the overlapping effects of CCK on both FSH and LH cells in zebrafish (evident in both our calcium imaging experiments and especially in the KO phenotype) complicate the interpretation of the phenotype. We speculate that the effect of CCK on LH cells in zebrafish can be caused either by paracrine signaling within the gland or by the effects of CCK on GnRH neurons, which were shown to express CCK receptors.

      In the current version, we emphasize that CCK also induces LH secretion. Although it does not affect LH to the same extent as FSH, an overlap does exist. This is mentioned in the abstract and discussion.

      Moreover, they do not yet demonstrate that the effects observed reflect the loss of the receptor's function in gonadotropes, as opposed to other cell types.

      Although there is evidence for the expression of the CCK receptor in other tissues, we do show a direct decrease of FSH and LH expression in the gonadotrophs of the pituitary of the mutant fish. Taken together with the receptor's significantly higher expression in FSH cells than in the rest of the pituitary cells in the cell-specific transcriptomic data, loss of receptor function in gonadotrophs is the most reasonable explanation for the mutant phenotype.

      Unfortunately, unlike in mice, technologies for conditional knockout of genes in specific cell types are not yet available for our model and cell types. The tissue distribution of the three CCK receptor types was added in supplementary figure 1. From this tissue distribution, it can be appreciated that only CCKBRA (our identified CCK receptor) is expressed in the pituitary, while in other tissues it is either not expressed or expressed together with the additional CCK receptors, which can compensate for its activity.

      It also is not clear whether the phenotypes of the fish reflect perturbations in pituitary development vs. a loss of CCK receptor function in the pituitary later in life. Ideally, the authors would attempt to block CCK signaling in adult fish that develop normally. For example, if CCK receptor antagonists are available, they could be used to treat fish and see whether and how this affects FSH vs. LH secretion.

      While the observed gonadal phenotype of the KO (sex-inversed fish) should have a developmental origin, since it requires a long time to manifest, the effect of the KO on FSH and LH cells is probably more acute. Unfortunately, a specific antagonist that affects only CCKRBA and not the other CCK receptors has not yet been identified.

      In the Discussion, the authors suggest that CCK, as a satiety factor, may provide a link between metabolism and reproduction. This is an interesting idea, but it is not supported by the data presented. That is, none of the results shown link metabolic state to CCK regulation of FSH and fertility. Absent such data, the lengthy Discussion of the link is speculative and not fully merited.

      In the revised manuscript, we provided data linking CCK with metabolic status in supplementary figure 1 and modified the discussion to tone down the link between metabolic status and reproductive state.

      Also in the Discussion, the authors argue that "CCK directly controls FSH cells by innervating the pituitary gland and binding to specific receptors that are particularly abundant in FSH gonadotrophs." However, their imaging does not demonstrate innervation of FSH cells by CCK terminals (e.g., at the EM level).

      Innervation of the fish pituitary does not imply a synaptic-like connection between axon terminals and endocrine cells. In fact, such connections are extremely rare, and their functionality is unclear. Instead, the mode of regulation between hypothalamic terminals and endocrine cells in the fish pituitary is more similar to "volume transmission" in the CNS, i.e. peptides are released into the tissue and carried to their endocrine cell targets by the circulation or via diffusion. A short explanation was added in lines 395-398 of the discussion.

      Moreover, they have not demonstrated the binding of CCK to these cells. Indeed, no CCK receptor protein data are shown.

      Our revised manuscript includes detailed experiments showing the activation of the receptor by its homologous ligand. Supplementary Figure 1 includes a transactivation assay of CCK on its receptor and the effect of the different mutations on receptor activation. Unfortunately, no antibody is available against this fish-specific receptor (one of the caveats of working with fish models); therefore, we cannot present receptor protein data.

      The calcium responses of FSH cells to exogenous CCK certainly suggest the presence of functional CCK receptors therein; but, the nature of the preparations (with all pituitary cell types present) does not demonstrate that CCK is acting directly in these cells.

      We agree with the reviewer that there are some disadvantages in choosing to work with a whole-tissue preparation. However, we believe that the advantages of working in a more physiological context far outweigh the drawbacks as it reflects the natural dynamics more precisely. Since our transcriptome data, as well as our ISH staining, show that the CCK receptor is exclusively expressed in FSH cells, it is improbable that the observed calcium response is mediated via a different pituitary cell type.

      Indeed, the asynchrony in responses of individual FSH cells to CCK (Figure 4) suggests that not all cells may be activated in the same way. Contrast the response of LH cells to GnRH, where the onset of calcium signaling is similar across cells (Figure 3).

      The difference between the synchronization levels of LH and FSH cell activity stems from the gap-junction-mediated coupling between LH cells, which does not exist between FSH cells (Golan, Martin et al. 2016). Therefore, the onset of the calcium response in FSH cells is dependent on the irregular diffusion rate of the peptide within the preparation, whereas the tight homotypic coupling between LH cells generates a strong and synchronized calcium rise that propagates quickly throughout the entire population.

      The differences in connectivity between LH and FSH cells are mentioned in lines 194-195.

      Finally, as the authors note in the Discussion, the data presented do not enable them to conclude that the endogenous CCK regulating FSH (assuming it does) is from the brain as opposed to other sources (e.g., the gut).

      We agree with the reviewer that, for now, we are unable to determine whether hypothalamic or peripheral CCK are the main drivers of FSH cells. While the strong innervation of the gland by CCK-secreting hypothalamic neurons strengthens the notion of a hypothalamic-releasing hormone and also fits with the dogma of the neural control of the pituitary gland in fish (Ball 1981), more experiments are required to resolve this question.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript builds on previous work suggesting that the CCK peptide is the releasing hormone for FSH in fishes, which is different than that observed in mammals where both LH and FSH release are under the control of GnRH. Based on data using calcium imaging as a readout for stimulation of the gonadotrophs, the researchers present data supporting the hypothesis that CCK stimulates FSH-containing cells in the pituitary. In contrast, LH-containing cells show a weak and variable response to CCK but are highly responsive to GnRH. Data are presented that support the role of CCK in the release of FSH. Researchers also state that functional overlap exists in the potency of GnRH to activate FSH cells, thus the two signalling pathways are not separate. The results are of interest to the field because for many years the assumption has been that fishes use the same signalling mechanism. These data present an intriguing variation where a hormone involved in satiation acts in the control of reproduction.

      Strengths:

      The strengths of the manuscript are that researchers have shed light on different pathways controlling reproduction in fishes.

      Weaknesses:

      Weaknesses are that it is not clear if multiple ligand/receptors are involved (more than one CCK and more than one receptor?). The imaging of the CCK terminals and CCK receptors needs to be reinforced.

      Reviewer consultation summary: 

      The data presented establish sufficiency, but not necessity of CCK in FSH regulation. The paper did not show that CCK endogenously regulates FSH in fish. This has not been established yet.

      This is a very important comment, also raised by reviewer 1. To avoid repetition, please see our detailed response to the comment above.

      The paper presents the pharmacological effects of CCK on ex vivo preparations but does not establish the in vivo physiological function of the peptide. The current evidence for a novel physiological regulatory mechanism is incomplete and would require further physiological experiments. These could include the use of a CCK receptor antagonist in adult fish to see the effects on FSH and LH release, the generation of a CCK knockout, or cell-specific genetic manipulations.

      As detailed in the responses to the first reviewer, we cannot conduct conditional, cell-specific gene knockout in our model. However, we did conduct a KO and show the direct effect on FSH and LH secretion, together with a physiological characterisation of the mutant.

      Zebrafish have two CCK ligands: ccka, cckb and also multiple receptors: cckar, cckbra and cckbrb. There is ambiguity about which CCK receptor and ligand are expressed and which gene was knocked out.

      In the revised manuscript, we clarified which of the receptors is expressed (CCKRBA) and which receptor is targeted. We also provided data showing the specificity of the receptors (both WT and mutant) to the ligands. Supplementary Figure 1 shows receptor cross-activation. The Methods also specify the exact NCBI ID numbers of the targeted receptor and the antibody used for the immunostaining.

      Blocking CCK action in fish (with receptor KO) affects FSH and LH. Therefore, the work did not demonstrate a selective role for CCK in FSH regulation in vivo and any claims to have discovered FSHRH need to be more conservative.

      We agree with the reviewer that the overlap in the effect of CCK, measured both in the calcium activation of cells and in the KO model, does not allow us to conclude selectivity. In this context, it is crucial to highlight that CCKRBA exhibits high expression on FSH cells but not on LH cells. Therefore, the effect of CCK on LH cells is likely paracrine or mediated through GnRH neurons, which were shown to express CCK receptors. In the current version, we emphasize that CCK also induces LH secretion. Although it does not affect LH to the same extent as FSH, an overlap does exist. This is mentioned in the abstract and discussion.

      The labelling of the terminals with anti-CCK looks a lot like the background and the authors did not show a specificity control (e.g. anti-CCK antibody pre-absorbed with the peptide or anti-CCK in morphant/KO animals).

      Figure colours have been updated to better visualise the specific staining of the antibody. Also, the same antibody has previously been used to mark CCK-positive cells in the gut of the red drum fish (Webb, Khan et al. 2010), where a control experiment (pre-absorption with the peptide) was conducted.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Abstract:

      The authors have not yet established that CCK is the primary regulator of FSH in vivo.

      In the new version, we highlight the leading effect of CCK on the reproductive axis, which includes FSH and LH.

      Introduction:

      The authors need to make clear earlier in the Introduction that fish have two types of gonadotropes. This information comes too late (last paragraph) currently.

      Added in line 42

      They should discuss relevant data on the differential regulation of FSH and LH in fish, as a rationale for looking for different releasing factors.

      This has been discussed in the first paragraph of the introduction

      In the last sentence of the penultimate paragraph, the authors assume that it must be a hypothalamic factor that regulates FSH. Why is this necessarily the case? Are there data indicating that a hypothalamic factor is required for FSH production in fish?

      This is mentioned in the discussion: we do not deny that circulating CCK or CCK from other brain areas might affect FSH secretion in the pituitary (lines 402-404). However, as the hypothalamus serves as the main gateway from the brain to the pituitary and contains hypophysiotropic CCK neurons, it is the most reasonable assumption.

      Results:

      In the first paragraph, the authors reference three types of CCK receptors, only one of which is expressed in the pituitary. The specific receptor should be named here.

      The receptor name and NCBI ID have been added in this paragraph.

      Figure 1: What specificity controls were used for the ISH in Figure 1?

      HCR, the method used to identify RNA expression, was developed by Molecular Instruments (https://www.molecularinstruments.com/hcr-rnafish-protocols) and does not require the specificity controls used with older ISH methods. The use of multiple short probes ensures specificity to the target RNA. Moreover, the expression is specific to the targeted cells.

      In Figure 1D, the red square is missing in the KO fish (at low magnification).

      This was fixed in the updated version.

      In Figure 1G, the number of dots does not correspond to the number of animals described in the figure legend. Does each point represent an animal?

      Each dot represents a fish. The order of the numbers in the legend did not match the order in the graph; this has been fixed in the latest version.

      Figure 2A: It is not clear that all FSH (GFP) cells are double-labeled. Should all double-labeled cells appear white? Many appear as green. Some quantification of the proportion of co-labeling is needed. Also, the scale bars are too small to read. Perhaps add the size of the scale bars to the legend.

      They are all double-labeled, as can be seen in the single-color images. Since GFP fluorescence is stronger than RCaMP fluorescence, the double-labeled cells may appear green. A scale bar was added.

      Figure 2C: Is the synchronous activity of LH cells here dependent on endogenous GnRH? Can these events be blocked with a GnRH receptor antagonist?

      We currently do not have enough data to support this hypothesis, and the in vivo two-photon system is not optimal for answering these questions, since these are spontaneous events that are difficult to predict. This is the main reason we moved to an ex vivo system. The similar response we observe when applying GnRH in the ex vivo system supports the idea that these events reflect GnRH activation.

      Figure 4C: As some LH cells respond to CCK, can the authors really claim that CCK is a selective regulator of FSH? What explains the heterogeneity in the response of LH cells to CCK?

      In this version, we highlight that CCK directly activates FSH cells but also affects LH cells to some extent. However, it is clear that the effect on FSH cells is more significant.

      Figures 5A and B: With larger Ns, some of the trends might be significant (e.g., GnRH stimulated FSH release and CCK stimulated LH release).

      Though there is a trend, the values on the Y axis reveal that the response of FSH to GnRH and of LH to CCK is lower than the spread of the basal values (the "before" measurements) in all of the graphs. Hence, we do not believe a larger N would affect those results. We added the range of the secreted hormone concentrations to the description of the results to emphasize the difference in values.

      Figures 5C and D: What explains the lack of an increase in LH secretion following GnRH treatment?

      We did not measure LH secretion in the plasma as we did not have enough blood; we do see an increase in LH transcription (see supplementary figure 5 – figure supplement 1).

      Also, as mRNA levels were measured (in C), reference should be made to expression rather than transcription. Not all changes in mRNA levels reflect changes in transcription. Also, remove transcription from the legend. Reference to supplementary Figure 4 in the legend should be supplementary Figure 6. Finally, in C and D, distinguish males from females (as in 5A and B).

      Modifications have been made according to the reviewer's suggestions.

      Figure legends:

      The figure legends are very long. One way to shorten them is to remove descriptions of the results. The legends should indicate what is in each figure, not the results of the experiments.

      Modifications have been made according to the reviewer's suggestions.

      Sample sizes should be spelled out in the legends, as they are not in the M&M.

      We made sure all sample sizes are stated in the legends.

      Materials and Methods:

      Section 1.1 can be removed as it repeats content presented elsewhere.

      This section was removed

      Section 1.5: It is unclear what this means: "blinding was not applied to ensure tractability" Please clarify.

      This section was removed

      Reviewer #2 (Recommendations For The Authors):

      It appears that zebrafish have two ligands: ccka, cckb. Also multiple receptors: cckar, cckbra and cckbrb. Authors need to discuss this and clearly state which ligand and which receptor they are referring to in the manuscript.

      We discussed the receptor type in the first paragraph of the results, the exact synthetic peptide used is described in the methods. The 8 amino acids of the mature CCK peptide are the same between CCKa and CCKb. A sentence regarding the specificity of the antibody to the mature CCK peptide was added in line 101.

      "to GnRH puff application (300 μl of 30 μg/μl)"; (250 μl of 30 μg/ml CCK)

      Please give the final concentration to make it easy on the readers of the data.

      The molarity of the final concentration was added.

      (2.4) Differential calcium response underlies differential hormone. This section is a bit confusing to read, for example:

      "For that, we collected the medium perfused through our ex vivo system (Fig. 2a) and measured LH and FSH levels using a specific ELISA validated for zebrafish [31] while monitoring the calcium activity of the cells."

      So the authors did the ELISA while monitoring the activity (?). This sentence does not make sense: please rewrite it.

      We modified this sentence in lines 308-311.

      To functionally validate the importance of CCK signalling we used CRISPR-cas9 to generate loss-of-function (LOF) mutations in the pituitary- CCK receptor gene.

      The authors need to clearly state WHICH gene they inactivated: Zebrafish have three CCK-receptors, so "the pituitary receptor gene" needs to be defined.

      This was added again in line 107 and is mentioned in the Methods.

      Figure 3 is a crucial figure!

      Figure 3B: The data are not very convincing. Please state how thick the sections are in the figure legend (assuming these are adult pituitaries),

      Added in the legend (figure 1C in the new version), slice thickness and adult fish.

      Please show, at least for the merged image, a high-magnification view of the co-localization of the receptor with the cells.

      This is Figure 1 in the new revision; a magnified view was added.

      Please give the scale bar size for 3B.

      Scales for all images were added

      Figure 3C: the co-localization of the terminals of the CCK and FSH cells shows very few cells expressing close to terminals.

      Important: Because the labelling of the terminals with anti-CCK looks a lot like the background, it is very important to show the control (anti-CCK antibody pre-absorbed with the peptide). The authors should have these data. The photo needs to have been taken at the same gain (contrast) as the photo showing the terminals.

      This is a commercial antibody that has previously been validated for CCK in fish. The co-localization pattern resembles GnRH innervation in the pituitary. In fish, when hypothalamic neurons innervate the pituitary, they do not innervate all the cells; as this is an endocrine system, the peptide can travel to neighbouring cells via diffusion or aided by blood flow (Golan, Zelinger et al. 2015). The images reveal the direct innervation of CCK in the pituitary and its proximity to FSH cells.

      Figure 4c, on right. The text seems to be stretched as if the photo was adjusted without locking the aspect ratio. Please check the original images.

      This has been fixed

      Can the authors use different pseudo colours? Differentiating a double label of white versus yellow is very difficult, and thus the photo is not very convincing.

      This has been changed to green and magenta.

      What is meant by "CCK-AB" antibody? Perhaps anti-CCK would be a better label

      This has been fixed

      Figure 5A: increase the magnification of the insets; the structure of the gonads is very difficult to see with clarity in these low mag images. The most obvious way to improve this figure is to reduce or eliminate the pie graph (not really necessary) and show a high magnification (and larger) image of the gonadal structure.

      This is figure 1 in the new version, with magnification of the gonad next to each body section.

      Discussion:

      "Moreover, in the zebrafish, as well as in other species, the functional overlap in gonadotropin signalling pathways is not limited to the pituitary but is also present in the gonad, through the promiscuity of the two gonadotropin receptors"

      The reasoning of this sentence is not clear: zebrafish do not use GnRH to control reproduction: they lack GnRH1 through genomic rearrangement (see Whitlock, Postlethwait and Ewer 2019) and KO of GnRH2/GnRH3 does not affect reproduction.

      While GnRH KO models indicate redundancy of GnRH in this axis in zebrafish, there is also ample evidence for its importance in regulating reproduction, such as its effect on gonadotropins (Golan, Martin et al. 2016) and its use in spawning induction in fish (Mizrahi and Levavi-Sivan 2023). We believe it is currently too soon to conclude that GnRH signalling is completely irrelevant to reproduction in cyprinids.

      Reviewing Editor (Recommendations For The Authors):

      It would be interesting to see calcium imaging experiments in the CCKR receptor mutants to establish a more direct connection between peptide action and activity.

      We added a receptor assay showing that the mutated receptors are not activated by CCK (supplementary figure 1), and compared it to the wild type, which is activated. This shows that 1) CCK directly activates the receptor we identified in FSH cells, and 2) the mutated receptors are inactive.

      "all homozygous fish (CCKR+12/+7/-1/ CCKR+12/+7/-1, n=12)"

      It may be better to write the genotypes of the fish separately (as CCKR+12/+12, CCKR+7/+7 and CCKR-1/-1, n=12); otherwise it seems as if all alleles occurred together in the same fish.

      Modified according to the reviewer's request.

      In Figure 1 scale bar legends are very small. 

      Descriptions of the scale bars were added to all the legends.

      Figure 1 legend "On the top right of each panel is the gender distribution" - fish have no gender but sex.

      Modified according to the reviewer's request.

      The authors should endeavour to improve the presentation of the figures. They should use a sans-serif font and check that text is not cut at the edge of figure panels, that scale bars are uniform and clearly labelled and fonts are of similar size and clearly legible. E.g. labels of the fish brain of Fig3A are very small.

      We modified all the figures to adapt the fonts and the scales; we also increased the size of the image in Figure 3a to make the labels clearer.

      Please use the elife format to name supplementary figures, as Figure X - Figure Supplement Y (each supplement associated with one of the main figures).

      Fixed

      Peptide concentrations in the ex vivo experiments should also be given as molar concentrations not only as '250 μl of 30 μg/ml CCK'.

      Fixed

      "In contrast, FSH cells responded with a very low calcium rise in hormonal secretion in response to GnRH" - a very low rise in hormonal secretion

      Fixed

      Please clarify why you used a GnRH synthetic agonist and not the native peptide.

      It is commonly used for spawning induction in fish (line 245); it has also been shown to directly affect the secretion of LH and FSH (Biran, Golan et al. 2014, Biran, Golan et al. 2014, Mizrahi, Gilon et al. 2019). This was added to line 245.

      References

      Ball, J. (1981). "Hypothalamic control of the pars distalis in fishes, amphibians, and reptiles." General and comparative endocrinology 44(2): 135-170.

      Biran, J., M. Golan, N. Mizrahi, S. Ogawa, I. S. Parhar and B. Levavi-Sivan (2014). "Direct regulation of gonadotropin release by neurokinin B in tilapia (Oreochromis niloticus)." Endocrinology 155(12): 4831-4842.

      Biran, J., M. Golan, N. Mizrahi, S. Ogawa, I. S. Parhar and B. Levavi-Sivan (2014). "LPXRFa, the Piscine Ortholog of GnIH, and LPXRF Receptor Positively Regulate Gonadotropin Secretion in Tilapia (Oreochromis niloticus)." Endocrinology 155(11): 4391-4401.

      Golan, M., A. O. Martin, P. Mollard and B. Levavi-Sivan (2016). "Anatomical and functional gonadotrope networks in the teleost pituitary." Scientific Reports 6: 23777.

      Golan, M., E. Zelinger, Y. Zohar and B. Levavi-Sivan (2015). "Architecture of GnRH-Gonadotrope-Vasculature Reveals a Dual Mode of Gonadotropin Regulation in Fish." Endocrinology 156(11): 4163-4173.

      Mizrahi, N., C. Gilon, I. Atre, S. Ogawa, I. S. Parhar and B. Levavi-Sivan (2019). "Deciphering Direct and Indirect Effects of Neurokinin B and GnRH in the Brain-Pituitary Axis of Tilapia." Front Endocrinol (Lausanne) 10: 469.

      Mizrahi, N. and B. Levavi-Sivan (2023). "A novel agent for induced spawning using a combination of GnRH analog and an FDA-approved dopamine receptor antagonist." Aquaculture 565: 739095.

      Uehara, S. K., Y. Nishiike, K. Maeda, T. Karigo, S. Kuraku, K. Okubo and S. Kanda (2023). "Cholecystokinin is the follicle-stimulating hormone (FSH)-releasing hormone." bioRxiv: 2023.2005.2026.542428.

      Webb, K. A., Jr., I. A. Khan, B. S. Nunez, I. Rønnestad and G. J. Holt (2010). "Cholecystokinin: molecular cloning and immunohistochemical localization in the gastrointestinal tract of larval red drum, Sciaenops ocellatus (L.)." Gen Comp Endocrinol 166(1): 152-159.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      The authors introduce a computational model that simulates the dendrites of developing neurons in a 2D plane, subject to constraints inspired by known biological mechanisms such as diffusing trophic factors, trafficked resources, and an activity-dependent pruning rule. The resulting arbors are analyzed in terms of their structure, dynamics, and responses to certain manipulations. The authors conclude that 1) their model recapitulates a stereotyped timecourse of neuronal development: outgrowth, overshoot, and pruning, and 2) neurons achieve near-optimal wiring lengths. Such models can be useful to test proposed biological mechanisms - for example, to ask whether a given set of growth rules can explain a given observed phenomenon - as developmental neuroscientists are working to understand the factors that give rise to the intricate structures and functions of the many cell types of our nervous system.

      Overall, my reaction to this work is that this is just one instantiation of many models that the author could have built, given their stated goals. Would other models behave similarly? This question is not well explored, and as a result, claims about interpreting these models and using them to make experimental predictions should be taken warily. I give more detailed and specific comments below.  

      We thank the reviewer for the summary of the work. But the criticism “that this is one instantiation of many models [we] could have built” is unfair as it can apply to any model. We chose one of the most minimalistic models which implements known biological mechanisms including activity-independent and -dependent phases of dendritic growth, and constrained parameters based on experimental data. We compare the proposed model to other alternatives in the Discussion section. In the revised manuscript, we additionally investigate the sensitivity of model output to variations of specific parameters, as explained below.

      Point 1.1. Line 109. After reading the rest of the manuscript, I worry about the conclusion voiced here, which implies that the model will extrapolate well to manipulations of all the model components. How were the values of model parameters selected? The text implies that these were selected to be biologically plausible, but many seem far off. The density of potential synapses, for example, seems very low in the simulations compared to the density of axons/boutons in the cortex; what constitutes a potential synapse? The perfect correlations between synapses in the activity groups are flawed, even for synapses belonging to the same presynaptic cell. The density of postsynaptic cells is also orders of magnitude off, etc. Ideally, every claim made about the model's output should be supported by a parameter sensitivity study. The authors performed few explorations of parameter sensitivity and many of the choices made seem ad hoc.

      We have performed detailed sensitivity analysis on the model parameters mentioned by the reviewer, including (I) the density of postsynaptic cells (somata), (II) the density of potential synapses, and (III) the level of correlations between synapses.

      (I) While the density of postsynaptic cells in our baseline model seems a bit low, at least when compared to densities observed in adulthood (Keller et al., 2018), we explored how altering this value affects the model dynamics. We found that the postsynaptic cell density does not affect the timing of dendritic outgrowth, overshoot and synaptic pruning. It only changes the final size of the dendritic arbor and the resulting number of connected synapses. This analysis is now included in Supplementary Figure 3-2.

      (II) The density of potential synapses and the density of connected synapses that we used in the manuscript are already in the range of densities that can be found in the literature (Leighton et al., 2024; Ultanir et al., 2007; Glynn et al., 2011; Yang et al., 2014), some of which we already cited in the original submission.

      A potential concern might be that the rapid slowing down of growth in the model could be due to a depletion of potential synapses. To illustrate that this is not the case, we showed that the number of available potential synapses over the time course of the simulations remains high (Figure 3, new panel e). Therefore, the initial density of potential synapses is sufficient and does not affect the final density of connected synapses.

      To further illustrate the robustness of our model dynamics to longer simulation times, we added a new supplementary figure (Supplementary Figure 3-1).

      These new figure additions (Figure 3e, Supplementary Figure 3-1, and Supplementary Figure 3-2) and their implications for the model dynamics are discussed in the Results section of the revised paper:

      p.9 line 198, “After the initial overshoot and pruning, dendritic branches in the model stay stable, with mainly small subbranches continuing to be refined (Figure 3-Figure Supplement 1). This stability in the model is achieved despite the number of potential synaptic partners remaining high (Figure 3e), indicating a balance between activity-independent and activity-dependent mechanisms. The dendritic growth and synaptic refinement dynamics are independent of the postsynaptic somata densities used in our simulations (Figure 3-Figure Supplement 2). Only the final arbor size and the number of connected synapses decrease with an increase in the density of the somata, while the timing of synaptic growth, overshoot and pruning remains the same (Figure 3-Figure Supplement 2).”

      We also added more details to the description of our model in the Methods section:

      p.24 line 615, “For all simulations in this study, we distributed nine postsynaptic somata at regular distances in a grid formation on a 2-dimensional 185 × 185 pixel area, representing a cortical sheet (where 1 pixel = 1 micron, Figure 4). This yields a density of around 300 neurons per mm² (translating to around 5,000 per mm³, where for 25 neurons in Figure 3-Figure Supplement 2 this would be around 750 neurons per mm² or 20,000 per mm³). The explored densities are a bit lower than neuron densities observed in adulthood (Keller et al., 2018). In the same grid, we randomly distributed 1,500 potential synapses, yielding an initial density of 0.044 potential synapses per μm² (Figure 3e). At the end of the simulation time, around 1,000 potential synapses remain, showing that the density of potential synapses is sufficient and does not significantly affect the final density of connected synapses. Thus, the rapid slowing down of growth in our model is not due to a depletion of potential synaptic partners. The resulting density of stably connected synapses is approximately 0.015 synapses per μm² (around 60 synapses stabilized per dendritic tree, Figure 3b). This density compares well to experimental findings, where, especially during early development, synaptic densities are described to be within a range similar to the one observed in our model (Leighton et al., 2024; Ultanir et al., 2007; Glynn et al., 2011; Yang et al., 2014; Koshimizu et al., 2009; Tyler and Pozzo-Miller, 2001).”
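
      The areal densities quoted above follow directly from the stated sheet size and counts. The short sketch below reproduces that arithmetic only; the variable and function names are illustrative and not taken from the authors' code, and the volumetric (per mm³) figures are not recomputed because the conversion to three dimensions is not specified in the text.

```python
# Back-of-the-envelope check of the areal densities quoted in the Methods excerpt.
# All names here are illustrative; only the numbers come from the text.

SIDE_UM = 185                # side of the simulated sheet (1 pixel = 1 micron)
AREA_UM2 = SIDE_UM ** 2      # 34,225 square microns
AREA_MM2 = AREA_UM2 / 1e6    # the same area in square millimetres


def per_mm2(count):
    """Areal density in items per square millimetre."""
    return count / AREA_MM2


def per_um2(count):
    """Areal density in items per square micron."""
    return count / AREA_UM2


somata_9 = per_mm2(9)                 # baseline grid of nine somata
somata_25 = per_mm2(25)               # denser grid from the supplement
potential_synapses = per_um2(1500)    # initial potential synapses
connected_synapses = per_um2(9 * 60)  # ~60 stabilized synapses on each of 9 trees

print(f"{somata_9:.0f} and {somata_25:.0f} somata per mm^2")
print(f"{potential_synapses:.3f} potential synapses per um^2")
print(f"{connected_synapses:.3f} connected synapses per um^2")
```

      The computed values (roughly 263 and 730 somata per mm², 0.044 potential and 0.016 connected synapses per μm²) are consistent with the rounded figures given in the text.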

      (III) Lastly, we investigated how the correlation between synapses of the same activity group might affect our conclusions. As correlations in our model mainly arise from patterns of spontaneous activity which are abundant in early postnatal development (retinal waves (Ackman et al., 2012) or endogenous activity in the form of highly synchronized events involving a large fraction of the cells (Siegel et al., 2012)), we explored varying the correlations within each activity group, across activity groups and combinations of both. While this analysis supported our previously described intuition on how competition between synaptic activities should drive activity-dependent refinement, recently a study found direct evidence for such subcellular refinement of synaptic inputs specifically dependent on spontaneous activity between retinal ganglion cell axons and retinal waves in the superior colliculus (Matsumoto et al., 2024). The new analysis confirmed our earlier results that the competition between activity groups leads to activity-dependent refinement and yielded further insight into how the studied activity correlations can affect the competition. Those results are presented in a completely new figure (new Figure 5, supported by the Supplementary Figure 5-1 and 5-2) and discussed in the Results section:

      p.11 line 249, “Group activity correlations shape synaptic overshoot and selectivity competition across synaptic groups.

      Since correlations between synapses emerge from correlated patterns of spontaneous activity abundant during postnatal development (Ackman et al., 2012; Siegel et al., 2012), we explored a wide range of within-group correlations in our model (Figure 5a). Although a change in correlations within the group has only a minor effect on the resulting dendritic lengths (Figure 5b) and overall dynamics, it can change the density of connected synapses and thus also affect the number of connected synapses to which each dendrite converges throughout the simulations (Figure 5c,e). This is due to the change in specific selectivity of each dendrite which is a result of the change in within-group correlations (Figure 5d). While it is easier for perfectly correlated activity groups to coexist within one dendrite (Figure 5-Figure Supplement 1a, 100%), decreasing within-group correlations increases the competition between groups, producing dendrites that are selective for one specific activity group (60%, Figure 5d, Figure 5-Figure Supplement 1a). This selectivity for a particular activity group is maximized at intermediate (approximately 60%) within-group correlations, while the contribution of the second most abundant group generally remains just above random chance levels (Figure 5-Figure Supplement 1a). Further reducing within-group correlations (20%, Figure 5a) causes dendrites to lose their selectivity for specific activity groups due to the increased noise in the activity patterns (20%, Figure 5a). Overall, reducing within-group correlations increases synapse pruning (Figure 5f, bottom), also found experimentally (Matsumoto et al., 2024) as dendrites require an extended period to fine-tune connections aligned with their selectivity biases. This phenomenon accounts for the observed reduction in both the density and number of synapses connected to each dendrite.

      In addition to the within-group correlations, developmental spontaneous activity patterns can also change correlations between groups as for example retinal waves propagated in different domains (Feller et al., 1997) (Figure 5-Figure Supplement 2). An increase in between-group correlations in our model intuitively decreases competition between the groups since fully correlated global events synchronize the activity of all groups (Figure 5-Figure Supplement 2). The reduction in competition reduces pruning in the model, which can be recovered by combining cross-group correlations with decreased within-group correlations (Figure 5-Figure Supplement 2). Our simulations show that altering the correlations within activity groups increases competition (by lowering the within-group correlations) or decreases competition (by raising the across-group correlations). Hence, in our model, competition between activity groups due to non-trivially structured correlations is necessary to generate realistic dynamics between activity-independent growth and activity-dependent refinement or pruning.

      In sum, our simulations demonstrate that our model can operate under various correlations in the spike trains. We find that the level of competition between synaptic groups is crucial for the activity-dependent mechanisms to either potentiate or depress synapses and is fully consistent with recent experimental evidence showing that the correlation between spontaneous activity in retinal ganglion cells axons and retinal waves in the superior colliculus governs branch addition vs. elimination (Matsumoto et al., 2024)."

      Precise details on the implementation of the changed activity correlations were added to the Methods section:

      p. 25 line 638, “Within-group and across-group activity correlations. For the decreased within-group correlations, we generated parent spike trains for each individual group with the firing rate $r_{\mathrm{in}} = r_{\mathrm{total}} \cdot P_{\mathrm{in}}$ (e.g., $P_{\mathrm{in}}$ = 100%; 60%; 20%, Figure 5). All the synapses of the same group share the same parent spike train and the remaining spikes for each synapse are uniquely generated with the firing rate $r_{\mathrm{rest}} = r_{\mathrm{total}} \cdot (1 - P_{\mathrm{in}})$ (e.g., $(1 - P_{\mathrm{in}})$ = 0%; 40%; 80%), resulting in the desired firing rate $r_{\mathrm{total}}$ (see Table 1). For the increase in across-group correlations, we generated one master spike train with the firing rate $r_{\mathrm{cross}} = r_{\mathrm{total}} \cdot P_{\mathrm{cross}}$ for all the synapses of all groups (e.g., $P_{\mathrm{cross}}$ = 5%; 10%; 20%, Figure 5-Figure Supplement 2). This master spike train is shared across all groups and then filled up according to the within-group correlation (if not specified differently $P_{\mathrm{in}} = 1 - P_{\mathrm{cross}}$ to maintain the rate $r_{\mathrm{total}}$). In all the cases, also in those where the change in across-group correlations is combined with the change in within-group correlations, the remaining spikes for each synapse are generated with a firing rate $r_{\mathrm{rest}} = r_{\mathrm{total}} \cdot (1 - P_{\mathrm{in}} - P_{\mathrm{cross}})$ to obtain an overall desired firing rate of $r_{\mathrm{total}}$.”
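
      The rate decomposition described in this Methods excerpt can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes spikes are drawn independently per time bin (a Bernoulli approximation of a Poisson process), and the function names, bin width, and the example value of the total rate are placeholders (the actual rate is given in the paper's Table 1, which is not reproduced here). Merging trains with a logical OR means the resulting rate only approximates the total rate when the per-bin spike probability is small.

```python
import random


def bernoulli_train(rate_hz, n_bins, dt, rng):
    """Binary spike train: each bin spikes with probability rate_hz * dt."""
    p = max(0.0, rate_hz * dt)
    return [1 if rng.random() < p else 0 for _ in range(n_bins)]


def merge(*trains):
    """Superimpose spike trains bin by bin (logical OR)."""
    return [1 if any(bins) else 0 for bins in zip(*trains)]


def make_group_trains(n_groups, synapses_per_group, r_total,
                      p_in, p_cross, n_bins, dt, seed=0):
    """Build per-synapse trains from master, parent, and private components.

    master  (rate r_total * p_cross): shared by every synapse in every group
    parent  (rate r_total * p_in):    shared by the synapses of one group
    private (rate r_total * (1 - p_in - p_cross)): unique to each synapse
    """
    if p_in + p_cross > 1.0 + 1e-12:
        raise ValueError("p_in + p_cross must not exceed 1")
    rng = random.Random(seed)
    master = bernoulli_train(r_total * p_cross, n_bins, dt, rng)
    groups = []
    for _ in range(n_groups):
        parent = bernoulli_train(r_total * p_in, n_bins, dt, rng)
        synapses = []
        for _ in range(synapses_per_group):
            private_rate = r_total * (1.0 - p_in - p_cross)
            private = bernoulli_train(private_rate, n_bins, dt, rng)
            synapses.append(merge(master, parent, private))
        groups.append(synapses)
    return groups


# Fully correlated groups: every synapse in a group copies its parent train.
# r_total = 5.0 Hz is a placeholder value, not the value from Table 1.
groups = make_group_trains(n_groups=2, synapses_per_group=3, r_total=5.0,
                           p_in=1.0, p_cross=0.0, n_bins=2000, dt=0.001)
```

      With `p_in=1.0` and `p_cross=0.0`, the master and private components are silent, so all synapses within a group are identical, which corresponds to the 100% within-group correlation case. Lowering `p_in` shifts spikes from the shared parent train into the private trains.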

      Point 1.2. Many potentially important phenomena seem to be excluded. I realize that no model can be complete, but the choice of which phenomena to include or exclude from this model could bias studies that make use of it and is worth serious discussion. The development of axons is concurrent with dendrite outgrowth, is highly dynamic, and perhaps better understood mechanistically. In this model, the inputs are essentially static. Growing dendrites acquire and lose growth cones that are associated with rapid extension, but these do not seem to be modeled. Postsynaptic firing does not appear to be modeled, which may be critical to activity-dependent plasticity. For example, changes in firing are a potential explanation for the global changes in dendritic pruning that occur following the outgrowth phase.  

      Thanks to the reviewer for bringing up these important considerations. We do indeed write in the Introduction (e.g. lines 36-76) which phenomena we include in the model and why. The Discussion also compares our model to others (lines 433-490), pointing out that most models either focus on activity-independent or activity-dependent phases. We include both, combining the influence of both molecular gradients and growth factors as well as activity-dependent connectivity refinements instructed by spontaneous activity. We consider our model a tractable, minimalist mechanistic model which includes both activity-independent and activity-dependent aspects. 

      Regarding postsynaptic firing, this is indeed highly relevant and an important point to consider. In one of our recent publications (Kirchner and Gjorgjieva, 2021), we studied only an activity-dependent model for the organization of synaptic inputs on non-growing dendrites which have a fixed length. There, we considered the effect of postsynaptic firing (via a back-propagating action potential) and demonstrated that it plays an important role in establishing a global organization of synapses on the entire dendritic tree of the neuron. For example, we showed that it could lead to the emergence of retinotopic maps on the dendritic tree which have been found experimentally (Iacaruso et al., 2017). Since we use the same activity-dependent plasticity model in this paper, we expect that the somatic firing will have the same effect on establishing synaptic distributions on the entire dendritic tree. This is now also discussed in the Discussion section of the revised manuscript:

      p. 21 line 491, “Although we did not explicitly model postsynaptic firing, our previous work with static dendrites has shown that it can play an important role in establishing a global organization of synapses on the entire dendritic tree of the neuron (Kirchner and Gjorgjieva, 2021). For example, we showed that it could lead to the emergence of retinotopic maps on the dendritic tree which have been found experimentally (Iacaruso et al., 2017). Since we use the same activity-dependent plasticity model in this paper, we expect that the somatic firing will have the same effect on establishing synaptic distributions on the entire dendritic tree.”

      Including the concurrent development of axons in the model is indeed very interesting. In fact, a recent tour-de-force techniques paper reported findings similar to what we assume: Hebbian activity-dependent dynamics of axonal branches of retinal ganglion cells experiencing spontaneous activity in relation to retinal waves in the superior colliculus (Matsumoto et al., 2024). New branches tend to be added at the locations where spontaneous activity of individual branches is more correlated with retinal waves, whereas asynchronous activity is associated with branch elimination. We suspect that the same Hebbian activity-dependent dynamics also apply to dendritic growth.

      To account for axons that change while our dendrites grow, in the revised version of the manuscript we included a simplified form of axonal dynamics by allowing changes in the lifetime and location of potential synapses, which come from axons of presynaptic partners. We explored different median lifetimes of synapses in combination with several distances over which a synapse can move in the simulated space (new Supplementary Figure 3-3). Our results show that dynamically moving synapses only affect the dynamics and stability of our model when the rate of moving synapses combined with the distance of moving synapses is faster than the dendritic growth. In scenarios in which synapses can move across large distances, dendrites get further destabilized due to synapses transferring from one dendrite to another, perturbing the attractor fields of the potential synapses even in late phases of the simulations. Outside such non-biological scenarios, dynamically moving synapses do not substantially affect the model dynamics: they mostly add noise and variability to the growth and pruning without changing the timing and amplitude of the dynamics. These results are discussed in the Results section of the revised manuscript:

      p.9 line 207, “The development of axons is concurrent with dendritic growth and highly dynamic (Matsumoto et al., 2024). To address the impact of simultaneously growing axons, we implemented a simple form of axonal dynamics by allowing changes in the lifetime and location of potential synapses, originating from the axons of presynaptic partners (Figure 3-Figure Supplement 3). When potential synapses can move rapidly (median lifetime of 1.8 hours), the model dynamics are perturbed quite substantially, making it difficult for the dendrites to stabilize completely (Figure 3–Figure Supplement 3c). However, slowly moving potential synapses (median lifetime of 18 hours) still yield comparable results (Figure 3-Figure Supplement 3). The distance of movement significantly influenced results only when potential synaptic lifetimes were short. For extended lifetimes, the moving distance had a minor impact on the dynamics, predominantly affecting the time required for dendrites to stabilize. This was the result of synapses being able to transfer from one dendrite to another, potentially forming new long-lasting connections even at advanced stages of synaptic refinement. In sum, our results show that potential axonal dynamics only affect the stability of our model when these dynamics are much faster than dendritic growth.”

      Precise details on the implementation of the dynamically moving synapses and their synaptic lifetimes are now in the Methods section:

      p. 25 line 650, “Dynamically moving synapses. For the moving synapses we introduced lifetimes for each synapse, randomly sampled from a log-normal distribution with median 1.8h (for when they move frequently), 4.5h or 18h (for when they move rarely) and variance equal to 1 (Figure 3-Figure Supplement 3b). The lifetime of a synapse decreases only when the synapse is not connected to any of the dendrites (i.e., is a potential synapse). When the lifetime of a synapse expires, the synapse moves to a new location with a new lifetime sampled from the same log-normal distribution. This enables synapses to move multiple times throughout a simulation. The exact locations and distances to which each synapse can move are determined by a binary matrix (dimensions: 𝑝𝑖𝑥𝑒𝑙𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒 × 𝑝𝑖𝑥𝑒𝑙𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒) representing a ring (annulus) with the inner radius 𝑑/4 and outer radius 𝑑/2 , where the synapse location is at the center of the matrix. All the locations of the matrix within the ring boundaries (between the inner radius and outer radius) are potential locations to which the synapse can move. The synapse then moves randomly to one of the possible locations where no other synapse or dendrite is located. For the movement distances, we chose the ring dimensions 3 × 3, 25 × 25 and 101 × 101, yielding the moving distances (radii) of 1 pixel per movement, 12 pixels per movement and 50 pixels per movement (𝑟 = (𝑑−1)/2). These pixel distances represent small movements, as much as a dendrite can grow in one step (1 micron), and larger movements which are far enough so that the synapse will not attract the same branches again (12 microns) or far enough so that it might attract a completely different dendrite (50 microns, Figure 3-Figure Supplement 3a).”
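
      The movement rule described in the Methods excerpt above can be sketched as follows. This is a minimal Python illustration, not the implementation used for the manuscript: the function names (`sample_lifetimes`, `ring_offsets`, `move_synapse`), the boolean `occupied` grid, and the reading of "variance equal to 1" as a unit shape parameter of the underlying normal distribution are assumptions made for the example.

```python
import numpy as np


def sample_lifetimes(median_hours, n, rng, sigma=1.0):
    """Draw n synapse lifetimes (hours) from a log-normal distribution.

    The median of a log-normal is exp(mu), so mu = log(median).
    sigma=1.0 is an assumed reading of "variance equal to 1".
    """
    return rng.lognormal(mean=np.log(median_hours), sigma=sigma, size=n)


def ring_offsets(d):
    """Offsets (dy, dx) inside a d x d matrix that lie in the annulus
    with inner radius d/4 and outer radius d/2 around the centre."""
    r = (d - 1) // 2  # moving distance in pixels, r = (d - 1) / 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    dist = np.hypot(ys, xs)
    mask = (dist >= d / 4) & (dist <= d / 2)
    return np.column_stack((ys[mask], xs[mask]))


def move_synapse(pos, occupied, d, rng):
    """Move a potential synapse to a random free pixel in its ring.

    `occupied` is a hypothetical boolean grid marking pixels already
    taken by another synapse or by a dendrite. If no free pixel
    exists, the synapse stays where it is.
    """
    candidates = np.asarray(pos) + ring_offsets(d)
    h, w = occupied.shape
    inside = ((candidates[:, 0] >= 0) & (candidates[:, 0] < h) &
              (candidates[:, 1] >= 0) & (candidates[:, 1] < w))
    candidates = candidates[inside]
    free = candidates[~occupied[candidates[:, 0], candidates[:, 1]]]
    if len(free) == 0:
        return tuple(int(v) for v in pos)
    return tuple(int(v) for v in free[rng.integers(len(free))])
```

      With this reading, the three ring dimensions 3 × 3, 25 × 25 and 101 × 101 give maximal offsets of 1, 12 and 50 pixels, matching the moving distances stated above.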

      Point 1.3. Line 167. There are many ways to include activity-independent and -dependent components into a model and not every such model shows stability. A key feature seems to be that larger arbors result in reduced growth and/or increased retraction, but this could be achieved in many ways (whether activity dependent or not). It's not clear that this result is due to the combination of activity-dependent and independent components in the model, or conceptually why that should be the case.

      We never argued for model uniqueness. There are always going to be many different models (at different spatial and temporal scales, at different levels of abstraction). We can never study all of them and, like any modeling study in systems neuroscience, we have chosen one modeling approach and investigated it. We do compare the current model to others in the Discussion. If the reviewers have a specific implementation that we should compare our model to as an alternative, we could try, but not if this means doing a completely separate project.

      Point 1.4. Line 183. The explanation of overshoot in terms of the different timescales of synaptic additions versus activity-dependent retractions was not something I had previously encountered and is an interesting proposal. Have these timescales been measured experimentally? To what extent is this a result of fine-tuning of simulation parameters?  

      We found that varying the amount of BDNF controls the timescale of the activity-dependent plasticity (see our Figure 6c). Hence, changing the balance between synaptic additions vs. retractions is already explored in Figure 6e and f. Here we show that the overshoot and retraction do not have to be fine-tuned but may be abolished if there is too much activity-dependent plasticity.

      Regarding the relative timescales of synaptic additions vs. retractions: since the first is mainly due to activity-independent factors, and the second due to activity-dependent plasticity, the question is really about the timescales of the latter two. As we write in the Introduction (lines 61-63), manipulating activity-dependent synaptic transmission has been found to not affect morphology but rather the density and specificity of synaptic connections (Ultanir et al. 2007), supporting the sequential model we have (although we do not impose the sequence, as both activity-independent and activity-dependent mechanisms are always “on”; but note that activity-dependent plasticity can only operate on synapses that have already formed).

      The described results are robust to parameter variations (performed on the postsynaptic density, potential synapse density, and within- and across-group correlations) as described in the reply to reviewer #1 point 1.1.

      Point 1.5. Line 203. This result seems at odds with results that show only a very weak bias in the tuning distribution of inputs to strongly tuned cortical neurons (e.g. work by Arthur Konnerth's group). This discrepancy should be discussed.  

      First, we note that the correlated activity experienced by our modeled synapses (and resulting synaptic organization) does not necessarily correspond to visual orientation, or any stimulus feature, for that matter, but is rather a property of correlated spontaneous activity. 

      Nonetheless, there is some variability in what the experimental data show. Many studies have shown that synapses on dendrites are organized into functional synaptic clusters: across brain regions, developmental ages and diverse species from rodent to primate (Kleindienst et al., 2011; Takahashi et al., 2012; Winnubst et al., 2015; Gökçe et al., 2016; Wilson et al., 2016; Iacaruso et al., 2017; Scholl et al., 2017; Niculescu et al., 2018; Kerlin et al., 2019; Ju et al., 2020, Hedrick et al., 2022, Hedrick et al., 2024). Interestingly, some in vivo studies have reported lack of fine-scale synaptic organization (Varga et al., 2011; X. Chen et al., 2011; T.-W. Chen et al., 2013; Jia et al., 2010; Jia et al., 2014), while others reported clustering for different stimulus features in different species. For example, dendritic branches in the ferret visual cortex exhibit local clustering of orientation selectivity but do not exhibit global organization of inputs according to spatial location and receptive field properties (Wilson et al. 2016; Scholl et al., 2017). In contrast, synaptic inputs in mouse visual cortex do not cluster locally by orientation, but only by receptive field overlap, and exhibit a global retinotopic organization along the proximal-distal axis (Iacaruso et al., 2017). We proposed a theoretical framework to reconcile these data: combining activity-dependent plasticity similar to the BDNF-proBDNF model that we used in the current work, and a receptive field model for the different species (Kirchner and Gjorgjieva, 2021). This is now also discussed in the Discussion section of the revised manuscript:

      p. 20 line 471, “The correlated activity experienced by our modeled synapses (and resulting synaptic organization) does not necessarily correspond to visual orientation, or any stimulus feature, for that matter, but is rather a property of spontaneous activity. Nonetheless, there is some variability in what the experimental data show. Many have shown that synapses on dendrites are organized into functional synaptic clusters: across brain regions, developmental ages and diverse species from rodent to primate (Kleindienst et al., 2011; Winnubst et al., 2015; Iacaruso et al., 2017; Scholl et al., 2017; Niculescu et al., 2018; Takahashi et al., 2012; Gökçe et al., 2016; Wilson et al., 2016; Kerlin et al., 2019; Ju et al., 2020; Hedrick et al., 2022, 2024). Other studies have reported lack of fine-scale synaptic organization (Chen et al., 2013; Varga et al., 2011; Chen et al., 2011; Jia et al., 2010, 2014). Interestingly, some of these discrepancies might be explained by different species showing clustering with respect to different stimulus features (orientation or receptive field overlap) (Scholl et al., 2017; Wilson et al., 2016; Iacaruso et al., 2017). Our prior work proposed a theoretical framework to reconcile these data: combining activity-dependent plasticity as we used in the current work, and a receptive field model for the different species (Kirchner and Gjorgjieva, 2021).”

      Point 1.6. Line 268. How does the large variability in the size of the simulated arbors relate to the relatively consistent size of arbors of cortical cells of a given cell type? This variability suggests to me that these simulations could be sensitive to small changes in parameters (e.g. to the density or layout of presynapses).  

      We again thank the reviewer for the detailed explanation and feedback on parameters that should be tested in more detail. We have explored several of the suggested model parameters and believe that we have managed to explain and illustrate their effects on the model's dynamics clearly. The precise changes are explained in the reply to point 1.1 and are now available in the revised version of the manuscript.

      Point 1.7. The modeling of dendrites as two-dimensional will likely limit the usefulness of this model. Many phenomena, such as diffusion, random walks, topological properties, etc., fundamentally differ between two and three dimensions.

      Indeed, there are many differences between two and three dimensions. We have ongoing work that extends the current model to 3D, but this is beyond the scope of the current paper. In systems neuroscience, such simplified geometric assumptions about networks have yielded very interesting results; for instance, the one-dimensional ring model, although highly simplified and abstracted, has been used to uncover fundamental insights about computations. We are convinced that our model, especially with the new sensitivity analysis, makes interesting and novel contributions and predictions.

      Point 1.8. The description of wiring lengths as 'approximately optimal' in this text is problematic. The plotted data show that the wiring lengths are several deviations away from optimal, and the random model is not a valid instantiation of the 2D non-overlapping constraints the authors imposed. A more appropriate null should be considered.  

      We appreciate the reviewer’s feedback regarding the use of the term “approximately optimal” in describing wiring lengths. We acknowledge that our initial terminology was imprecise and could be misleading. We had previously referred to the minimal wiring length as the optimal wiring length, which does not fully capture the nuances of neuronal wiring optimization. As noted in prior literature, such as the work by Hermann Cuntz (Cuntz et al., 2010 & 2012), neurons can optimize their wiring beyond simply minimizing dendritic length.

      To address this issue and to better capture the balance between wiring minimization and functional constraints, such as conduction delays, we have developed a new modeling approach based on minimum spanning trees with a balancing factor (Cuntz et al., 2010 & 2012). This factor modulates the trade-off between minimizing wiring length and accounting for conduction delays from synapses to the soma. Specifically, the model assumes a balance between minimizing the total dendritic length and minimizing the tree distance between synapses and the site of input integration, typically the soma. This balance is illustrated in Figure 8 (Figure 7 in the original manuscript), where we demonstrate that the deviation from the theoretical minimum length arises because direct paths to synapses often require longer dendrites in our models.

      Together with the new result, which we added as the new panels f, g and h to Figure 8 (originally Figure 7), we also adjusted panel a of Figure 8, to now illustrate the difference between random wiring, minimal wiring and minimal conductance delay. The updated Figure 8 and its new findings are discussed in the results section of the revised manuscript:

      p.17 line 387, “This deviation is expected given that real dendrites need to balance their growth processes between minimizing wire and reducing conduction delays. The interplay between these two factors emerges from the need to reduce conduction delays, which requires a direct path length from a given synapse to the soma, consequently increasing the total length of the dendritic cable (Cuntz et al., 2010, 2012; Ferreira Castro et al., 2020).

      To investigate this further, we compared the scaling relations of the final morphologies of our models with other synthetic dendritic morphologies generated using a previously described minimum spanning tree (MST) based model. The MST model balances the minimization of total dendritic length and the minimization of conduction delays between synapses and the soma. This balance results in deviations from the theoretical minimum length because direct paths to synapses often require longer dendrites (Cuntz et al., 2008, 2010). The balance in the model is modulated by a balancing factor (𝑏𝑓 ). If 𝑏𝑓 is zero, dendritic trees minimize the cable only, and if 𝑏𝑓 is one, they will try to minimize the conduction delays as much as possible. It is important to note that the MST model does not simulate the developmental process of dendritic growth; it is a phenomenological model designed to generate static morphologies that resemble real cells.

      To facilitate the comparison of total lengths between our simulated and MST morphologies, we generated MST models under the same initial conditions (synaptic spatial distribution) as our models and simulated them to match several morphometrics (total length, number of terminals, and surface area) of our grown morphologies. This allowed us to create a corresponding MST tree for each of our synthetic trees. Consequently, we could evaluate whether the branching structures of our models were accurately predicted by minimum spanning trees based on optimal wiring constraints. We found that the best match occurred with a trade-off parameter 𝑏𝑓 = 0.9250 (Figure 8f). Using the morphologies generated by the MST model with the specified trade-off parameter (𝑏𝑓 ), we showed that the square root of the synapse count and the total length (𝐿) in both our model-generated trees and the MST trees exhibit a linear scaling relationship (Figure 8g; $R^2 = 0.65$). The same linear relationship can be observed for the square root of the surface area and the total length 𝐿 of our model trees and the MST trees (Figure 8h; $R^2 = 0.73$). Overall, these results indicate that our model-generated trees are well fitted by the MST model and follow wire optimization constraints.

      We acknowledge that the value of the balancing factor 𝑏𝑓 in our model is higher than the range of balancing factors that is typically observed in the biological dendritic counterparts, which generally ranges between 0.2 and 0.4 (Cuntz et al., 2012; Ferreira Castro et al., 2020; Baltruschat et al., 2020). However, it is still remarkable that our model, which does not explicitly address these two conservation laws, achieves approximately optimal wiring. Why do we observe such a high 𝑏𝑓 value? We reason that two factors may contribute to this. First, in our models, local branches grow directly to the nearest potential synapse, potentially taking longer routes instead of optimally branching to minimize wiring length (Wen and Chklovskii, 2008). Second, the growth process in our models does not explicitly address the tortuosity of the branches, which can increase the total length of the branches used to connect synapses. In the future, it will be interesting to add constraints that take these factors into account. Taken together, combining activity-independent and -dependent dendrite growth produces morphologies that approximate optimal wiring.”
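
      The two linear relationships reported for Figure 8g and 8h are what the scaling law of Cuntz et al. (2012), quoted in the Methods excerpt, would predict when one of its two factors is held roughly constant. As a brief sketch, in which the bracketed prefactors simply collect the factor treated as fixed and are not values from the manuscript:

```latex
\begin{align}
L &\approx \pi^{-1/2}\, S^{1/2}\, N^{1/2}, \\
L &\approx \left(\pi^{-1/2} S^{1/2}\right) \sqrt{N}
  \quad \text{(fixed } S \text{: } L \text{ is linear in } \sqrt{N}\text{)}, \\
L &\approx \left(\pi^{-1/2} N^{1/2}\right) \sqrt{S}
  \quad \text{(fixed } N \text{: } L \text{ is linear in } \sqrt{S}\text{)}.
\end{align}
```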

      Further details on the fitted MST model and the corresponding analysis were added to the methods section:

      p.26 line 669, “Comparison with wiring optimization MST models. To evaluate the wire minimization properties of our model morphologies (n=288), we examined whether the number of connected synapses (N), total length (L), and surface area of the spanning field (S) conformed to the scaling law $L \approx \pi^{-1/2} \cdot S^{1/2} \cdot N^{1/2}$ (Cuntz et al., 2012). Furthermore, to validate that our model dendritic morphologies scale according to optimal wiring principles, we created simplified models of dendritic trees using the MST algorithm with a balancing factor (bf). This balancing factor adjusts between minimizing the total dendritic length and minimizing the tree distance between synapses and the soma ($\mathrm{Cost} = L + \mathit{bf} \cdot \mathit{PL}$; MST_tree; best bf = 0.925) (Cuntz et al., 2010; TREES Toolbox, http://www.treestoolbox.org).

      Initially, we generated MSTs to connect the same distributed synapses as our models. We performed MST simulations that vary the balancing factor between 𝑏𝑓 = 0 and 𝑏𝑓 = 1 in steps of 0.025 while calculating the morphometric agreement by computing the error (Euclidean distance) between the morphologies of our models and those generated by the MST models. The morphometrics used were total length, number of terminals, and surface area occupied by the synthetic morphologies.”
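
      The balancing-factor sweep described above can be illustrated with a small, self-contained Python sketch. It is not the TREES Toolbox `MST_tree` implementation: `greedy_mst`, `morphometrics` and `sweep_bf` are hypothetical names, the greedy rule (attach the free point that minimizes edge length plus bf times the resulting path length to the root) is a simplified reading of the cost in Cuntz et al. (2010), only total length and number of terminals are compared, and the use of relative differences in the error is an assumption.

```python
import numpy as np


def greedy_mst(points, bf, root=0):
    """Greedily grow a tree from `root`; bf = 0 reduces to Prim's MST."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[root] = True
    path = np.zeros(n)        # path length from each node to the root
    parent = np.full(n, -1)   # -1 marks the root
    for _ in range(n - 1):
        tree_idx = np.flatnonzero(in_tree)
        free_idx = np.flatnonzero(~in_tree)
        d = dist[np.ix_(tree_idx, free_idx)]
        # wiring cost plus bf-weighted path length of the new node
        cost = d + bf * (path[tree_idx][:, None] + d)
        i, j = np.unravel_index(np.argmin(cost), cost.shape)
        p, c = tree_idx[i], free_idx[j]
        parent[c] = p
        path[c] = path[p] + dist[p, c]
        in_tree[c] = True
    return parent, path


def morphometrics(points, parent):
    """Return [total length, number of terminals] of a tree."""
    pts = np.asarray(points, dtype=float)
    children = np.flatnonzero(parent >= 0)
    total = np.linalg.norm(pts[children] - pts[parent[children]], axis=1).sum()
    n_terminals = len(pts) - len(np.unique(parent[children]))
    return np.array([total, n_terminals], dtype=float)


def sweep_bf(points, target, bfs=np.linspace(0.0, 1.0, 41)):
    """Scan bf from 0 to 1 in steps of 0.025 and return the best match.

    The error is the Euclidean norm of the relative differences
    between the MST morphometrics and the `target` morphometrics.
    """
    errors = np.array([
        np.linalg.norm(
            (morphometrics(points, greedy_mst(points, bf)[0]) - target) / target
        )
        for bf in bfs
    ])
    return bfs[np.argmin(errors)], errors
```

      In use, `target` would hold the morphometrics of one grown morphology and `points` the positions of its synapses with the soma at index 0; repeating the sweep for each morphology gives a per-tree estimate of the balancing factor.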

      Point 1.9. It's not clear to me what the authors are trying to convey by repeatedly labeling this model as 'mechanistic'. The mechanisms implemented in the model are inspired by biological phenomena, but the implementations have little resemblance to the underlying biophysical mechanisms. Overall my impression is that this is a phenomenological model intended to show under what conditions particular patterns are possible. Line 363, describing another model as computational but not mechanistic, was especially unclear to me in this context.  

      What we mean by mechanistic is that we implement equations that model specific mechanisms, i.e., we have a set of equations that implement the activity-independent attraction to potential synapses (with parameters such as the density of synapses, their spatial influence, etc.) and the activity-dependent refinement of synapses (with parameters such as the ratio of BDNF and proBDNF to induce potentiation vs depression, the activity-dependent conversion of one factor to the other, etc.). This is a bottom-up approach where we combine multiple elements together to get to neuronal growth and synaptic organization. This approach is in stark contrast to the so-called top-down or normative approaches where the method would involve defining an objective function (e.g. minimal dendritic length) which depends on a set of parameters and then applying a gradient descent or other mathematical optimization technique to get at the parameters that optimize the objective function. This latter approach we would not call mechanistic because it involves an abstract objective function (who could say what a neuron or a circuit should be trying to optimize?) and a mathematical technique for how to optimize the function (we don’t know if neurons can compute gradients of abstract objective functions).

      Hence our model is mechanistic, but it does operate at a particular level of abstraction/simplification. We don’t model individual ion channels, or biophysics of synaptic plasticity (opening and closing of NMDA channels, accumulation of proteins at synapses, protein synthesis). We do, however, provide a biophysical implementation of the plasticity mechanism through the BDNF/proBDNF model which is more than most models of plasticity achieve, because they typically model a phenomenological STDP or Hebbian rule that just uses activity patterns to potentiate or depress synaptic weights, disregarding how it could be implemented. To the best of our understanding, this is what is normally considered mechanistic in the field (in contrast to, for example, biophysical).

      Reviewer #2 (Public Review): 

      This work combines a model of two-dimensional dendritic growth with attraction and stabilisation by synaptic activity. The authors find that constraining growth models with competition for synaptic inputs produces artificial dendrites that match some key features of real neurons both over development and in terms of final structure. In particular, incorporating distance-dependent competition between synapses of the same dendrite naturally produces distinct phases of dendritic growth (overshoot, pruning, and stabilisation) that are observed biologically and leads to local synaptic organisation with functional relevance. The approach is elegant and well-explained, but makes some significant modelling assumptions that might impact the biological relevance of the results. 

      Strengths: 

      The main strength of the work is the general concept of combining morphological models of growth with synaptic plasticity and stabilisation. This is an interesting way to bridge two distinct areas of neuroscience in a manner that leads to findings that could be significant for both. The modelling of both dendritic growth and distance-dependent synaptic competition is carefully done, constrained by reasonable biological mechanisms, and well-described in the text. The paper also links its findings, for example in terms of phases of dendritic growth or final morphological structure, to known data well. 

      Weaknesses: 

      The major weaknesses of the paper are the simplifying modelling assumptions that are likely to have an impact on the results. These assumptions are not discussed in enough detail in the current version of the paper. 

      (1) Axonal dynamics. 

      A major, and lightly acknowledged, assumption of this paper is that potential synapses, which must come from axons, are fixed in space. This is not realistic for many neural systems, as multiple undifferentiated neurites typically grow from the soma before an axon is specified (Polleux & Snider, 2010). Further, axons are also dynamic structures in early development and, at least in some systems, undergo activity-dependent morphological changes too (O'Leary, 1987; Hall 2000). This paper does not consider the implications of joint pre- and post-synaptic growth and stabilisation.  

      We thank the reviewer for the summary of the strengths and weaknesses of the work. While we feel that including a full model of axonal dynamics is beyond the scope of the current manuscript, some aspects of axonal dynamics can be included and are now implemented and tested in the revised manuscript. Since this feedback covers similar aspects of the model that were also pointed out by reviewer #1, we refer here to our detailed reply to their comments 1.1 and 1.2, where we list and discuss all the analyses performed to address the raised issues.

      (2) Activity correlations 

      On a related note, the synapses in the manuscript display correlated activity, but there is no relationship between the distance between synapses and their correlation. In reality, nearby synapses are far more likely to share the same axon and so display correlated activity. If the input activity is spatially correlated and synaptic plasticity displays distance-dependent competition in the dendrites, there is likely to be a non-trivial interaction between these two features with a major impact on the organisation of synaptic contacts onto each neuron.  

      We have explored the amount of correlation (between and within correlated groups) in the revised manuscript (see also our reply to reviewer comment 1.1).

      However, previous experimental work (e.g. Kleindienst et al., 2011) has provided anatomical and functional analyses suggesting that the functional synaptic clustering on dendritic branches is unlikely to be the result of individual axons making more than one synapse (see pg. 1019).

      (3) BDNF dynamics 

      The models are quite sensitive to the ratio of BDNF to proBDNF (eg Figure 5c). This ratio is also activity-dependent as synaptic activation converts proBDNF into BDNF. The models assume a fixed ratio that is not affected by synaptic activity. There should at least be more justification for this assumption, as there is likely to be a positive feedback relationship between levels of BDNF and synaptic activation.  

      The reviewer is correct. We used the BDNF-proBDNF model for synaptic plasticity based on our previous work (Kirchner and Gjorgjieva, 2021).  

      There, we explored only the emergence of functionally clustered synapses on static dendrites which do not grow. In the Methods section (Parameters and data fitting) we justify the choice of the ratio of BDNF to proBDNF from published experimental work. We also performed sensitivity analysis (Supplementary Fig. 1) and perturbation simulations (Supplementary Fig. 3), which showed that the ratio is crucial in regulating the overall amount of potentiation and depression of synaptic efficacy, and therefore has a strong impact on the emergence and maintenance of synaptic organization. Since we already performed all this analysis, we expect that the same results will also apply to the current model which includes dendritic growth, as it involves the same activity-dependent mechanism.

      A further weakness is in the discussion of how the final morphologies conform to principles of optimal wiring, which is quite imprecise. 'Optimal wiring' in the sense of dendrites and axons (Cajal, 1895; Chklovskii, 2004; Cuntz et al, 2007, Budd et al, 2010) is not usually synonymous with 'shortest wiring' as implied here. Instead, there is assumed to be a balance between minimising total dendritic length and minimising the tree distance (ie Figure 4c here) between synapses and the site of input integration, typically the soma. The level of this balance gives the deviation from the theoretical minimum length as direct paths to synapses typically require longer dendrites. In the model this is generated by the guidance of dendritic growth directly towards the synaptic targets. The interpretation of the deviation in this results section discussing optimal wiring, with hampered diffusion of signalling molecules, does not seem to be correct. 

      We agree with this comment. We had wrongly used the term “optimal wiring”, as neurons can optimize their wiring not only by minimizing their dendritic length but also with respect to other factors, as noted by the reviewer. In the revised manuscript we replaced the term “optimal wiring” with “minimal wiring” wherever it was incorrectly used. In addition, we performed further analysis and discussed these differences, as pointed out in the reply to reviewer #1 point 1.8.

      To summarize, we want to again thank the reviewer for their in-depth review and all the suggestions that helped us improve the analysis and implementation of our model.

      Reviewer #3 (Public Review): 

      The authors propose a mechanistic model of how the interplay between activity-independent growth and an activity-dependent synaptic strengthening/weakening model influences the dendrite shape, complexity and distribution of synapses. The authors focus on a model for stellate cells, which have multiple dendrites emerging from a soma. The activity-independent component is provided by a random pool of presynaptic sites that represent potential synapses and that release a diffusible signal that promotes dendritic growth. Then a spontaneous activity pattern with some correlation structure is imposed at those presynaptic sites. The strength of these synapses follows a learning rule previously proposed by the lab: synapses strengthen when there is correlated firing across multiple sites, and synapses weaken if there is uncorrelated firing, with the relative strength of these processes controlled by available levels of BDNF/proBDNF. Once a synapse is weakened below a threshold, the dendrite branch at that site retracts and loses its sensitivity to the growth signal.

      The authors run the simulation and map out how dendrites and synapses evolve and stabilize. They show that dendritic trees grow rapidly and then stabilize by balancing growth and retraction (Figure 2). They also show that there is an initial bout of synaptogenesis followed by loss of synapses, reflecting the longer amount of time it takes to weaken a synapse (Figure 3). They analyze how this evolution of dendrites and synapses depends on the correlated firing of synapses (i.e. defined as being in the same "activity group"). They show that in the stabilized phase, synapses that remain connected to a given dendritic branch are likely to be from the same activity group (Figure 4). The authors systematically alter the learning rule by changing the available concentration of BDNF, which alters the relative amount of synaptic strengthening, which in turn affects stabilization, density of synapses and, interestingly, how selective for an activity group one dendrite is (Figure 5). In addition, the authors look at how altering the activity-independent factors influences outgrowth (Figure 6). Finally, one of the interesting outcomes is that the resulting dendritic trees represent "optimal wiring" solutions in the sense that dendrites use the shortest distance given the distribution of synapses. They compare this distribution to published data to see how the model compares with what has been observed experimentally.

      There are many strengths to this study. The consequence of adding the activity-dependent contribution to models of synapto- and dendritogenesis is novel. There is some exploration of parameter space with the motivation of keeping the parameters as well as the generated outcomes close to anatomical data of real dendrites. The paper is also scholarly in its comparison of this approach to previous generative models. This work represents an important advance to our understanding of how learning rules can contribute to dendrite morphogenesis.

      We thank the reviewer for the positive evaluation of the work and the suggestions below.

      To improve the clarity of the manuscript, we adjusted and fixed some figures and corresponding paragraphs as follows:

      (1) We increased the number of ticks and their corresponding numbers in all the figures to make them easier to read and interpret.

      (2) In Figure 3 panel d, showing the evolution of synaptic weight, we corrected the upper limit of the y-axis to 1 (previously 2).

      (3) Due to a typo in the implementation of the BDNF concentration, we had to correct the used BDNF concentrations from 49%, 45% and 40%, to 49%, 46.5% and 43% respectively.

      (4) The y-axis labels of Figure 6 (old Figure 5) panel e and f were changed to make the plots clearer (e: “morphology change explained (%)” to "effect on morphology (%)", and f: “synapse connection explained (%)” to "effect on connected synapses (%)").

      (5) The values for eta and tau-w in the supplementary Table were corrected. Previously, tau-w was incorrectly given as 6000 time steps and has been corrected to 3000 time steps; eta was given as 45% and is now 46.5%.

      We believe that all the changes to the manuscript will address the reviewer’s concerns and enhance the clarity and accuracy of the findings described in the manuscript.

    1. Author response:

      We thank the reviewers for their thoughtful comments. We are working to revise our manuscript and address each of the reviewers’ comments. A summary of our planned revisions and responses to some of the reviewers’ major concerns is included below.

      Cultivation Density: Reviewers #1 and #2 suggested that additional studies testing the effects of varying bacterial density during animal development (cultivation) would strengthen our findings. While we agree with the reviewers that this is a very interesting experiment, it is not feasible. Indeed, we attempted this experiment but found it nontrivial to maintain stable bacterial density conditions over long timescales as this requires matching the rate of bacterial growth with the rate of bacterial consumption. Despite our best efforts, we have not been able to identify conditions that satisfy these requirements. We will focus our revised manuscript to include only assertions about the effects of recent experiences.

      Transfer Method: Reviewers #1 and #2 expressed concern that the stress of transferring animals to a new plate may have resulted in an increased arousal state and thus a greater probability of rejecting patches. We thank the reviewers for this thoughtful remark and plan to conduct additional analyses to address this hypothesis. We did, however, anticipate this possibility and, to mitigate the stress of moving, we used an agar plug method where animals were transferred using the flat surface of small cylinders of agar. Importantly, the use of agar as a medium to transfer animals provides minimal disruption to their environment as all physical properties (e.g. temperature, humidity, surface tension) are maintained. Qualitatively, we observe no marked change in behavior from before to after transfer with the agar plug method, especially as compared to the often drastic changes observed when using a metal or eyelash pick.

      Time Parameter: Related to the transfer method, Reviewer #1 expressed concern that the simplest time parameter (time since start of the assay) might better predict animal behavior. We thank the reviewer for pointing out the need to specifically test whether the time-dependent change in explore-exploit decision-making corresponds better with satiety (time off patch) or arousal (time since transfer/start of assay) state. We will conduct additional analyses to address these alternative hypotheses.

      Parameter Initialization: Reviewer #1 pointed out an oversight in our methods section regarding the model parameter values used for the first encounter. We plan to clarify the initialization of parameters in the manuscript. In short, for the first patch encounter where k = 1:

      ρk is the relative density of the first patch.

      τs is the duration of time spent off food since the beginning of the recorded experiment. For the first patch, this is equivalent to the total time elapsed.

      ρh is the approximated relative density of the bacterial patch on the acclimation plates (see Assay preparation and recording in Methods). Acclimation plates contained one large 200 µL patch seeded with OD600 = 1 and grown for a total of ~48 hours. As with all patches, the relative density was estimated from experiments using fluorescent bacteria OP50-GFP as described in Bacterial patch density estimation in Methods.

      ρe is equivalent to ρh.
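
      The initialization listed above can be sketched in a few lines of Python. This is only an illustration of the stated rules for the first encounter (k = 1): the class name, function name, and argument names are hypothetical, and the numbers in the usage note are placeholders rather than values from the study.

```python
from dataclasses import dataclass


@dataclass
class EncounterParams:
    rho_k: float  # relative density of the current (k-th) patch
    tau_s: float  # time spent off food since the start of the recording
    rho_h: float  # for k = 1: approximated relative density of the acclimation-plate patch
    rho_e: float  # for k = 1: set equal to rho_h


def init_first_encounter(first_patch_density: float,
                         elapsed_time: float,
                         acclimation_density: float) -> EncounterParams:
    """Parameter values for the first patch encounter (k = 1).

    For the first patch, the time off food equals the total time elapsed,
    and rho_e is initialized to rho_h (the acclimation-plate estimate).
    """
    return EncounterParams(
        rho_k=first_patch_density,
        tau_s=elapsed_time,
        rho_h=acclimation_density,
        rho_e=acclimation_density,
    )
```

      For example, `init_first_encounter(0.5, 120.0, 1.0)` returns parameters in which `rho_e` and `rho_h` are both 1.0 and `tau_s` is the full 120.0 units of elapsed time.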

      Sensing vs. non-sensing: Reviewer #3 suggested that the term “non-sensing” may not be ethologically accurate. We thank the reviewer for their comment and agree that we do not know for certain whether the animals sensed these patches or were merely non-responsive to them. We are, however, confident that these encounters lack evidence of sensing. Specifically, we note that our analyses used to classify events as sensing or non-sensing examined whether an animal’s slow-down upon patch entry could be distinguished from either that of events where animals exploited or that of encounters with patches lacking bacteria. We found that  “non-sensing” encounters are indeed indistinguishable from encounters with bacteria-free patches where there are no bacteria to be sensed (see Figure 2 - Supplement 7C-D and Patch encounter classification as sensing or non-sensing in Methods). Regardless, we agree with the reviewer that all that can be asserted for certain about these events is that animals do not respond to the bacterial patch in any way that we measured. Therefore, we will replace the term “non-sensing” with “non-responding” to better indicate the ethological interpretation of these events.

      Time-dependent changes in sensing vs. non-sensing: Reviewer #1 remarked that the sensation of dilute patches increases with time. We agree with the reviewer that we observe increased responsiveness to dilute patches with time. Although this is interesting, our primary focus was on what decision an animal made given that they clearly sensed the presence of the bacterial patch. Nonetheless, we will add this observation to the discussion as an area of future work to investigate the sensory mechanisms behind this effect.

      Classification of sensing vs. non-sensing: Reviewers #2 and #3 expressed concerns about the validity of the two clusters identified using the semi-supervised QDA approach described. We are grateful to the reviewers for pointing out the difficulty in visualizing the clusters and the need for additional clarity in explaining the supervised labeling. We will use additional visualizations and methods to validate the clusters we have discovered. Specifically, we aim to provide additional evidence that the sensing vs. non-sensing data is bi-modal (i.e. a two-cluster classification method fits best). Further, it seems that there may be some confusion as to how we arrived at 3 encounter types (i.e. search, sample, exploit) that we plan to clarify in the manuscript. Specifically, it’s important to note that two methods were used on two different (albeit related) sets of parameters. We first used a two-cluster GMM to classify encounters as explore or exploit. We then used a two-cluster semi-supervised QDA to classify encounters as sensing or non-sensing (to be changed to “non-responding”, see above response) using a different set of parameters. We thus separated the explore cluster into two (sensing and non-sensing exploratory events), resulting in three total encounter types: exploit, sample (explore/sensing), and search (explore/non-sensing). We will clarify this in the text. Additionally, we will clarify the labelling used for “supervising” QDA. Specifically, we made two simple assumptions: 1) animals must have sensed the patch if they exploited it and 2) animals must not have sensed the patch if there were no bacteria to sense. Thus, we labeled encounters as sensing if they were found to be exploitatory, as we assume that sensation is a prerequisite to exploitation; and we labeled encounters as non-sensing for events where animals encountered patches lacking bacteria (OD600 = 0). All other points were non-labeled prior to learning the model.
In this way, our labels were based on the experimental design and results of the GMM, an unsupervised method; rather than any expectations we had about what sensing should look like. The semi-supervised QDA method then used these initial labels to iteratively fit a paraboloid that best separated these clusters, by minimizing the posterior variance of classification.
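
      The labelling rules and the resulting three encounter types can be summarized in a short Python sketch. It covers only the two labelling assumptions and the combination of the two classifications; it does not implement the GMM or the semi-supervised QDA fits themselves. The function names are hypothetical, and giving the exploit rule precedence when both conditions hold is an assumption made for the sketch.

```python
from typing import Optional


def initial_label(is_exploit: bool, od600: float) -> Optional[str]:
    """Initial label given to an encounter before fitting the QDA.

    Rule 1: exploited patches must have been sensed.
    Rule 2: patches lacking bacteria (OD600 = 0) cannot have been sensed.
    Everything else is left unlabeled (None).
    Checking rule 1 first is an assumption of this sketch.
    """
    if is_exploit:
        return "sensing"
    if od600 == 0:
        return "non-sensing"
    return None


def encounter_type(is_exploit: bool, is_sensing: bool) -> str:
    """Combine the explore/exploit and sensing/non-sensing classifications."""
    if is_exploit:
        return "exploit"
    # Exploratory encounters are split by whether the patch was sensed.
    return "sample" if is_sensing else "search"
```

      Under these rules an exploratory encounter with a bacteria-containing patch starts unlabeled, and its final type (sample or search) depends on the class later assigned by the fitted model.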

      Accept-reject vs. stay-switch: Reviewers #1 and #2 ask for additional discussion on how the accept-reject decision-making framework differs from the stay-switch framework. We thank the reviewers for alerting us to this gap in our discussion. We intend to clarify that these frameworks ask two different types of questions (i.e. “Do you want to eat it?” versus “If so, how long do you want to eat it for?”). These concepts are well described in canonical foraging theory literature (see Pyke, Pulliam & Charnov 1977 for a review on the subject) and are easily distinguishable for animals that forage using the following framework: 1) search for prey, 2) encounter prey from a distance, 3) identify prey type, 4) decide to pursue (accept-reject decision), 5) pursue and capture the prey, 6) exploit prey, and 7) decide to stop exploiting and start searching again (stay-switch decision). In this case, it is easy to see the distinction between accept-reject and stay-switch decisions. However, in some scenarios, animals must physically encounter prey prior to identification and then must make an accept-reject decision. In these cases where pursuit and capture are not visualized, it is harder to distinguish between accept-reject and stay-switch decisions. In our experiments, we find significant bimodality in encounter duration (see Figure 2H) where short duration (exploratory) encounters appear to represent a lower bound where animals spend the minimum amount of time possible on a patch (less than 2 minutes), which we interpret as a rejection of the patch. On the other hand, exploitatory encounters span a large range of durations from 2 to 60+ minutes which we interpret as an initial acceptance of the patch followed by a series of stay-switch decisions which determine the overall duration of the encounter. 
While one could certainly model our data using only stay-switch decision-making, we maintain that an encounter of minimal duration is better interpreted ethologically as a rejection than as an immediate switch decision. We will revise the text to further elaborate on our point of view on this somewhat philosophical distinction and what it predicts about C. elegans behavior.

      Sensory mutant behavior: Reviewers #1 and #3 ask for further speculation on the observed behavior of osm-6 and mec-4 animals. We will further elaborate on our findings, how they relate to previous studies, and what they suggest about the mechanisms behind these foraging decisions.

      Model design: Reviewer #3 suggested several alterations to the behavioral model. While the proposed model seems entirely reasonable and could aid in elucidating the time component of how prior experience affects decision-making, we chose the present model based on our experience with model selection using these data. Indeed, as the reviewer suggested, we did a great number of analyses involving model selection including model selection criteria (AIC, BIC) and optimization with regularization techniques (LASSO and elastic nets). We found that the problem of model selection was compounded by the enormous array of highly correlated variables we had to choose from. Additionally, we found that both interaction terms and non-linear terms of our task variables could be predictive of accept-reject decisions but that the precise set of terms selected depended sensitively on which model selection technique was used and generally made rather small contributions to prediction. The diverse array of results and combinatorial number of predictors to possibly include failed to add anything of interpretable value. We therefore chose to take a different approach to this problem. Rather than trying to determine what the “best” model was we instead asked whether a minimal model could be used to answer a set of core questions. Indeed, our goal was not maximal predictive performance but rather to distinguish between the effects of different influences enough to determine if encounter history had a significant, independent effect on decision making. We thus chose to only include task variables that spanned the most basic components of behavioral mechanisms to ask very specific questions. For example, we selected a time variable that we thought best encapsulated satiety. While we could have included many additional terms, or made different choices about which terms to include, based on our analyses these choices would not have qualitatively changed our results. 
Further, we sought to validate the parameters we chose with additional studies (i.e. food-deprived and sensory mutant animals). We regard our study as an initial foray into demonstrating accept-reject decision-making in nematodes. The exact mechanisms and, consequently, the best model design is therefore beyond the scope of this study. Lastly, Reviewer #3 criticized the use of only sensed patches in the model. While we acknowledge that we are not certain as to whether the “non-sensing” encounters are truly not sensed, we find qualitatively similar results when including all exploratory patches in our analyses. In fact, when all encounters are used, we find stronger correlations between our task variables and the accept-reject decision. However, we take the position that sensation is necessary for decision-making and thus believe that while our model’s predictive performance may be better using all encounters, the interpretation of our findings is stronger when we only include sensing events.

    1. Author response:

      First of all, we would like to express our heartfelt thanks to you for your meticulous and professional review comments. Your feedback is very important to our work. It not only helps us identify the shortcomings in the paper, but also provides valuable guidance for improving the quality of the paper.

      We carefully read every suggestion you made and were deeply inspired. Please rest assured that we will carefully consider and revise each opinion to ensure that our research work is more rigorous and clear. We promise to revise the manuscript accordingly to meet the standards of the journal and enhance the credibility and influence of the research.

      The main modifications include: a Mid1 supplementation experiment in Mid1 knockout mice; detection of kinases such as CaMKII, PKA and ERK1/2; additional references; an additional novel object recognition behavioral experiment; additional electrophysiological measurement of LTP; additional neuron-specific immunohistochemical staining; additional information on the knockout mice used in the study; and revisions to the language of the article and to the problem of too few figures.

      Thank you again for your valuable time and professional advice. We look forward to submitting the revised manuscript to you for further review.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Cesar, Santos & Cogni use a meta-analysis to report on the direction and magnitude of three fundamental fitness components in defensive symbioses. Specifically, the work focuses on interactions between three arthropod host families (Aphididae, Culicidae, Drosophilidae, and others) and common bacterial endosymbionts (Wolbachia, Serratia, Hamiltonella, Spiroplasma, Rickettsia, Regiella X-type and Arsenophonus). The results of the overall analysis confirm common assumptions and previous work on such fitness components, showing that defensive symbionts provide strong protection to hosts and cause detectable costs to both hosts and the enemy. The analysis provides insight into the extent of the cost/benefit tradeoff for hosts, reporting that the cost is six times lower than the protective effect. The confirmation that natural enemies attacking hosts infected with symbionts have a reduction in their fitness is also an interesting one, as this shows that the majority of defensive symbionts provide protection by resisting enemy infection, as opposed to tolerating it. This finding has important consequences for evolutionary counter-responses in the enemy species. Of course, this result has less relevance for certain types of enemies (such as parasitoids) where successful infection is dependent upon host killing.

      Interesting results also emerge from the subgroup analysis. For the full dataset, both natural and introduced symbionts were similarly effective in positively influencing the fitness of hosts. However, in the Wolbachia-specific analysis, the artificially introduced symbionts caused costs to the hosts where the natural strain did not. These findings have potentially important ramifications for schemes that use endosymbionts for biocontrol or vector competence, suggesting that (in some cases) natural strains may be the more stable choice for deploying (as they are associated with lower costs).

      The analysis draws from an impressively large dataset, but the interpretation of the full impact of the results would be helped by greater detail on the species/strain level systems included, the data extraction approach, and inclusion criteria. Accounting for phylogenetic nonindependence and alternative coding of one of the moderator variables could also strengthen the biological relevance of the models. Suggestions and thoughts are outlined below.

      We sincerely thank Reviewer #1 for the time and effort dedicated to reviewing our manuscript. The suggestions provided are highly constructive and will greatly assist us in improving both our analyses and the manuscript overall.

      Strengths & Potential Improvements:

      An impressively large number of effect sizes (3000) from only 226 studies is collected, robustly confirming common assumptions on the magnitude of fundamental fitness components. However the paper would benefit from a clear breakdown in the main text of the specificities of each system included (e.g. a table at the host species/symbiont strain level, where it is possible). Currently, there is not enough detail for those who want a deep dive to understand what data was extracted for the analysis from these 226 studies, or those who want to understand the underlying diversity in the dataset.

      We thank the reviewer for the suggestion, and we will add this information to our revised manuscript.

      Currently, when the 'natural enemy group' is tested as a moderator it is coded broadly by type of organism (e.g. virus, bacterium, fungi, parasitoid). But this doesn't adequately capture the mode of killing/fitness reduction by the enemy, which would be the much more biologically relevant categorisation for your questions. For example, parasitoid infection is dependent upon host death (thus host fecundity is not relevant, because the host either survived or did not). Among bacterial and viral pathogens there is scope for both fecundity and survival to be affected. This in turn may be a very influential factor for the outcome. You could consider recoding this enemy moderator.

      We agree, and we will implement this in the analysis in our revised manuscript.

      The analysis is restricted to arthropod hosts and defensive symbionts that are also classed as endosymbionts. This focus should be made clear early on in the paper, as there are many systems (that are classed by many as defensive symbioses) that are not part of the analysis.

      We agree, and we will implement this in our revised manuscript.

      There is fairly minimalistic testing of moderators/sub-groups (which probably has its statistical strengths) but perhaps there are also some missed opportunities for testing other ecological contributors to variance, including coinfection (although perhaps limited by power) and other approaches to coding enemy group (as detailed above).

      We agree, and we will implement this in the analysis in our revised manuscript.

      Looking at the overview of systems included, there's likely a high degree of phylogenetic non-independence in the dataset. Where it is possible, using phylogenetically controlled models could strengthen this analysis.

      We thank the reviewer for the suggestion. We will explore the possibility of using phylogenetically controlled models in our analyses, although we recognize the challenges associated with their implementation, particularly in the case of the natural enemies, given the great diversity of distantly related groups included in our study: viruses, bacteria, fungi, protozoans, nematodes and parasitoid wasps.

      Looking at your included systems (Table S5), you might be able to test the effect of coinfection on the 3 variables of interest. For example, it would be particularly important to see if the effects of two symbionts are additive or not.

      We agree, and we will implement this in the analysis in our revised manuscript.

      No code for the analysis is provided for review at this stage and full details of the dataset are also not available. This slightly limits the ability to assess the full scope and robustness of the study. It would be helpful to have an extensive table in the supplementary detailing (minimum) the reference, study, experiment, host species, symbiont strain, and a description of the exact data extraction source (e.g.table/figure/in text), and method of extraction.

      The code for the analysis and the full raw data with the suggested information are available at https://github.com/cassiasqr/MetaSymbiont (The link is available at the end of the manuscript).

      Reviewer #2 (Public review):

      Summary:

      In this exciting study, Cesar and co-authors perform a meta-analysis on the influence of arthropod symbionts on the fitness of their hosts when they are exposed or not to natural enemies. These so-called defensive symbionts are increasingly recognized as key elements in arthropod survival against natural enemies, with effects that ripple through entire terrestrial ecosystems. The topic is timely, the approach is sound, and the manuscript is well-written. I believe this manuscript will attract the attention of entomologists and of microbiologists interested in symbiosis. This study builds on a previous meta-analysis that I was involved in, which was based on phloem-feeding insects. This novel data set is much larger and includes flies (including the model system Drosophila) and mosquitoes (a group of high medical interest). While the previous meta-analysis considered only parasitoids as natural enemies, this study also includes fungi, bacteria, and viruses.

      Strengths:

      The authors compile a very large dataset and provide a broad quantitative overview of the effects of defensive symbionts in insects. By measuring symbiont effects in the presence and absence of natural enemies, the authors are able to infer whether a trade-off between defense and the costs of mutualism in the absence of enemy pressure exists. Defensive symbioses are an important research topic that had its initial "momentum" a decade ago, so the timing for such a systematic review is very appropriate.

      We sincerely thank Reviewer #2 for dedicating their time and effort to reviewing our manuscript. The suggestions are very insightful and will significantly contribute to improving our manuscript.

      Weaknesses:

      I think the manuscript could be improved by clarifying several sections, particularly the introduction and methods. The introduction section is too specific and heavily reliant on particular examples. In my view, the theoretical background of the study could be made clearer, and the knowledge gap identified more explicitly. A focus on how widespread defensive symbioses are, along with a brief, up-to-date review of the groups possessing such symbionts, would help. This lack of focus is also observed in the methods section, where more details are needed in many instances to better understand how data was collected and analyzed. Regarding the analyses, the multi-level analysis contains many moderators, but it's unclear why these moderators were included. While this may seem a minor issue, it highlights a disconnection between the analyses, the conceptual background, and the hypotheses tested. 

      We thank the reviewer for the suggestions, and we will try to make the introduction and the methods section clearer. 

      Another important weakness is that the analyses are too general, and much information is hidden and not immediately apparent. For instance, readers cannot easily identify which species of symbionts are studied (and the effects they have), or which natural enemies are involved. Although this information is found in the supplementary material, including it in the main body would significantly improve the manuscript.

      We agree, and we will implement this in our revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      (1) The technology requires a halo-tagged derivation of the active compound, and the linked position will have a huge impact on the potential "target hits" of the molecules. Given the fact that most of the active molecules lack structure-activity relationship information, it is very challenging to identify the optimal position of the halo tag linkage.

      We appreciate your insightful comment. While finding the optimal position to attach a chemical linker to a small molecule of interest is indeed a challenging but necessary step, this is a common difficulty across all target-ID methods, except for those that are modification-free, as we described in Discussion. However, modification-free approaches such as DARTS, CETSA, and TPP have their own limitations, such as low sensitivity and a high false-positive rate. Additionally, DARTS and SPROX are limited to use with cell lysates. Please refer to the introduction in our manuscript for more details on these approaches. On the other hand, synthesizing HTL derivatives is relatively straightforward compared to other modifications, and we provide helpful guidelines for chemical linker design, provided the optimal chemical moiety has been identified, which is crucial for target identification. We selected dasatinib and HCQ/CQ as model compounds because previous studies offered insights into their derivative synthesis. Our data also show that DH5 retains strong kinase inhibitory activity (Figure 4—figure supplement 2), and DC661-H1 demonstrates potent inhibition of autophagy (Figure 6—figure supplement 1). For novel compounds, conducting a thorough structure-activity relationship (SAR) study is essential to determine the optimal position for HTL derivative synthesis.

      (2) Although POST-IT works in zebrafish embryos, there is still a long way to go for the broad application of the technology in other animal models.

      Thank you for your constructive comment. Yes, there is still a long way to go in developing the POST-IT system for broader applications in other animal models, especially in mice. However, we hope that our study provides valuable insights and inspiration to scientists and experts for applying the POST-IT system in various models. We are also committed to further improving its applicability.

      (3) The authors identified SEPHS2 as a new potential target of dasatinib and further validated the direct binding of dasatinib with this protein. However, considering the super strong activity of dasatinib against c-Src (sub nanomolar IC50 value), it is hard to conclude the contribution of SEPHS2 binding (micromolar potency) to its antitumor activity.

      Thank you for your insightful comment. We agree that the anticancer activity of dasatinib primarily results from inhibiting tyrosine kinases such as SRC and ABL. However, SEPHS2 contains an “opal” termination codon, UGA, at the 60th amino acid residue, which codes for selenocysteine. Due to the technical challenge of expressing selenoproteins in E. coli, we mutated it to cysteine for expression in E. coli to avoid premature translation termination, as described in the Materials and Methods section. Although the purified recombinant SEPHS2 shows a Kd of about 10 µM for dasatinib, the binding affinity to endogenous SEPHS2 may be higher since selenocysteine is larger and more electronegative than cysteine. This presents an interesting area for future investigation. Furthermore, our study of dasatinib’s binding to SEPHS2 could help facilitate the development of new SEPHS2 inhibitors, potentially targeting the active site of SEPHS2.

      Reviewer #3 (Public review):

      (1) Target Specificity: It is crucial for the authors to differentiate between the primary targets of the POST-IT system and those identified as side effects. This distinction is essential for assessing the specificity and utility of the technology.

      Thank you for your insightful comment. Drugs inevitably bind to various proteins with differing affinities, which can contribute to both side effects and beneficial outcomes. Typically, the primary targets exhibit high affinities. In this manuscript, we ranked the identified protein targets of DH5 based on affinity from mass spectrometry and p-values (Fig. 5A), and for DC661-H1, we used the SILAC ratio (Fig. 6A). We also individually assessed many drug-protein binding affinities using the MST assay, as well as in vitro and in cellulo assays, demonstrating their specificity. Moreover, we believe it is essential to identify as many protein targets as possible at physiological drug concentrations to better understand the drug’s side effects. Of course, further investigation is required to assess the roles and effects of these target proteins.

      (2) In Vivo Target Identification: The manuscript lacks detailed clarity on which specific targets were successfully identified in the in vivo experiments. Expanding on this information would provide a clearer view of the system's effectiveness and scope in complex biological settings.

      Thank you for your insightful comment regarding in vivo target identification. In this manuscript, we utilized a cell line as the primary method for in vivo target identification and validation after optimizing our system in test tubes. We successfully validated many of the targets identified using our POST-IT system (Figure 6—figure supplement 3). To demonstrate the proof of principle for in vivo application, we employed zebrafish embryos as an in vivo model, showing that endogenous SRC can be effectively pulled down by DH5 treatment (Fig. 7). While we could have explored the entire proteome to identify endogenous target proteins in zebrafish that bind to DH5 or dasatinib, we felt this would extend beyond our original scope, given that we have already demonstrated POST-IT’s ability to identify target proteins for dasatinib. Specific target identification and validation are crucial when using zebrafish for drug discovery. Additionally, we acknowledge that drugs likely interact with a range of protein targets in living organisms and may undergo metabolism and interactions within the circulatory system, which we address in our discussion.

      (3) Reproducibility and Scalability: Discussion on the reproducibility of the POST-IT system across various experimental setups and biological models, as well as its scalability for larger-scale drug discovery programs, would be beneficial.

      Thank you for the suggestion. While our system has shown high reproducibility in our experiments, further improving both reproducibility and scalability would be advantageous. One potential approach to address this is through the generation of stably expressing cell lines and transgenic zebrafish lines, which we have discussed in the revised manuscript. Establishing stable cell lines with robust POST-IT expression could enhance scalability for drug discovery applications.

      (4) Quantitative Analysis: A more detailed quantitative analysis of the protein interactions identified by POST-IT, including statistical significance and comparative data against other technologies, would enhance the manuscript.

      Thank you for your suggestion. In our assessment of drug-protein affinity, we included Kd values as quantitative measures using MST assays. The protein targets of dasatinib identified through mass spectrometry are also accompanied by p-values for quantitative analysis (Fig. 5A), and the detailed procedures are described in the Material and methods section. While it is challenging to provide direct comparative data against other technologies, our system successfully identified many known target proteins for dasatinib, as well as SEPHS2 and VPS37C as new targets for dasatinib and for HCQ/CQ, respectively, which were not detected by other methods.

      (5) Technological Limitations: The authors should discuss any limitations or potential pitfalls of the POST-IT system, which would be crucial for future users and for guiding subsequent improvements.

      Thank you for your insightful suggestion. We agree that clearly defining the technological limitations is important. Therefore, we have expanded our original discussion on the limitations of our POST-IT system (Discussion section, paragraph 6).

      (6) Long-Term Stability and Activity: Information on the long-term stability and activity of the POST-IT components in different biological environments would ensure the reliability of the system in prolonged experiments.

      Yes, this is an important question. We did not notice any stability or toxicity issues with Halo-PafA and Pup substrates in HEK293T cells or zebrafish, which is an important factor for stable cell lines and transgenic zebrafish lines. However, HTL derivatives of the drug could be toxic or unstable due to the nature of the drug or its metabolism, which needs to be taken into account when designing experiments, and we have included this in the Discussion.

      (7) Comparison with Existing Technologies: A detailed comparison with existing proximity tagging and target identification technologies would help position POST-IT within the current landscape, highlighting its unique advantages and potential drawbacks.

      We appreciate your valuable feedback and agree that such comparisons are crucial. We have included a detailed overview and comparison of existing proximity-tagging systems and their related target identification technologies in the Introduction (lines 78-100) and Discussion (lines 391-412), highlighting their respective pros and cons. Additionally, we have expanded the discussion to further compare these technologies with our POST-IT system, addressing its advantages and limitations (lines 378-390, lines 448-467). We hope this provides sufficient context and information to effectively position POST-IT among the landscape of proximity-tagging target identification technologies.

      (8) Concerns Regarding Overexposed Bands: Several figures in the manuscript, specifically Figure 3A, 3B, 3C, 3F, 3G, Figure 4D, and the second panels in Figure 7C as well as some figures in the supplementary file, exhibit overexposed bands.

      We appreciate your astute observation regarding the overexposed bands and apologize for any confusion. The “overexposed” bands represent the unpupylated proteins, while the bands above them correspond to the pupylated proteins. We intended to clearly show both pupylated and unpupylated bands, although the latter are generally much weaker. We are currently working on further improving our POST-IT system to enhance pupylation efficiency.

      (9) Innovation Concern: There is a previous paper describing a similar approach: Liu Q, Zheng J, Sun W, Huo Y, Zhang L, Hao P, Wang H, Zhuang M. A proximity-tagging system to identify membrane protein-protein interactions. Nat Methods. 2018 Sep;15(9):715-722. doi: 10.1038/s41592-018-0100-5. Epub 2018 Aug 13. PMID: 30104635. It is crucial to explicitly address the novel aspects of POST-IT in contrast to this earlier work.

      Thank you for bringing this to our attention. Proximity-tagging systems like BioID, TurboID, NEDDylator, and PafA (Liu Q et al., Nat Methods 2018) were initially developed to study protein-protein interactions or identify protein interactomes, as these applications are of broader interest and generally easier to implement. However, applying proximity-tagging systems for small molecule target identification requires significant optimization. As described in the introduction (lines 78-100), target protein identification systems have since been developed using TurboID and NEDDylator (Tao AJ et al., Nat Commun 2023; Hill ZB et al., J Am Chem Soc 2016). It is conceivable that a PafA-based proximity-tagging system could also be adapted for target-ID, and other groups may pursue this approach in the future. Although the PafA-Pup system shows great promise for target-ID applications, extensive optimization was needed to enable its use for this purpose. Finally, we demonstrate that POST-IT offers distinct advantages over other proximity-tagging-based target-ID systems. For more details, please refer to the introduction and discussion sections.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Figure 1- Figure Supplement 1A: The Pup substrate "HB-Pup" is mentioned, but the main text or figure legend provides no introduction or description.

      We appreciate your astute observation. We have added a description in the main text and figure legend as follows: “…and used HB-Pup as a control, which contains 6×His and BCCP at the N terminus of Pup” in the main text (line 142) and “HB, TS, and SBP refer to 6×His and BCCP, twin-STII (Strep-tag II), and streptavidin binding peptide, respectively.” in the legend of Figure 1-figure supplement 1A.

      (2) Figure 1 - Figure Supplement 3B: The authors used TS-sPupK61R as a substrate but did not explain why. The main text mentions that mutating sPup alone did not affect polypupylation, raising the question of why TS-sPupK61R was used in this figure. Furthermore, while the authors state that polypupylation becomes evident after 1 hour of incubation (more pronounced after 2 or 3 hours), the reactions here were conducted for only 30 minutes.

      Thank you for your question. Figure 1 - Figure Supplement 3B was conducted to test self-pupylation levels in the different Halo-PafA derivatives. For this purpose, we could use any Pup substrate such as SBP-sPup and SBPK4R-sPupK61R, instead of TS-sPup and TS-sPupK61R, as they do not show any differences in pupylation activity. We chose TS-sPup and TS-sPupK61R simply because any Pup substrates could be used for this purpose. Similarly, we did not need to incubate the reaction for a longer time to detect polypupylation, as our intention was to test “self-pupylation”. We demonstrated in Figure 1-figure supplement 2 that polypupylation is dependent on the number or position of lysine residues in Pup substrate or tags. The results clearly showed that self-pupylation was almost completely abolished by the Halo8KR mutation. To clarify this, we added the following description in lines 168-169: “TS-sPup and TS-sPupK61R were chosen as sPup substrates for this experiment, although any Pup substrates could have been used. The levels of self-pupylation were assessed.”

      (3) Line 156: The statement that "the TS-tag completely abolished polypupylation in TS-sPup" is inaccurate. Using TSK8R-sPupK61R as the substrate, several bands appear, which likely represent Halo-PafA with varying degrees of polypupylation. Some bands also appear to correspond to those seen when using TS-sPup as a substrate. The authors should clarify how they distinguish between multipupylation and polypupylation in this case.

      We sincerely appreciate your insight into clarifying the distinction between multipupylation and polypupylation. Polypupylation refers to the addition of a new Pup onto a previously linked Pup on the target protein, akin to polyubiquitination. In contrast, multipupylation involves multiple single pupylations at different positions on the target proteins. Since pupylation occurs exclusively at lysine residues in tag-Pup substrates, mutating all lysine residues to arginine, as in TSK8R-sPupK61R, prevents the mutant tag-Pup from linking to another Pup. This means that only single pupylation can proceed with this type of mutant Pup substrate. If multiple pupylated bands are observed with this mutant substrate, it indicates “multipupylation” rather than “polypupylation”, as shown in Figure 1-figure supplement 2D. The same applies to the pupylation bands in Figure 1-figure supplement 2E and F, as sSBP-sPupK61R and SBPK4R-sPupK61R lack lysine residues. By comparing these multipupylation bands, it is also possible to distinguish them from polypupylation bands, which are marked by yellow arrows. However, after 2-3 pupylation bands, higher-order bands become increasingly difficult to distinguish.

      To clarify the mutation in the TS-tag, we revised the sentence in line 156 from “However, further mutations within the TS-tag completely abolished polypupylation in TS-sPup” to “However, further mutations of two lysine residues within the TS-tag, creating TSK8R-sPupK61R, completely abolished polypupylation in TS-sPup”. Additionally, we have inserted sentences in line 152 to define polypupylation and multipupylation, as described here.

      (4) Line 160: Similar to the above concern about line 156, the claim that SBPK4R and sSBP completely prevented polypupylation is unconvincing and requires more supporting evidence.

      Thank you for raising this concern. As mentioned above, both SBPK4R and sSBP lack lysine residues required for pupylation. As a result, these mutants can only undergo multiple single pupylations on the lysine residues of the target protein, which leads to “multipupylation”. In Figure 1-figure supplement 2E and F, pupylation bands by sSBP-sPupK61R or SBPK4R-sPupK61R do not display doublet bands (one from multipupylation and the other from polypupylation), as seen with SBP-sPup, marked by yellow arrows. Notably, Halo-PafA containing polypupylated branches migrates more slowly than one with an equal number of multipupylation events. To clarify this point, we have added the phrase “as shown in sSBP-sPupK61R and SBPK4R-sPupK61R” at the end of the sentence in line 160.

      (5) Lines 176-177: The authors claim that PafAS126A exhibited reduced polypupylation compared to PafA, but given that PafAS126A may reduce depupylase activity, how could it reduce polypupylation levels? Moreover, it is hard to find any data supporting this conclusion in Figure 1 - Figure Supplement 3B.

      We appreciate your insightful comment. At this point, we do not fully understand how the mutation that reduces depupylase activity also decreases polypupylation. It is possible that PafAS126A has a lower preference for pupylated Pup as a prey, which is required for polypupylation, since depupylase activity depends on recognizing pupylated Pup as a prey to remove it. Nonetheless, Halo-PafAS126A shows reduced levels of higher molecular weight bands compared to Halo-PafA, as shown in Figure 1-figure supplement 3B, while exhibiting increased pupylation in lower molecular weight bands, which represent either multipupylation or low-degree polypupylation. Since higher molecular weight bands (> 150 kD) are likely due to polypupylation, this result suggests reduced polypupylation and increased multipupylation in Halo-PafAS126A. To clarify this in the main text, we have added the following description in line 177: “as evidenced by the decreased levels of high molecular weight bands and an increase in low molecular weight bands”

      (6) POST-IT system in cellulo validation: The system was developed using the Halo-tag, yet the in-cell validation uses FRB and FKBP instead, without explaining this switch. This inconsistency makes the logic of the experiment unclear.

      We appreciate your insightful comment. The interaction between rapamycin and FRB or FKBP is known to be highly specific and robust, making this system useful in various biological contexts. Due to this property, rapamycin can induce interaction between two proteins when one is fused with FRB and the other with FKBP. Before testing or optimizing the POST-IT system in cells, we hypothesized that using the rapamycin-induced interaction between FRB and FKBP could introduce pupylation of the target protein, provided that PafA is fused with FRB or FKBP and the target protein is fused with the other. The results demonstrate that PafA can introduce pupylation of the target protein in a proximity-dependent manner via this chemically induced interaction. To further clarify this in the main text, we modified the original sentence in lines 214-216 as follows: “To mimic drug-target interaction-induced pupylation in live cells and assess the potential of PafA as a proximity-tagging system for target-ID, we incorporated the rapamycin-induced interaction between FRB and FKBP into our PL system, as this interaction between a small molecule and a protein is known to be highly specific and robust (Figure 3—figure supplement 1A).”

      (7) Line 209: The authors decided to use the SBP-tag for further studies due to better performance, but in Figure 3 - Figure supplement 1, they still used the unintroduced HB-Pup as the substrate, which is confusing and lacks explanation.

      Thank you for raising your question. The SBP-tag is not superior to the TS-tag in terms of pupylation activity. However, the TSK8R mutant cannot bind to Strep-Tactin beads, while the SBP mutants, SBPK4R and sSBP, can bind to streptavidin. Therefore, we chose the SBP-tag instead of the TS-tag for further studies as a Pup substrate in POST-IT system, as we needed to pull down the target proteins. HB-Pup is consistently used as a control throughout various experiments, as it is the original Pup substrate. In Figure 3-figure supplement 1B and C, HB-Pup was used to test chemically induced pupylation by PafA. In these cases, it was not so critical which Pup substrate was chosen. Furthermore, we compared HB-Pup and different SBP-sPup substrates in Figure 3-figure supplement 1D, where HB-Pup was used as a control or for comparison. Although pupylation bands with HB-Pup appear more robust, this substrate contains multiple lysine residues, leading to high levels of polypupylation. To make it clear, we modified the sentence in line 209 to “Therefore, we decided to use the SBP-tag as a Pup substrate in the POST-IT system for further studies.”.

      (8) Line 220: Both SBP-sPup and SBPK4R-sPupK61R are described as exhibiting efficient pupylation, but the data show mostly self-pupylation and little to no pupylation of the target protein.

      Thank you for your concern. However, pupylation of the target protein is actually quite substantial, as the intensities of the free form and pupylated proteins are relatively similar, as shown in the upper panel of Figure 3-figure supplement 1D. Self-pupylation is always much higher than target pupylation, because PafA constantly pupylates itself, whereas pupylation of the target protein occurs only through interaction. Furthermore, V5-FRB-mKate2-PafA contains many lysine residues, which increases the levels of self-pupylation.

      (9) Lines 222-224: The authors chose SBPK4R-sPupK61R to avoid polypupylation, although SBP-sPup did not cause detectable polypupylation. Neither substrate caused pupylation of the target protein, so the rationale behind this choice is unclear.

      Thank you for raising your question. Similar to the above comment (#8), please refer to the pupylation bands of the target protein, as shown in the upper panel of Figure 3-figure supplement 1D. The pupylation band of the target protein is quite remarkable, as the intensities of the free form and pupylated proteins are comparable. Additionally, there are no multiple pupylation bands in either case, except for one additional weak multipupylation band, indicating no polypupylation by SBP-sPup, which does not have K-to-R mutations. Of course, SBPK4R-sPupK61R can only undergo single pupylation, as it does not contain lysine residues. Although we did not observe polypupylation by SBP-sPup in this experimental condition, it is possible that SBP-sPup may cause polypupylation under different experimental conditions or with other target proteins. Since SBPK4R-sPupK61R exhibits pupylation of the target protein comparable to that of SBP-sPup, at least in this experimental setting, we selected SBPK4R-sPupK61R as the Pup substrate for the POST-IT system to avoid any potential polypupylation that could be caused by SBP-sPup in other cases. We believe that polypupylation can introduce bias into the analysis and hinder the comprehensive discovery of additional target proteins for small molecules.

      (10) Line 224: The authors conclude that rapamycin greatly reduced self-pupylation, but the supporting data are unclear.

      Thank you for your constructive comments on our manuscript. Please refer to the lower panel of Figure 3-figure supplement 1D. When using either SBPK4R-sPupK61R or SBP-sPup, rapamycin treatment results in reduced levels of self-pupylation compared to the no-treatment control. However, we did not observe this reduction with HB-Pup and do not know the reason. To clarify this in the main text, we added the following description to the end of the sentence: “when using either SBPK4R-sPupK61R or SBP-sPup, as shown in the lower panel of Figure 3—figure supplement 1D”

      (11) Line 234: The authors selected an 18-amino acid linker, but given that linkers longer than 10 amino acids enhance labeling, this choice should be explained.

      Thank you for raising your question. In fact, a linker of 10 amino acids (aa) or longer is likely to behave similarly. We chose an 18 aa linker instead of a 40 aa linker primarily for the convenience of cloning and to reduce the potential for DNA sequence recombination associated with longer repeats. Additionally, a longer, flexible linker may behave like an intrinsically disordered protein (Harmon et al., 2017), which can lead to unwanted protein-protein interactions or phase separation. To elaborate on this, we added the following sentences after the sentence in lines 233-235: “We chose the 18-amino acid linker instead of the 40-amino acid linker for easier cloning and to lower the risk of DNA recombination from longer repeats. Additionally, a longer, flexible linker may behave like an intrinsically disordered protein (Harmon et al., 2017), an unwanted feature for target-ID.”

      (12) S126A and K172R mutations: The authors claim that these mutations additively enhanced pupylation under cellular conditions, but in Figure 3B, the band intensities appear similar for the wild-type and mutant versions.

      Thank you for raising your concern. Although a single pupylation band appears similar among the three different Halo-PafA proteins, multipupylation bands are slightly but noticeably increased by the S126A and K172R mutations compared to Halo8KR-PafA. Since we used SBPK4R-sPupK61R as a Pup substrate, all higher molecular weight bands result from multipupylation rather than polypupylation. This illustrates why it is preferable to use SBPK4R-sPupK61R over SBP-sPup, as the pupylation bands with SBP-sPup are mixtures of poly- and multipupylation, making it difficult to assess levels of target labeling. To clarify this in the main text, we added the following description after the sentence in line 236: “as the higher molecular weight multipupylation bands are slightly but noticeably increased with these mutations compared to Halo8KR-PafA”

      (13) Line 263: The authors selected DH5 for further experiments due to its efficiency, but the data suggest that the performance of DH1 to DH5 is similar.

      We appreciate your question about the different dasatinib HTL derivatives. However, our data clearly show that DH2-5 derivatives bind significantly more effectively to Halo-PafA in vitro and in live cells compared to DH1 (Figure 4A and B). Additionally, the DH2-5 derivatives result in dramatically increased pupylation of the target protein in vitro and noticeable enhancement in live cells (Figure 4C and D). Among DH2 to DH5, there is no obvious difference in binding to Halo-PafA or pupylation of the target protein. Therefore, we chose DH5, as we believe that the longer linker in DH5 may facilitate the binding of a more diverse range of target proteins to dasatinib, enabling the discovery of additional target proteins.

      (14) Line 309: The authors introduce HCQ and CQ as important drugs but then investigate the mechanism using DC661 without introducing or justifying the choice of this compound.

      Thank you for your point. We explained our reason for choosing DC661, a dimeric form of CQ, instead of CQ for the synthesis of an HTL derivative in line 310: “assuming that a dimer would enhance binding affinity as previously described.” Dimeric forms of drugs or small molecules, such as testosterone dimers, estrogen dimers, and numerous anticancer drug dimers, have often been developed to enhance drug effects (Paquin A et al., Molecules 2021). Similarly, dimeric forms of HCQ/CQ have been introduced and shown to be more potent (Hrycyna CA et al., ACS Chem Biol 2014; Rebecca VW et al., Cancer Discovery 2019). We expected that using a dimeric form might offer a higher probability of identifying target proteins for HCQ/CQ.

      (15) The authors suggest that multipupylation levels were enhanced but do not explain whether this might benefit the system or introduce other issues. Clarifying this point would provide valuable insight for potential users of this system.

      Thank you for your thoughtful suggestion. Polypupylation likely leads to biased enrichment of a limited set of target proteins, and its levels may not correlate with the binding affinity of target proteins to the small molecule of interest, features that can negatively impact target-ID. In contrast, multipupylation may be correlated with binding affinity or interaction frequency, as we observed increased levels of multipupylation with higher Pup concentrations and longer incubation times. This suggests that target proteins with multiple lysines in proximity to PafA can be sequentially pupylated, starting with the most accessible lysine. However, if a target protein has only one accessible lysine, pupylation will occur only once, regardless of the protein’s affinity to the small molecule. In summary, while polypupylation may be a drawback for target-ID, multipupylation could be useful for both target-ID and understanding binding mode. To elaborate on this, we added the following additional explanation after the sentence in line 152: “, whereas multipupylation is more likely correlated with binding affinity or interaction frequency.”

      (16) The author should address whether the Halotag ligand modification of the drug alters the binding properties between the drug and targets. That may be causing artifact binding of the drug and other proteins.

      Thank you for your insightful comment. Yes, it is true that chemical modifications of the small molecule of interest, such as linker derivatization (e.g., HTL) or photo-affinity labeling, generally lead to reduced activity or affinity compared to the original molecule. Synthesizing a derivative is a common challenge across all target-ID methods, except for modification-free approaches, as we mentioned in the Discussion. However, modification-free methods like DARTS, CETSA, and TPP have their own limitations, including low sensitivity or high false positive rates. Identifying the optimal position for chemical modification on the small molecule of interest is critical. We chose dasatinib and HCQ/CQ as model compounds, because previous studies provided insights into their derivative synthesis. In addition, our data show that DH5 retains robust kinase inhibitory activity (Figure 4-figure supplement 2), and DC661-H1 exhibits potent autophagy inhibition (Figure 6-figure supplement 1). For novel compounds, a thorough structure-activity relationship study is essential to identify the optimal position for HTL derivative synthesis.

      (17) The author stated there is no observable toxicity in zebrafish without providing a detailed analysis or enough data. Further analysis of the expression of Halo-PafA and its substrate sPup influence on toxicity or side effects to the living cells or animals would be needed. It is important for in vivo applications.

      Thank you for your constructive suggestion. We have now included additional experimental data in Figure 7-figure supplement 1, showing no toxicity in zebrafish embryos expressing the POST-IT system. We assessed toxicity in two ways: by injecting the POST-IT DNA plasmid into one-cell-stage embryos for acute expression, and by using embryos from transgenic zebrafish expressing POST-IT under a heat-shock inducible promoter. Neither the injection nor the heat-shock activation of POST-IT expression resulted in any noticeable toxicity.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      This important work presents two studies on predictive processes in subjects with and without tinnitus. The evidence supporting the authors' claims is compelling, as their second study serves as an independent replication of the first. Rigorous matching between study groups was performed, especially in the second study, increasing the probability that the identified differences in predictive processing can truly be attributed to the presence of tinnitus. This work will be of interest to researchers, especially neuroscientists, in the tinnitus field.

      We thank the editors at eLife very much for their favorable assessment of our manuscript. Based upon the comments of the reviewer, we aimed to further improve our manuscript to be a valuable addition to the tinnitus research field.

      Public Reviews:

      Reviewer #2 (Public review):

      Summary:

      This study aimed to test experimentally a theoretical framework that aims to explain the perception of tinnitus, i.e., the perception of a phantom sound in the absence of external stimuli, through differences in auditory predictive coding patterns. To this aim, the researchers compared the neural activity preceding and following the perception of a sound using MEG in two different studies. The sounds could be highly predictable or random, depending on the experimental condition. They revealed that individuals with tinnitus and controls had different anticipatory predictions. This finding is a major step in characterizing the top-down mechanisms underlying sound perception in individuals with tinnitus.

      Strengths:

      This article uses an elegant, well-constructed paradigm to assess the neural dynamics underlying auditory prediction. The findings presented in the first experiment were partially replicated in the second experiment, which included 80 participants. This large number of participants for an MEG study ensures very good statistical power and a strong level of evidence. The authors used advanced analysis techniques - Multivariate Pattern Analysis (MVPA) and classifier weights projection - to determine the neural patterns underlying the anticipation and perception of a sound for individuals with or without tinnitus. The authors evidenced different auditory prediction patterns associated with tinnitus. Overall, the conclusions of this paper are well supported, and the limitations of the study are clearly addressed and discussed.

      Weaknesses:

      Even though the authors took care of matching the participants in age and sex, the control could be more precise. Tinnitus is associated with various comorbidities, such as hearing loss, anxiety, depression, or sleep disorders. The authors assessed individuals' hearing thresholds with a pure tone audiogram, but they did not take into account the high frequencies (6 kHz to 16 kHz) in the patient/control matching. Moreover, other hearing dysfunctions, such as speech-in-noise deficits or hyperacusis, could have been taken into account to reinforce their claim that the observed predictive pattern was not linked to hearing deficits. Mental health and sleep disorders could also have been considered more precisely, as they were accounted for only indirectly with the score of the 10-item mini-TQ questionnaire evaluating tinnitus distress. Lastly, testing the links between the individuals' scores in auditory prediction and tinnitus characteristics, such as pitch, loudness, duration, and occurrence (how often it is perceived during the day), would have been highly informative.

      Thank you very much for your careful evaluation of our manuscript. We agree with you that our study design has some limitations such as the assessment of higher frequencies, comorbidities, and tinnitus characteristics. In our discussion, we aimed to acknowledge these issues for future research to improve this study design and gain more insights into neural tinnitus processes.

      See e.g.:

      Line 946-949:

      “Additionally, we rigorously controlled for hearing loss in Study 2, however, pure-tone audiometric testing was solely performed up to 8kHz and we were therefore not able to draw conclusions regarding hearing impairments in higher frequencies and their influence on the effects.”

      Line 949-954:

      “Moreover, we did not screen our participants for hyperacusis. This hypersensitivity to mild sounds is widely correlated with the sensation of tinnitus and underlying neural mechanisms are potentially intertwined with tinnitus processes (Schilling et al., 2023; Yukhnovich et al., 2023; Zheng, 2020). Screening for hyperacusis in future work can therefore reveal more details on participant characteristics influencing predictive processing.”

      Line 955-958:

      “In both studies, tinnitus distress was not correlated with the reported prediction effects. Nevertheless, tinnitus can also be characterized by other features such as its loudness, pitch or duration which were not included in the experimental assessment.”

      Line 958-963:

      “Additionally, we solely used a short version of the Mini-TQ (Goebel and Hiller, 1992) in Study 2, which did not allow us to relate prediction scores to subscales like sleep disturbances which potentially influence cognitive functioning and thus predictive processing. Next to sleeping disorders and distress, tinnitus is often also accompanied by psychological comorbidities such as depression or anxiety (Langguth, 2011) which are potential confounds of the results.”

      Comments on revisions:

      Thank you for your responses. There are a few remaining points that, if addressed, could further enhance the manuscript:

      - While the manuscript acknowledges the limitation of not matching groups on hearing thresholds in Study 1, a deeper analysis of participants' hearing abilities and their impact on MEG results, similar to that conducted in Study 2, would be valuable. Specifically, including a linear model that considers all frequencies, group membership, and their interactions could highlight differences across groups. Additionally, examining the effect of high-frequency hearing loss on prediction scores, as performed in Study 2, would strengthen the analysis, particularly given the trend noted (line 719). Such an addition could make a significant contribution to the literature by exploring how hearing abilities may influence prediction patterns.

      We appreciate your feedback and agree with you that it is a crucial question how hearing abilities influence prediction patterns in tinnitus. However, as hearing status was not assessed in the control group in study 1, we are unfortunately not able to include linear models to investigate differences across groups in this sample. This led us to the implementation of study 2 with a comprehensive hearing assessment to investigate group differences. We highlighted this issue in our methods section.

      Line 170-172:

      “As pure-tone audiometric testing was not included for the control subjects, group comparisons between hearing thresholds were not feasible.”

      - The connection with the hippocampal regions (line 864) remains somewhat unclear. While the inclusion of the Paquette reference appropriately links temporal region activity with tinnitus, it does not fully support the statement: "An increased focus on hippocampal regions, e.g., in fMRI, patient, or animal studies, could be a worthwhile complement to our MEG work, given the outstanding relevance of medial temporal areas in the formation of associations in statistical learning paradigms"

      Thank you for your constructive input. This section is purely speculative, and we do not aim to provide strong claims or expected results but solely point out potential future research directions.

      - Authors should add a comparison of participants' mini-TQ scores across both studies.

      We appreciate your input and have added a comparison of mini-TQ scores between the samples. In study 1, all subscales were assessed; however, we computed the comparison solely on the items of the mini-TQ to increase comparability. The results were not significant, i.e., tinnitus distress values did not differ between studies.

      Line 629-632:

      “We additionally compared tinnitus distress values assessed by the mini-TQ (Goebel and Hiller, 1992) between study 1 and study 2 to detect potential differences between the samples, however, results of the Welch’s t-test were not significant with t(30.7) = 1.27, p = .214.”
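      The fractional degrees of freedom in t(30.7) are characteristic of Welch's test, which does not pool the two variances. A minimal sketch of how the statistic and the Welch-Satterthwaite degrees of freedom are obtained is given below; the function name `welch_t` and the score lists are hypothetical illustrations, not data or code from either study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical distress scores, NOT data from study 1 or study 2
study1 = [4, 7, 9, 12, 6, 10, 8, 5]
study2 = [3, 6, 5, 9, 4, 7, 11, 2, 6, 5]

t, df = welch_t(study1, study2)
print(f"t({df:.1f}) = {t:.2f}")
```

      The p-value would then be read from a t distribution with `df` degrees of freedom, for example with a statistics package.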

      - Authors should add significance levels to Fig. 6B, as in Fig. 3C, and an n.s. label to Fig. 6D.

      Thank you very much for your input; we added significance levels and an n.s. label to Figures 6B and 6D.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This article identifies ADGRA3 as a candidate GPCR for mediating beige fat development. The authors use human expression data from the Human Protein Atlas and Gtex databases and combine this with experiments performed in mice and a murine cell line. They refer to a GPCR bioactivity screening tool PRESTO-Salsa, with which it was found that Hesperetin activates ADGRA3. From their experiments, the authors conclude that Hesperetin activates ADGRA3, inducing a Gs-PKA-CREB axis resulting in adipose thermogenesis.

      Strengths:

      The authors analyze human data from public databases and perform functional studies in mouse models. They identify a new GPCR with a role in thermogenic activation of adipocytes.

      Considerations:

      Selection of ADGRA3 as a candidate GPCR relevant for mediating beiging in humans:

      The authors identify GPCRs that are expressed more highly in murine iBAT compared to iWAT in response to cold and assess which of these GPCRs are expressed in human subcutaneous or visceral adipocytes. Although this strategy will identify GPCRs that are expressed at higher levels in brown fat compared to beige and thus possibly more active in thermogenic function, the relevance in choosing GPCRs that also are expressed in unstimulated human white adipocytes should be considered. Thermogenic activity is not normally present in human white adipocytes. It would have strengthened the GPCR selection if the authors instead had assessed the intersection with human brown adipocytes that were activated with norepinephrine.

      We appreciate your constructive feedback and believe that by adopting this refined strategy, we will strengthen our selection of GPCRs related to adipose thermogenesis in other ongoing studies. We look forward to continuing our research in this area and contributing to the understanding of adipose thermogenesis and its potential therapeutic applications. Thank you once again for your valuable input. 

      Strategy to investigate the role of ADGRA3 in WAT beiging:

      Having identified ADGRA3 as their candidate receptor, the authors investigated the receptor in mouse models, the murine inguinal adipocyte cell line 3T3 and in human subcutaneous adipose progenitors (HAdsc) differentiated in vitro. Calling the human cells "beige" is a stretch as these cells are derived from a white adipose depot. The authors do observe regulation in UCP1 and abundance of mitochondria following modification of ADGRA3 in the cells. However, in future studies, it should be considered if the receptor rather plays a role in differentiation per se, and perhaps not specifically in thermogenic differentiation/activity.

      Regarding the reviewer's suggestion to consider whether ADGRA3 plays a role in differentiation per se, rather than specifically in thermogenic differentiation/activity, we acknowledge that this is an important consideration. Our current studies have focused on the role of ADGRA3 in regulating UCP1 expression and mitochondrial abundance, which are hallmarks of adipose thermogenic activity. However, we recognize that ADGRA3 may also have broader roles in adipocyte differentiation and function that are not limited to thermogenesis.

      To address this point, in future studies, we plan to conduct additional experiments to investigate the potential role of ADGRA3 in adipocyte differentiation, including its effects on the expression of markers of adipocyte differentiation and its impact on adipocyte metabolism and function. These studies will provide further insights into the mechanisms by which ADGRA3 regulates adipocyte biology.

      According to the Human Protein Atlas and Gtex databases, ADGRA3 is not only expressed in adipocytes, but also in other tissues and cell types. The authors address this by measuring the expression in a panel of these tissues, demonstrating a knockdown not only in the adipose tissue, but also in the liver and less pronounced in the muscle (Figure S2). It should thus be emphasized that the decreased TG levels in serum and liver in the mice might in fact depend on Adgra3 overexpression in the liver. Even though this might not have been the purpose of the experiment, it is important to highlight this as it could serve as hypothesis building for future studies of the function of this receptor.

      Thank you for your thoughtful comments and feedback. We appreciate the insight provided by the Human Protein Atlas and Gtex databases regarding the tissue distribution of ADGRA3. We fully acknowledge that the decreased TG levels observed in both the serum and liver of the mice might be linked to the overexpression of Adgra3 in the liver.

      Although this was not the primary objective of our experiment, we agree that this observation is worth highlighting as it could serve as a basis for future hypothesis-driven research on the functional role of ADGRA3 in different tissues. In light of your comments, we emphasized this potential link between Adgra3 overexpression in the liver and reduced TG levels in discussion, as follows.

      “…the precise mechanisms underlying the influence of on adipose thermogenesis. Furthermore, it is crucial to highlight that the observed decrease in TG levels in both serum and liver (Figure 4-figure supplement 2C-D) might be attributed to the significant increase in Adgra3 expression in the liver, which is a consequence of the nanoparticle-mediated overexpression of Adgra3. While the exact mechanism remains to be fully elucidated, this correlation suggests a potential link between Adgra3 overexpression in the liver and reduced TG levels in the serum. We will employ more sophisticated models in subsequent studies to further…”

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Zhao et al. explored the function of adhesion G protein-coupled receptor A3 (ADGRA3) in thermogenic fat biology.

      Strengths:

      Through both in vivo and in vitro studies, the authors found that gain of function of ADGRA3 leads to browning of white fat and ameliorates insulin resistance.

      Weaknesses:

      There are several methodological weaknesses, such as the use of 3T3-L1 adipocytes and intraperitoneal (i.p.) injection of virus. Moreover, as the authors stated that ADGRA3 is constitutively active, how could the authors then identify a chemical ligand?

      Comments on revised version:

      The revised manuscript by Zhao et al. has limited improvement. The authors refused to perform revised experiments using primary cultures even though two reviewers pointed out the same weakness (3T3-L1 adipocytes are unsuitable). Using infrared thermography to measure body temperature is also problematic.

      Thanks for your comments. We regret that human adipocytes induced from human adipose-derived stem cells (hADSCs) were not recognized as primary cultures by multiple reviewers. Therefore, we have included relevant experimental results of mouse primary adipocytes induced from stromal vascular fraction (SVF) in Figure 8E-H as a supplement. The thermal imaging device was used to measure the temperature of BAT, while the body temperature was measured at 9:00 using a rectal probe connected to a digital thermometer.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This paper presents a data processing pipeline to discover causal interactions from time-lapse imaging data, and convincingly illustrates it on a challenging application for the analysis of tumor-on-chip ecosystem data. The core of the discovery module is the authors' original tMIIC method, which is shown in the supplementary material to compare favourably to two state-of-the-art methods on synthetic temporal data from a 15-node network.

      Strengths:

      This paper tackles the problem of learning causal interactions from temporal data, which is an open problem in the presence of latent variables. The core of the authors' tMIIC method is nicely presented in connection with Granger-Schreiber causality and with the novel graphical conditions used to infer latent variables, which are based on a theorem about transfer entropy. tMIIC compares favourably to PC and PCMCI+ methods using different kernels on synthetic datasets generated from a network of 15 nodes. A full application to tumor-on-chip cellular ecosystem data, including cancer cells, immune cells, cancer-associated fibroblasts, endothelial cells and anti-cancer drugs, is presented, with convincing inference results with respect to both known and novel effects between those components and their contact.

      The code and dataset are available online for the reproducibility of the results.

      We thank Reviewer #1 for highlighting the main results and strengths of our paper, as well as for his/her recommendations below to further improve the manuscript.

      Weaknesses:

      The references to “state-of-the-art methods” concerning the inference of causal networks should be made more precise by giving citations in the main text, and should be better discussed in general terms, both in the first section and in the section presenting CausalXtract. Currently, this information appears only in the legends of the supplementary figures. Of course, comparison on one's own synthetic datasets can always be criticized, but this is rather due to the absence of a common benchmark, and I would recommend that the authors explicitly propose their datasets as a benchmark to the community.

      Following Reviewer #1’s suggestion, we now compare tMIIC’s performance to other state-of-the-art causal discovery methods for time series data in the main text and in a new Figure 2. This Figure 2 also highlights the relation between graph-based causal discovery methods for time series data and Granger-Schreiber temporal causality, as discussed in more details in Methods (Theorem 1).

      We also agree about the importance of sharing benchmark datasets with the community. This is the reason why we provide the dynamical equations of the 15-node benchmarks in Supplementary Tables 1 & 2, so that anyone can generate equivalent time series datasets of any desired length.

      Reviewer #2 (Public review):

      Summary:

      The authors propose a methodology to perform causal (temporal) discovery. The approach appears to be robust and is tested in two different scenarios: one related to live-cell imaging data, and another using synthetic (mathematically defined) time series data. They compare the performance of their method against another well-known method using metrics such as F-score, precision and recall.
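      For readers unfamiliar with these metrics in a graph-recovery setting, the sketch below scores an inferred edge set against a known one. It assumes that a directed edge counts as correct only on an exact match; the manuscript's actual scoring rules (for instance, how orientation or time lags are handled) are not stated here, and the function name `edge_scores` and the toy graphs are hypothetical.

```python
def edge_scores(true_edges, predicted_edges):
    """Return (precision, recall, F-score) for a predicted set of directed edges."""
    true_edges, predicted_edges = set(true_edges), set(predicted_edges)
    tp = len(true_edges & predicted_edges)  # correctly recovered edges
    precision = tp / len(predicted_edges) if predicted_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Toy example (hypothetical): four true edges, four inferred edges, two shared
truth = {("X1", "X2"), ("X2", "X3"), ("X3", "X4"), ("X1", "X4")}
inferred = {("X1", "X2"), ("X2", "X3"), ("X4", "X3"), ("X2", "X4")}

precision, recall, f_score = edge_scores(truth, inferred)
print(precision, recall, f_score)
```

      Note that the reversed edge ("X4", "X3") is counted as a miss under this strict-match assumption.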

      Strengths:

      Performance and robustness; the text is clear and concise. The authors provide the code for review.

      We thank Reviewer #2 for his/her positive assessment of our work and the suggestions below to improve the manuscript.

      Weaknesses:

      One concern could be the applicability of the method in other areas like climate, economy. For those areas, public data are available and might be interesting to test how the method performs with this kind of data.

      While our main expertise concerns the analysis of biological and biomedical data, we agree that tMIIC (which is included in MIIC R package) could in principle be applied to other areas, like climate, economy.

      We have not included benchmarks on such diverse types of datasets in the present manuscript, which focuses on CausalXtract’s pipeline for the analysis and causal interpretation of live-cell time-lapse imaging data from complex cellular systems.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors aim to elucidate the diversity and gene expression patterns of marine plankton using innovative collection and sequencing methodologies. Their work investigates the taxonomic and functional profiles of planktonic communities, providing insights into their ecological roles and responses to environmental changes.

      Strengths:

      The methodology utilized in this study, particularly the combination of single-cell sequencing and advanced bioinformatics techniques, represents a significant advancement in the field of plankton research. The application of the Smart-seq2 protocol for cDNA synthesis, followed by rigorous quality control measures, ensures high-quality data generation. This comprehensive approach not only enhances the resolution of the obtained genetic information but also allows for a more detailed exploration of the diversity and functional potential of the phytoplankton community.

      One of the major strengths of this study is the rigorous methodological approach, including precise sampling techniques and robust data analysis protocols, which enhance the reliability of the results. The use of advanced sequencing technologies allows for a comprehensive assessment of gene expression, significantly contributing to our understanding of plankton diversity and its implications for marine ecosystems.

      Weaknesses:

      While the evidence presented is solid, there are areas where the analysis could be expanded. The authors could further explore the ecological interactions within plankton communities, which would provide a more holistic view of their functional roles. Additionally, a broader discussion of the implications of their findings for marine conservation efforts could enhance the manuscript's impact.

      The choice of both the plankton net and filter pore size during the plankton collection process is critical, as these factors directly impact the types of phytoplankton collected. The use of a 25 μm filter paper, in particular, may result in the omission of many eukaryotic phytoplankton species. This limitation, combined with the characteristics of the plankton net, could affect the comprehensiveness and accuracy of the results, potentially influencing the study's conclusions regarding phytoplankton diversity.

      The timing of fixation is crucial, as it directly affects whether the measured transcriptome accurately represents the organisms' actual transcriptional state in their native water environment. If fixation occurred a significant time after sample collection, the transcriptomic data may not reflect their true in situ transcriptional activity, which greatly reduces the relevance of this method.

      Thank you for your time, effort, and expertise.

      We agree that additional analyses could improve our understanding of the plankton communities sampled. We have conducted an array of alternative analyses that were not included in the current manuscript and plan to perform new analyses over the next few months as part of a deeper revision of the manuscript. We are especially interested in “providing a more holistic view of the functions” of individual plankton within the community.

      As for the protocol details, the pore size of the filter paper was chosen to focus on ~100 micron-sized organisms as a starting point: they are likely to contain more RNA than smaller organisms, making them well suited for an initial proof of concept of the methodology. That choice, however, is not particularly tightly constrained, therefore smaller plankton could be captured. This is supported by the lack of correlation, in our data, between organismal size and number of detected sequencing reads.

      Timing to cell death/fixation is a common question we receive not just in this manuscript but any RNA-Seq from primary samples. In this case, plankton were seen swimming until picking, and after picking each organism was deposited within two seconds into a lysis buffer for fixation. Therefore, we do not have reason to believe that the transcriptional activity sampled in the sequencing reads differs in any major way from the one in living plankton. Nonetheless, a study specifically testing the effect of time between ocean sampling and reverse transcription would provide more quantitative information on this point.

      Reviewer #2 (Public review):

      Summary:

      The paper introduces Ukiyo-e-Seq, a novel method integrating microscopy with single-cell transcriptomics to study individual, uncultured eukaryotic plankton cells. By combining microscopic imaging with transcriptomic analysis, the approach links plankton morphology to gene expression, enabling taxonomic identification and functional protein exploration. Ukiyo-e-Seq was tested on 66 microbial eukaryotic cells, revealing taxonomic diversity across four superkingdoms and allowing analysis of protein complexes and developmental genes in individual species. According to the authors, this method has the potential to advance single-cell marine biodiversity studies by addressing limitations in traditional taxonomy and metatranscriptomics, especially for rare or uncultured organisms.

      However, the study's conclusions are often weakly supported by data, particularly given that this is not the first study to combine microscopy and single-cell transcriptomics of eukaryotic plankton using Smart-seq2.

      Strengths:

      A notable strength is the authors' generation of several single-cell transcriptomes for the diatom Chaetoceros, which could benefit from greater focus rather than broadly addressing eukaryotic single cells.

      Weaknesses:

      The study lacks comparison with other single-cell transcriptomics studies and it was presented as the first study that combines imaging and single-cell transcriptomics (smart-seq2) of eukaryotic plankton while in fact it is not. The sampling methodology is not replicable as the authors used a tea strainer instead of standard plankton collection equipment to filter larger cells. Terminology throughout the paper is unconventional, such as "public and private contigs," "single-organism genomics," "highly expressed contigs," and "optical methods." Additionally, the authors did not specify which database was used for taxonomic assignments. These issues may stem from the authors' limited background in microbial ecology. Overall, the study has many drawbacks and it could benefit from complete rewriting and focusing mainly on single-cell transcriptomics of diatoms.

      Thank you for your time, effort, and expertise.

      There might be a bit of confusion between single-cell and single-organism sequencing, likely due to lack of clarity in our initial submission. In particular, in this manuscript no effort was spent trying to dissociate oligocellular plankton into individual cells before sequencing. While probably feasible, we expect that to be technically much harder than single-organism sequencing as performed here. The reviewer does not reference a published paper where combined imaging and RNA-Seq of individual uncultured plankton has been achieved, and we were unable to find one in the scientific literature. As stated in the manuscript, others have already performed some work on cultured plankton and single-organism sequencing (without matching images) of uncultured environmental microorganisms.

      The suggestion to focus on a smaller biological niche such as diatoms and adopt language more familiar to that specific community is well received. Indeed, given that organisms as diverse as fish larvae and diatoms could be profiled with Ukiyo-e-Seq, future studies could use the same method to address specific questions with a deeper and more narrow scope. However, this manuscript is demonstrating the feasibility of Ukiyo-e-Seq and its ability to produce usable data for a broad spectrum of organisms: part of the scientific audience might not have a specific interest in diatoms.

      The tea strainer was used for coarse pre-filtering: the exact pore size, geometry and factory tolerance on those measurements are inconsequential because each organism is later chosen (or not) based on a high-resolution microscopy image (or multiple, if fluorescence is considered). This really is a strength of Ukiyo-e-Seq over FACS or droplet-based sorters, which can only collect coarse optical information from each organism for (typically) less than 1 millisecond. In Ukiyo-e-Seq, while the actual decision to pick an individual is currently manual (by the operator of the picker), it can be automated in principle. For instance, one could build a machine learning model of plankton taxonomy based on a large collection of labelled images and use predictions from such a model to automatically drive the picker (e.g. focussing on diatoms), increasing throughput. Even in that case, however, the initial filtering stages using tea strainers, plankton nets, filter paper etc. would not be critical for the final selection of individuals as long as they are not too restrictive.

      The database used for taxonomic assignment was the NCBI non-redundant nucleotide database, accessed through the reference library provided by Kraken2 (nt).

      Reviewer #3 (Public review):

      Gatt et al. present a novel take on single-cell RNA-sequencing from complex planktonic samples, introducing an approach they aptly named Ukiyo-e-Seq. This work combines environmental sampling with cell picking, microscopic imaging, and Smart-seq2 single-cell RNA sequencing to profile uncultured eukaryotic plankton. Developing single-cell approaches for such ecosystems is critical, given the poor representation of many planktonic species in cultures and reference databases. This work could help bridge existing technological gaps between morphological and molecular studies of aquatic microeukaryotes.

      The authors argue that microscopy does not provide information on the biochemistry of species under consideration. At best, it provides taxonomic labeling of species within a sample, yet imaging fails to assess their metabolic state or to disentangle cryptic species. In a standard metatranscriptomic setup, the sequence pool is described by aligning assembled contigs with reference databases to obtain functional and taxonomic information. This complex community-level data is impossible to parse at the single-organism level. Moreover, by relying on reference datasets, a lot of potential information can be missed. The aim of the approach is to combine the strengths of both methods, generating single-cell transcriptomic data linked to individual plankton images.

      Strengths:

      Ukiyo-e-Seq generated a valuable dataset by combining imaging and transcriptomics for individual planktonic organisms from environmental samples. This multimodal approach has the potential to improve taxonomic predictions and functional insights at the single-organism level. This manuscript demonstrates the technical feasibility of such an approach. Data of this type is rare and thus represents a valuable resource to further advance single-cell sequencing of planktonic species from environmental samples.

      Weaknesses:

      (1) The merge-split strategy, where single-cell reads are pooled prior to assembly, is counterintuitive. Pooling obscures the single-organism resolution that single-cell methods aim to achieve. The approach might be useful for assembling low-coverage contigs, but risks masking unique expression profiles for transcripts unique to a given well. As an alternative, the authors could assemble each well independently to obtain well-specific transcriptomic bins. Assemblies could then be clustered based on sequence similarity, thereby imposing strict clustering parameters to maintain resolution, to create a common reference for downstream analysis if needed. In my opinion, better results would be obtained by implementing a per-well assembly and read mapping.

      (2) The focus on the top five most expressed contigs throughout the manuscript's data analysis is a limiting choice, as it excludes most contigs. In the preprint, we are presented with a very narrow view of the data. Visualising the entire range of assembled contigs would provide a better picture of the transcriptomic composition and diversity per well. It would be interesting to assess whether the full information could be used to preliminarily bin transcriptomic sequences from individual wells, for example, by gathering all 'private' contigs with high read coverage in a single well. Does such a set represent a single complete eukaryotic transcriptome?

      (3) I missed a verification with (broad-scale) taxonomic assessments based on the associated microscopic images. In their goals, the authors state that a joint approach has the potential to discover new taxonomic biodiversity. I agree, and to me, this is what is exciting about the preprint, yet I miss an example or the right bioinformatic implementation to drive home this claim. Are there organisms in wells where poor taxonomic annotations, based on alignment to a reference database or the LCA approach implemented in Kraken2, would usually result in ignoring the species in classic metatranscriptomics? Can you advance the taxonomic annotation by referring back to the organisms' picture? Can manual assessment of taxonomy advance the results from the LCA approach?

      (4) The current use of AlphaFold to predict protein structures does not convincingly add to the study's core objectives.
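      The binning idea raised in point (2) can be sketched as follows, assuming a table of read counts per contig and per well is available. The thresholds `min_reads` and `min_fraction`, the function name `private_contigs`, and the toy counts are all hypothetical; the manuscript's own definition of a 'private' contig may differ.

```python
def private_contigs(counts, min_reads=100, min_fraction=0.95):
    """Group contigs by the single well that contributes almost all of their reads.

    counts maps a contig name to a list of read counts, one entry per well.
    """
    bins = {}
    for contig, per_well in counts.items():
        total = sum(per_well)
        if total == 0:
            continue  # contig with no mapped reads carries no information
        top_well = max(range(len(per_well)), key=per_well.__getitem__)
        top = per_well[top_well]
        # Keep contigs that are well covered and concentrated in one well
        if top >= min_reads and top / total >= min_fraction:
            bins.setdefault(top_well, []).append(contig)
    return bins


# Toy read counts across three wells (hypothetical values)
counts = {
    "contig_a": [950, 10, 0],     # almost exclusively well 0
    "contig_b": [300, 280, 310],  # shared across wells
    "contig_c": [0, 0, 400],      # exclusively well 2
    "contig_d": [5, 0, 0],        # exclusive to well 0 but too few reads
}

bins = private_contigs(counts)
print(bins)
```

      Each resulting bin could then be checked for completeness, which is the question the reviewer poses.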

      Overall, Ukiyo-e-Seq presents a promising method for studying single-cell diversity in environmental samples, though the bioinformatic pipeline requires refinement to support some of the claims made by the authors. Additionally, the manuscript would benefit from clarity and additional details in its methods and a more consistent approach to presenting results and summary statistics across all assembled contigs and all sampled wells, rather than focusing on selected wells.

      Thank you for your time and effort, and for your expertise on the matter.

      The suggestions to conduct additional bioinformatic analyses to explore more fully the criticality and potential of various design choices (e.g. meta-assembly) are well received. We have tried some of those ideas already (e.g. assembling individual wells) and we have considered but not yet conducted or polished others (e.g. a more thorough taxonomic verification). We will endeavour to carry out as many of those analyses as possible during the deeper revision process in the coming months.

      AlphaFold 3’s use was designed to demonstrate the ability to investigate protein-protein interactions from individual species. When two peptide sequences are detected within the same well, they are more likely to be potential interacting partners than in a metatranscriptomic study, because the compartmentalisation of reads into tens or hundreds of wells greatly reduces the search space of potential interaction partners (which has a baseline runtime complexity of n squared, where n is the number of peptide sequences identified).
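      The reduction in search space can be made concrete with a small calculation. The sketch below compares the number of unordered peptide pairs in a pooled sample with the number obtained when pairs are formed only within wells; the well count and peptides per well are hypothetical round numbers chosen for illustration.

```python
def candidate_pairs(n):
    """Number of unordered pairs among n peptide sequences, n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical layout: 50 wells with 200 detected peptide sequences each
well_sizes = [200] * 50

pooled = candidate_pairs(sum(well_sizes))                # all peptides in one pool
per_well = sum(candidate_pairs(n) for n in well_sizes)   # pairs within wells only

print(pooled, per_well, pooled / per_well)
```

      With equally sized wells, the number of candidate pairs falls by a factor close to the number of wells.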

      ----------

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Liu et al. present CROWN-seq, a technique that simultaneously identifies transcription-start nucleotides and quantifies N6,2'-O-dimethyladenosine (m6Am) stoichiometry. This method is derived from ReCappable-seq and GLORI, a chemical deamination approach that differentiates A and N6-methylated A. Using ReCappable-seq and CROWN-seq, the authors found that genes frequently utilize multiple transcription start sites, and isoforms beginning with an Am are almost always N6-methylated. These findings are consistently observed across nine cell lines. Unlike prior reports that associated m6Am with mRNA stability and expression, the authors suggest here that m6Am may increase transcription when combined with specific promoter sequences and initiation mechanisms. Additionally, they report intriguing insights on m6Am in snRNA and snoRNA and its regulation by FTO. Overall, the manuscript presents a strong body of work that will significantly advance m6Am research.
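      As a rough illustration of how stoichiometry can be read out from a deamination-based assay, the sketch below assumes that unmethylated A at the transcription-start nucleotide is converted and sequenced as G, while N6-methylated A is protected and still sequenced as A. That read-out, the function name `m6am_fraction`, and the counts are assumptions for illustration; the authors' actual pipeline, including any correction for incomplete conversion, is not described here.

```python
def m6am_fraction(a_reads, g_reads):
    """Fraction of reads retaining A at a transcription-start A site.

    Assumes unmethylated A is read as G after deamination and methylated A
    is read as A. Returns None when the site has no coverage.
    """
    total = a_reads + g_reads
    if total == 0:
        return None
    return a_reads / total


# Hypothetical read counts at one transcription-start nucleotide
fraction = m6am_fraction(a_reads=180, g_reads=20)
print(fraction)
```

      Under these assumptions, a value near 1 corresponds to a site that is almost fully N6-methylated.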

      Strengths:

      The technology development part of the work is exceptionally strong, with thoughtful controls and well-supported conclusions.

      We appreciate the reviewer for the very positive assessment of the study. We have addressed the concerns below.

      Weaknesses:

      Given the high stoichiometry of m6Am, further association with upstream and downstream sequences (or promoter sequences) does not appear to yield strong signals. As such, transcription initiation regulation by m6Am, suggested by the current work, warrants further investigation.

      We thank the reviewer for the insightful comments. We have softened the language related to m6Am and transcription regulation. We totally agree with the reviewer that future investigation is required to determine the molecular mechanism behind m6Am and transcription regulation.

      Reviewer #2 (Public review):

      Summary:

      In the manuscript "Decoding m6Am by simultaneous transcription-start mapping and methylation quantification" Liu and co-workers describe the development and application of CROWN-Seq, a new specialized library preparation and sequencing technique designed to detect the presence of cap-adjacent N6,2'-O-dimethyladenosine (m6Am) with single nucleotide resolution. Such a technique was a key need in the field since prior attempts to get accurate positional or quantitative measurements of m6Am positioning yielded starkly different results and failed to generate a consistent set of targets. As noted in the strengths section below the authors have developed a robust assay that moves the field forward.

      Furthermore, their results show that most mRNAs whose transcription start nucleotide (TSN) is an 'A' are in fact m6Am (85%+ for most cell lines). They also show that snRNAs and snoRNAs have a substantially lower prevalence of m6Am TSNs.
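
      As a minimal sketch of how a per-site stoichiometry such as the 85%+ figure can be derived from a GLORI-style readout, assume that an unmethylated A at the transcription-start nucleotide is deaminated and read as G, while an N6-methylated A is protected and still read as A. The function name and the read counts below are hypothetical illustrations, not part of the CROWN-seq pipeline.

```python
def m6am_stoichiometry(a_reads: int, g_reads: int) -> float:
    """Fraction of reads at an A-initiated site that stay unconverted.

    Assumption: unconverted A reads represent m6Am, and converted
    (G) reads represent unmethylated Am.
    """
    total = a_reads + g_reads
    if total == 0:
        raise ValueError("no read coverage at this site")
    return a_reads / total


# Hypothetical counts for one transcription-start nucleotide.
example_fraction = m6am_stoichiometry(a_reads=170, g_reads=30)
print(f"{example_fraction:.2f}")  # prints 0.85
```

      Reporting a fraction per site, rather than a present/absent call, is what allows stoichiometry to be compared across transcription-start nucleotides and cell lines.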

      Strengths:

      Critically, the authors spent substantial time and effort to validate and benchmark the new technique with spike-in standards during development, cross-comparison with prior techniques, and validation of the technique's performance using a genetic PCIF1 knockout. Finally, they assayed nine different cell lines to cross-validate their results. The outcome of their work (a reliable and accurate method to catalog cap-adjacent m6Am) is a particularly notable achievement and is a needed advance for the field.

      Weaknesses:

      No major concerns were identified by this reviewer.

      We thank the reviewer for the positive assessment of the method and dataset. We have addressed the concerns below.

      Mid-level Concerns:

      (1) In Lines 625 and 626, the authors state that “our data suggest that mRNAs initate (mis-spelled by authors) with either Gm, Cm, Um, or m6Am.” This reviewer took those words to mean that for A-initiated mRNAs, m6Am was the ‘default’ TSN. This contradicts their later premise that promoter sequences play a role in whether m6Am is deposited.

      We thank the reviewer for the comment. We have changed this sentence to “Instead, our data suggest that mRNAs initiate with either Gm, Cm, Um, or Am, where Am are mostly m6Am modified.” The revised sentence separates the processes of transcription initiation and m6Am deposition, which should avoid confusing the reader.

      (2) Further, the following paragraph (lines 633-641) uses fairly definitive language that is unsupported by their data. For example in lines 637 and 638 they state “We found that these differences are often due to the specific TSS motif.” Simply, using ‘due to’ implies a causative relationship between the promoter sequences and m6Am has been demonstrated. The authors do not show causation, rather they demonstrate a correlation between the promoter sequences and an m6Am TSN. Finally, despite claiming a causal relationship, the authors do not put forth any conceptual framework or possible mechanism to explain the link between the promoter sequences and transcripts initiating with an m6Am.

      (3) The authors need to soften the language concerning these data and their interpretation to reflect the correlative nature of the data presented to link m6Am and transcription initiation.

      For (2) and (3): we have softened the language in the revised manuscript. Specifically, for lines 633-641 in the original manuscript, we have changed “are often due to” to “are often related to” in the revised manuscript, which claims a correlation rather than causation.

      Reviewer #3 (Public review):

      Summary:

      m6Am is an abundant mRNA modification present on the TSN. Unlike the structurally similar and abundant internal mRNA modification m6A, m6Am’s function has been controversial. One way to resolve controversies surrounding mRNA modification functions has been to develop new ways to better profile said mRNA modification. Here, Liu et al. developed a new method (based on GLORI-seq for m6A-sequencing), for antibody-independent sequencing of m6Am (CROWN-seq). Using appropriate spike-in controls and knockout cell lines, Liu et al. clearly demonstrated CROWN-seq’s precision and quantitative accuracy for profiling transcriptome-wide m6Am. Subsequently, the authors used CROWN-seq to greatly expand the number of known m6Am sites in various cell lines and also determine m6Am stoichiometry to generally be high for most genes. CROWN-seq identified gene promoter motifs that correlate best with high stoichiometry m6Am sites, thereby identifying new determinants of m6Am stoichiometry. CROWN-seq also helped reveal that m6Am does not regulate mRNA stability or translation (as opposed to past reported functions). Rather, m6Am stoichiometry correlates well with transcription levels. Finally, Liu et al. reaffirmed that FTO mainly demethylates m6Am, not of mRNA but of snRNAs and snoRNAs.

      Strengths:

      This is a well-written manuscript that describes and validates a new m6Am-sequencing method: CROWN-seq as the first m6Am-sequencing method that can both quantify m6Am stoichiometry and profile m6Am at single-base resolution. These advantages facilitated Liu et al. to uncover new potential findings related to m6Am regulation and function. I am confident that CROWN-seq will likely be the gold standard for m6Am-sequencing henceforth.

      Weaknesses:

      Though the authors have uncovered a potentially new function for m6Am, they need to be clear that without identifying a mechanism, their data might only be demonstrating a correlation between the presence of m6Am and transcriptional regulation rather than causality.

      We thank the reviewer for the very positive assessment of the CROWN-seq method. We have softened the language relating to the correlation between m6Am and transcription regulation.

    1. Author response:

      Reviewer 1 (Public Review)

      (1) The proposed design is not sufficient to answer the research question. The rationale of the study proposed in the introduction is that auditory stimulation may explain the analgesic effects of RPMS. To answer this question, the authors should have used a factorial design using 4 groups (active RPMS + sound; active RPMS + no sound; sham RPMS + sound; sham RPMS + no sound). Using this design, it would have been possible to determine if the sound, the afferent stimulation, or both are necessary to produce analgesia. Rather, they tested two types of RPMS (iTBS, cTBS) without real rationale, one electrical stimulation and a placebo.

      We will clarify that the study was originally designed to determine whether iTBS or cTBS would be more effective at reducing pain. We included TENS as a positive control, and sham as a negative control. We were indeed surprised by the findings, and present them herein. Future RCTs should be performed to reproduce these findings.

      (2) There are multiple ways that the current design could have introduced biases. The study was not randomized but pseudo-randomised. What does that mean? Was there allocation concealment? Were the assessor and data analyst blinded to group allocation? Were intention-to-treat analyses performed? Were the participants adequately blinded (was this measured)?

      This study was not designed as an RCT, but rather as an experimental study. The study was pseudo-randomized to ensure that the groups had equal allocation and distribution of sexes.

      The groups were blinded to the other stimulations (they were not informed of the various arms of the study, through different consent forms).

      It was not possible to blind the experimenter, as the iTBS and cTBS protocols are very different (iTBS has multiple bursts separated by brief intervals, whereas cTBS is continuous). The data were masked for analysis, and only unblinded at the final stage. We will update the manuscript to reflect these changes.

      (3) The TENS parameters used were not optimal and are not those commonly used in clinical practice. This could have explained the lack of TENS effects. The lack of TENS effects has not been discussed and it is concerning. If TENS had been effective (as expected), the story about the auditory effects would not have been presented as the primary mechanisms underlying the current results.

      We acknowledge that this is a limitation of the study. A future study should address this. However, for transparency, we will not remove the arm.

      (4) No primary outcome has been identified. It is important to mention that the interpretation of results is based on the presence of only one statistically significant result. Pain intensity and pain unpleasantness are not affected. This was not properly addressed in the Discussion. What does that mean that secondary hyperalgesia is affected but not pain?

      We reiterate that this study was not designed as an RCT, but rather as an experimental study. The primary outcome measures were measures of pain sensitivity (pain intensity NRS, pain unpleasantness NRS, and secondary hyperalgesia). We will clarify this in the revised manuscript.

      We will now include discussion of the effects being solely on secondary hyperalgesia, and not on pain intensity and unpleasantness.

      (5a) The use of secondary hyperalgesia variable is concerning. How is it possible to measure secondary hyperalgesia if there is no lesioned tissue?

      Secondary hyperalgesia refers to hyperalgesia assessed in an area adjacent to or remote from the site of stimulation. In general, it is not required to lesion a tissue to activate the nociceptive system or to induce pain. We have cited other studies that have employed secondary hyperalgesia as a pain outcome measure without inducing a lesion.

      Hyperalgesia reflects increased pain on suprathreshold stimulation. To assess it, one measures the subjective response to a painful (i.e. suprathreshold) stimulus, then applies a conditioning stimulation (e.g. heat), and then measures the subjective response to the same original stimulus. If the response after conditioning is higher than the baseline measure, hyperalgesia has been induced.

      (5b) If heat creates secondary hyperalgesia without lesion, what does that mean physiologically?

      Secondary hyperalgesia is normally interpreted as a perceptual correlate of central sensitization.

      (5c) Is it a valid and reliable "pain" variable?

      Yes and yes. A noxious heat stimulus can reliably elicit secondary hyperalgesia (see section 3.2 from Quesada et al. 2021). We also cite several studies that have used secondary hyperalgesia as an outcome measure of central sensitization in pain.

      (6) The follow-up study has been designed to cover the RPMS sound using pink noise. However, the pink noise was also present during the PHP measurement. How can we determine whether the absence of change is due to the pink noise during the RPMS or the presence of pink noise during PHP? I don't think this is possible to discriminate.

      We will add a third study that performs the control analysis with the sound of the rPMS masked, but no pink noise otherwise. The study will be performed in two groups: one with pink noise, and one without pink noise.

      Appraisal

      (7) Despite all these potential issues, authors interpret their data with high confidence and with several overstatements in the Title, Abstract, and Discussion. The results do not support their conclusions. The fact that auditory stimulation may produce an analgesic effect is a hypothesis, but the current study cannot ascertain it.

      We believe that the chief concern with the interpretation lies with the second study. The proposed third experiment will address this concern.

      Reviewer 2 (Public Review):

      (1) My biggest concern in this paper is that the stimulation protocols are not applied after pain was induced in the subjects, but before. This is not bad in itself, but as the paper presents the stimulations as potential "treatments" it generates a severe mismatch between the objective, context (introduction), and impact (discussion) presented for the experiments, and how they are actually designed. This adds to the fact that healthy volunteers are used here to generate a study with low translational capability, that aims to be translational and provide an indication for clinics (maybe this is why the reduction in pain intensity caused by PMS when applied in patients, reported in references [29, 35 and 39], is not observed here).

      We will reframe these as prophylaxis, rather than treatment. This study was an experimental study originally designed to determine which stimulation parameters (cTBS or iTBS) would be better suited to modulate pain. We performed the study in healthy individuals undergoing acute pain, akin to a person undergoing a painful procedure, which could lead to central sensitization and pain persistence (e.g., post-surgical pain). However, before testing this in individuals undergoing actual procedures, it is essential to determine efficacy in healthy individuals before translation.

      Khan et al. [29] is a case study of neuropathic pain, whereas our study uses a nociceptive pain model. Lim et al. [35] employed 10 sessions of rPMS stimulation in patients with acute low back pain. As in our study, the change in VAS driven by rPMS was no different from that driven by sham stimulation. We notice that there is no reference 39, and will correct this.

      (2) TENS treatment duration is simply too short (90s) to be considered a therapeutic TENS intervention. I get that this duration was chosen to match the one of PMS, but TENS is never applied like this in the clinics, in which the duration varies from 10 minutes to an hour (or more). This specific study comparing different durations recommends 40 minutes for knee osteoarthritis pain relief (PMID: 12691335). Under these conditions, this stimulation is more similar to a sham TENS than to a real TENS treatment: I would suggest interpreting it as such. As the paper is right now, it could give the impression that PMS could produce clinical effects not observed in TENS, but while the PMS application resembles a clinical one, the TENS application does not (due to its extremely short duration). As an example, giving paracetamol at a dose 10 times below its effective dose is a placebo, not a paracetamol treatment.

      We acknowledge that this is a limitation, and will address this in the Discussion of the revised manuscript.

      (3) This study measured pain, not central sensitization. Specifically, the effects refer to the area of secondary hyperalgesia. The IASP definition for central sensitization is "Increased responsiveness of nociceptive neurons in the central nervous system to their normal or subthreshold afferent input." (PMID: 32694387). No neuronal results are reported in this article. Therefore, central sensitization is not measured here, and we do not know if it is reduced by sound. This frontally clashes with the title of the article and with many interpretations of the results. For a deep review on this topic, I recommend PMID: 39278607 and the short article PMID: 30416715.

      It is widely accepted that central sensitization is the neurophysiological basis of secondary hyperalgesia (see PMID: 11313449; PMID: 10581220).

      The reviewer is conflating secondary hyperalgesia due to central sensitization and chronic pain. Whether chronic pain is driven or maintained by central sensitization is not the goal of our study. However, there is ample evidence that nociceptive drive can induce plasticity in the CNS, which alters pain sensitivity, and that these changes facilitate pain.

      (4a) There is no mention of blinding/masking/concealing in this manuscript. Was the therapist blind to whether they applied one protocol, another, or a placebo? Were the evaluators blind, as this can heavily influence their measurements? And the volunteers? Was allocation concealed? Was this blinding measured afterwards? Blinding is, together with randomization, the most important methodological feature for those interventional studies. For example, not introducing blinding and concealing directly makes a study lose 4 out of 10 points in the PEDro scale, failing to fulfill criteria 3, 5, 6, and 7 (https://pedro.org.au/english/resources/pedro-scale/).

      This study was not designed as an RCT, but rather as an experimental study. The study was pseudo-randomized to ensure that the groups had equal allocation and distribution of sexes.

      The groups were blinded to the other stimulations (they were not informed of the various arms of the study, through different consent forms). However, blinding was not measured afterwards (again, this was not meant to be an RCT).

      It was not possible to blind the experimenter, as the iTBS and cTBS protocols are very different (iTBS has multiple bursts separated by brief intervals, whereas cTBS is continuous). The data were masked for analysis, and only unblinded at the final stage. We will update the manuscript to reflect these changes.

      (4b) Continuing with methodological considerations, the dropout percentage is high (18% for the first and 25% for the second study), both above the 15% cutoff for criterion 8 of the PEDro, losing another point.

      In Study 1, only 2 participants withdrew after feeling the heat, 2 were lost to follow-up, and 2 had incomplete data. That totals 6/123 in Study 1. In Study 2, none of the participants who met the inclusion/exclusion criteria and who were ‘allocated’ to the study were excluded (0% dropout/data loss).
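
      The arithmetic behind this reply can be checked with a short sketch. It uses only the counts stated above and the 15% cutoff cited by the reviewer; the function and variable names are hypothetical, and the calculation does not attempt to reproduce the reviewer's 18% figure, whose denominator is not given here.

```python
PEDRO_CUTOFF = 0.15  # 15% cutoff the reviewer cites for PEDro criterion 8


def loss_fraction(lost: int, enrolled: int) -> float:
    """Share of enrolled participants whose data were lost."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return lost / enrolled


# Study 1 counts as stated in the response:
# 2 withdrew + 2 lost to follow-up + 2 with incomplete data, out of 123.
study1_lost = 2 + 2 + 2
study1_fraction = loss_fraction(study1_lost, 123)

print(f"{study1_fraction:.1%}")        # prints 4.9%
print(study1_fraction < PEDRO_CUTOFF)  # prints True
```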

      We are unsure how to address this point, as we had clear inclusion/exclusion criteria, and these could only be measured after consenting. As this is an experimental study performed on healthy individuals in a university setting, we are not able to collect any study related data prior to consent.

      We openly reported individuals who did not meet the criteria, and thus were excluded. These criteria are a combination of what is required to collect good quality data, and what we are ethically permitted to do. We understand that, in an interventional trial, a dropout rate of >15% due to intolerance or adverse events would indeed be concerning.

      (5) Data reporting and statistical treatment can be improved, as only differences are reported and regression to the mean is not accounted for in this study. Moreover, baseline levels for the dependent variables (control session) are not accessible for evaluation and they are not compared statistically, making it impossible to know if the groups were similar at baseline. This will imply failing criterion 3 of the PEDro, for a total of 2/10 points.

      This only concerns Study 1, as Study 2 uses a within-subject design. Study 1 provides the raw data in Figure 4. We will provide the raw data for each of the primary outcome measures in a supplemental table in the revision.

    1. Author response:

      In this initial response to the public review, we outline our plan to address the major concerns raised. Below, we provide a general categorization of the suggestions and our corresponding responses.

      Weakness #1: Statistical Concerns - using the number of seizures (rather than the number of animals) may identify small effects that could be insignificant. Effect size should be taken into consideration.

      Reviewer 1:

      “While the data generally supports the authors' conclusions, a weakness of this manuscript lies in their analytical approach where EEG feature-space comparisons used the number of spontaneous or evoked seizures as their replicates as opposed to the number of IHK mice; these large data sets tend to identify relatively small effects of uncertain biological significance as being highly statistically significant.”

      Reviewer 2:

      “In several sections of the paper, the authors argue that two different groups are similar on the basis that no statistical difference was found between the two groups (i.e., p > 0.05); however, the failure to find a statistically significant difference, particularly with relatively small sample sizes, is not rigorous evidence that the two groups are actually similar - they are just "not significantly different.”

      Reviewer 3:

      “(3) The utility of increasing the number of seizures for enhancing statistical power is limited unless the sample size under evaluation is the number of seizures. However, the standard practice is for the sample size to be the number of mice.”

      Reviewer 3:

      “(1) Evaluation of seizure similarity using the SVM modeling and clustering is not sufficiently explained to show if there are meaningful differences between induced and spontaneous seizures. SVM modeling did not include analysis to assess the overfitting of each classifier since mice were modeled individually for classification.”

      We understand the reviewers’ concerns. In this work, we used a linear mixed-effects model to address two levels of variability: between animals and within animals. The interactive linear mixed-effects model shows that most (~90%) of the variability in our data comes from within animals (Residual), the random effect that the model accounts for, rather than from between animals. Since the variability between animals is low, the model identifies common changes in seizure propagation across animals while accounting for the variability in seizures within each animal. Therefore, the results we report reflect changes that occur across animals, not in individual seizures. We will edit the text to make the linear mixed-effects model easier to understand.
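
      To illustrate what splitting variability into between-animal and within-animal parts means, the sketch below uses a simpler stand-in for the mixed-effects model: method-of-moments estimates from a balanced one-way random-effects ANOVA. It is not the authors' model, and the animal labels and measurements are hypothetical values chosen only to make the calculation concrete.

```python
from statistics import mean


def variance_components(groups: dict) -> tuple:
    """Return (between-animal variance, within-animal variance).

    groups maps an animal label to its per-seizure measurements.
    Assumes a balanced design (same number of seizures per animal).
    """
    k = len(groups)
    n = len(next(iter(groups.values())))
    if k < 2 or n < 2:
        raise ValueError("need at least 2 animals and 2 seizures per animal")
    if any(len(values) != n for values in groups.values()):
        raise ValueError("balanced design required")

    grand_mean = mean(x for values in groups.values() for x in values)
    animal_means = {a: mean(values) for a, values in groups.items()}

    # Mean squares between animals and within animals.
    ms_between = n * sum((m - grand_mean) ** 2 for m in animal_means.values()) / (k - 1)
    ms_within = sum(
        (x - animal_means[a]) ** 2 for a, values in groups.items() for x in values
    ) / (k * (n - 1))

    # Negative estimates are truncated to zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between, ms_within


# Hypothetical data: 3 animals, 4 seizures each.
example = {
    "animal_a": [1.0, 5.0, 3.0, 7.0],
    "animal_b": [3.0, 7.0, 5.0, 9.0],
    "animal_c": [5.0, 9.0, 7.0, 11.0],
}
var_between, var_within = variance_components(example)
within_share = var_within / (var_between + var_within)
print(f"within-animal share of variance: {within_share:.0%}")
```

      A large within-animal share, as in this toy example, indicates that seizures differ more from one another within an animal than animals differ from each other, which is the situation the response describes.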

      To address the point raised about similarity, we will explain how the SVM classifier was trained. The purpose of the SVM is not to identify meaningful differences between induced and spontaneous seizures. Rather, it is to classify EEG sections as “seizures” or “non-seizures”, demonstrating the gross similarity between induced and spontaneous seizures despite minor differences. We will clarify the description of the SVM model in the text.

      Weakness #2: Clinical and biological significance is unclear.

      Reviewer 1:

      “Furthermore, the clinical relevance of similarly small differences in EEG feature space measurements between seizure-naïve and epileptic mice is also uncertain.”

      Reviewer 2:

      “While the paper may be relevant for the ETSP and contract research organizations (CROs), the paper was not written to attract the interest of biological scientists, even those in this specific area of epilepsy research. It may be of low interest to other neuroscientists… The key issue the authors aim to address is the 30-40% of patients with DRE, but the real problem with DRE patients is not that these people have seizures with no effect of the ASDs; rather, although ASD may reduce seizure burden, these patients continue to have some remaining seizures even after high doses of ASDs, which often leads to adverse effects from the particular ASDs… It remains unclear that the optogenetically induced seizures in this model are better than similarly induced seizures in a naïve animal, and there is no evidence that the model will be useful for finding new ASDs to treat DRE.”

      Reviewer 3:

      “(6) Human epilepsy is extensively heterogeneous in both etiology and individual phenotype, and it may be hard to generalize the approach.”

      Reviewer 2:

      “The authors state that this approach should be used to test for and discover new ASDs for DRE, and also used for various open/closed loop protocols with deep-brain stimulation; however, the paper does not actually discuss rigorously or critically the background literature on other published studies in these areas or how this approach will improve future research for a broader audience than the ETSP and CROs. Thus, it is not clear whether the utility will apply more widely and how extensive a readership will be attracted to this work.”

      We appreciate the reviewer’s concerns. We will revise the manuscript to better emphasize the potential significance of our approach. The on-demand seizure model can be applied to address biologically and clinically relevant questions beyond its utility in drug screening. For example, crossing the Thy1-ChR2 mouse line with genetic epilepsy models, such as Scn1a mutants, could reveal how optogenetic stimulation differentially induces seizures in mutant versus non-mutant mice, providing insights into seizure generation and propagation in Dravet Syndrome. Due to the cellular specificity of optogenetics, we also envision this approach being used to study circuit-specific mechanisms of seizure generation and propagation. Regarding drug-resistant epilepsy (DRE) and anti-seizure drug (ASD) screening, we agree with the reviewer that probing new classes of ASDs for DRE represents the critical goal. However, we believe a full exploration of additional ASD classes and/or modeling DRE lies outside the scope of this manuscript.

      Weakness #3: Definition of Seizure is unclear

      Reviewer 2:

      “Although the figures provide excellent examples of individual electrographic seizures and compare induced seizures in epileptic and naïve animals, it is unclear which criteria were used to identify an actual seizure induced by the optogenetic stimulus, versus a hippocampal paroxysmal discharge (HPD), an "afterdischarge", an "electrophysiological epileptiform event" (EEE, Ref #36, D'Ambrosio et al., 2010 Epilepsy Currents), or a so-called "spike-wave-discharge" (SWD). Were HPDs or these other non-seizure events ever induced using stimulation in animals with IH-KA? A critical issue is that these other electrical events are not actual seizures, and it is unclear whether they were included in the column showing data on "electrographic afterdischarges" in Figure 5 for the studies on ASDs”

      Reviewer 3:

      “(2) The difference between seizures and epileptiform discharges or trains of spikes (which are not seizures) is not made clear.”

      Reviewer 2:

      “The differences between the optogenetically evoked seizures in IH-KA vs naïve mice are interpreted to be due to the "epileptogenesis" that had occurred, but the lesion from the KA-induced injury would be expected to cause differences in the electrically and behaviorally recorded seizures - even if epileptogenesis had not occurred. This is not adequately addressed.”

      Thank you for pointing out the unclear definition of the seizures analyzed. We agree and will revise the text to clarify this issue. In this manuscript, we focused on tonic-clonic seizures. We analyzed animal behavior during evoked events, and a high percentage of induced electrographic events were accompanied by behavioral seizures with a Racine scale of three or above. Regarding epileptogenesis, our model is based on the IHK model, in which spontaneous tonic-clonic seizures occur a few to several days after KA injection. These mice are, by definition, epileptogenic. We will further clarify this methodology in the text.

      Weakness #4: Similarity/Difference with Kindling Not Clear

      Reviewer 2:

      “The authors did not test whether an apparent "kindling" effect, apparently seen in naïve controls, also occurred in animals micro-injected with kainic acid (KA). This effect could cause model instability that might result in variability in response to ASDs. It is not clear whether the number of optogenetically induced seizures in epileptic animals would affect the response to drugs. It is also unclear how much of an improvement the animal model in the present work is over other similar models of TLE, where electrically triggered seizures could simply be applied to one of them.”

      Reviewer 3:

      “(5) It is unlikely that long-term adaptation to CA1-stimulated seizure induction is absent in these mice. A duration of evaluation longer than 16 days is warranted in light of the downward slope at days 13-16 for induced seizures in Figure 4C.”

      We appreciate the reviewer’s comments regarding the “kindling effect” as well as its similarity to the kindling model. We will carefully assess the data and address this in the revised manuscript. In electrical kindling, the activated cellular population is non-specific, including both excitatory and inhibitory neurons. In our model, we specifically activate predominantly excitatory neurons (Thy1-positive neurons), which we observed to participate in convulsant-induced seizures (as demonstrated in Thy1-GCaMP experiments). We consider this specificity an improvement over the kindling model, making our approach more biologically relevant.

      Weakness #5: Time needed to generate model is significant. Unclear if animals were pre-selected

      Reviewer 1:

      “Finally, the multiple surgeries and long timetable to generate these mice may limit the value compared to existing models in drug-testing paradigms.”

      Reviewer 2:

      “The authors offer little mention of other research using animal models of TLE to screen ASDs, of which there are many published studies - many of them with other strengths and/or weaknesses. For example, although Grabenstatter and Dudek (2019, Epilepsia) used a version of the systemic KA model to obtain dose-response data on the effects of carbamazepine on spontaneous seizures, that work required use of KA-treated rats selected to have very high rates of spontaneous seizures, which requires careful and tedious selection of animals. The ETSP has published studies with an intra-amygdala kainic acid (IA-KA) model (West et al., 2022, Exp Neurol), where the authors claim that they can use spontaneous seizures to identify ASDs for DRE; however, their lack of a drug effect of carbamazepine may have been a false negative secondary to low seizure rates. The approach described in this paper may help with confounds caused by low or variable seizure rates. These types of issues should be discussed, along with others.”

      We appreciate the reviewer’s insights. In an existing model investigating spontaneous tonic-clonic seizures (such as the intra-amygdala kainate injection model), the time investment is back-loaded, requiring two to three weeks per condition while counting spontaneous seizures, which may occur only once a day. In contrast, our model requires a front-loaded time investment. Once the animals are set up, we can test multiple drugs within a few weeks, providing significant time savings. Additionally, we did not pre-screen animals in our study. Existing models often pre-select mice with high rates of spontaneous seizures, whereas in our model, seizures can be induced even in animals with few spontaneous seizures. We believe that bypassing the need for pre-screening is a key advantage of our induced seizure model.

      Reviewer 3:

      “(7) No mention or assessment of mouse sex as a biological variable.”

      Thank you for pointing this out. Both female and male animals were included in this study: Epileptic cohort: 7 males, 3 females; Naïve cohort: 3 males, 4 females.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Wilson's Disease (WD) is an inherited rare pathological condition due to a mutation in ATP7B that alters mitochondrial structure and dysfunction. Additionally, WD results in dysregulated copper metabolism in patients. These metabolic abnormalities affect the functions of the liver and can result in cholecystitis. Understanding the immune component and its contribution to WD and cholecystitis has been challenging. In this work, the authors have performed single-cell RNA sequencing of mesenchymal tissue from three WD patients and three liver hemangioma patients.

      Strengths:

      The authors describe the transcriptomic alterations in myeloid and lymphoid compartments.

      Weaknesses:

      In brief, this manuscript lacks a clear focus, and the writing needs vast improvement. Figures lack details (or are misrepresented), the results section only catalogs observations, and the discussion needs to focus on their findings' mechanistic and functional relevance. The major weakness of this manuscript is that the authors do not provide a mechanistic link between the absence of ATP7B and NK cells' impaired/altered functions. While the work is of high clinical relevance, there are various areas that could be improved.

      In this study, we reported for the first time that ATP7B mutation and the resulting metabolic abnormalities in hepatocytes cause functional alteration of immune cells in WD patients. We dissected the transcriptional profiles of liver mesenchymal cells and delineated the functional differences of main immune cells in WD patients through scRNA-seq. The NK cell exhaustion and its clinical significance were further demonstrated.

      The mechanism is indeed our main concern. Given that the ATP7B mutation is hepatocyte-specific, its effect on immune cells most probably occurs through intercellular communication rather than through the direct action of the ATP7B protein. How the ATP7B mutation disturbs metabolic homeostasis in hepatocytes, how metabolic pathways regulate the release of signaling substances, and how these substances act on NK cells all remain to be explained. Together with the findings reported here, these questions are beyond the scope of a single article, so this manuscript focuses on the novel observations.

      We sincerely appreciate the comments and have improved the manuscript based on your valuable suggestions. The mechanism is the topic of our subsequent research. We are actively pursuing it and have found that the ATP7B mutation rewires a specific metabolic pathway in hepatocytes, and that a critical metabolite acts as the mediator causing NK cell exhaustion.

      Reviewer #2 (Public Review):

      Summary:

      Wilson's disease is a rare genetic disorder caused by mutations in the ATP7B gene. Previous studies have documented that ATP7B mutations can disrupt copper metabolism, affecting brain and liver function. In this paper, the authors performed a retrospective clinical study and found that Wilson's disease has a high incidence of cholecystitis. Single-cell RNA-seq analysis revealed changes in the immune microenvironment, including the activation of immune responses and the exhaustion of natural killer cells.

      Strengths:

      A key finding of this study is that the predominant ATP7B gene mutation in the Chinese population is the 2333G>T (p. R778L) mutation. The authors reported associations between Wilson's disease and cholecystitis, as well as the exhaustion of natural killer cells.

      Weaknesses:

      The underlying mechanisms linking ATP7B mutations to cholecystitis and natural killer cell exhaustion remain unclear. Specifically, it is not yet determined whether copper metabolism alterations directly cause cholecystitis and natural killer cell exhaustion, or if these effects are secondary to liver dysfunction.

      In this study, we reported for the first time that ATP7B mutation and the resulting metabolic abnormalities in hepatocytes cause functional alteration of immune cells in WD patients. We dissected the transcriptional profiles of liver mesenchymal cells and delineated the functional differences of main immune cells in WD patients through scRNA-seq, focusing on the NK cell exhaustion and its clinical significance.

      The mechanism is indeed our main concern. Given that the ATP7B mutation is hepatocyte-specific, its effect on immune cells most probably occurs through intercellular communication, so we prioritize the study of this aspect. How the ATP7B mutation disturbs metabolic homeostasis in hepatocytes, how metabolic pathways regulate the release of signaling substances, and how these substances act on NK cells all remain to be explained. Together with the findings reported here, these questions are beyond the scope of a single article, so this manuscript focuses on the novel observations.

      We sincerely appreciate the comments. The mechanism is the topic of our follow-up study. We are actively pursuing this research and have found that the ATP7B mutation rewires a specific metabolic pathway in hepatocytes, and that a critical metabolite acts as the mediator causing NK cell exhaustion.

      Reviewer #1 (Recommendations For The Authors):

      Major:

      (1) Abstract. A major portion of this manuscript focuses on non-NK cells. Data that describes NK cell exhaustion is only minimal. Therefore, the authors should modify the abstract.

      Thank you for your valuable suggestion. We have supplemented the description of functional changes in other immune cells, and have modified the abstract (line 31-35).

      (2) Introduction. There are three paragraphs. The first paragraph discusses cholecystitis. However, there are too many repetitions, and the information is unclear. In the second part, the authors discuss NK cells and their exhaustion. The authors do not establish a clear rationale or logic linking NK cells to WD or cholecystitis. In the last paragraph, the authors describe their findings. Their correlation between NK cell exhaustion and the poor healing process of cholecystitis has no direct experimental proof.

      Thank you for your comments. We have deleted the repetitions and rephrased some sentences (line 72-74). Briefly, in the first paragraph, we proposed the significant prognostic value of immune cell dysfunction for cholecystitis. In the second paragraph, we introduced NK cell exhaustion and its potential to predict prognosis of certain diseases. In the third paragraph, we introduced that the liver is a central organ involved in metabolism and immunity, holding a large number of NK cells. Liver pathologies commonly impact the development and outcome of inflammation-associated diseases such as cholecystitis. WD was selected as a research model. In the last paragraph, we introduced our findings from clinical study, scRNA-seq, clinical samples, and bioinformatics analysis, and concluded at the end.

      (3) Results. Overall, the results section lacks clarity and a clear focus. Figure legends need to be significantly detailed. The authors make too many broad statements without any support. The authors also make too many overstatements.

      Thank you for your valuable suggestion. We have corrected the inaccurate statements and refined the figure legends in detail. All the changes are marked in the manuscript, and the related responses are described below.

      Figure 1: No information is provided about the functional impairment of ATP7B protein due to the mutation found in the cohort of Chinese patients. What does 'immune abnormalities' (line 127) mean? What is the relevance of showing liver fibrosis and copper accumulation in the eye in Figure 1c and d, respectively? Total cholesterol concentrations are still within the range in the plasma of WD patients, but the authors call it higher. ECAR has not changed in WD patients, but the authors claim it has (line 117).

      (1) All these gene mutations in WD disable the protein function and cause the same outcome. (2) We have deleted the inappropriate statement. (3) In clinical observation, we found that WD not only causes copper accumulation in hepatocytes but also leads to a variety of conditions, including liver fibrosis, Kayser-Fleischer rings, and a lower risk of hyperglycemia. We showed these together with the data on cholecystitis incidence. We think they might suggest the significance of intercellular communication between hepatocytes and other cells in the microenvironment. (4) We have deleted the inappropriate statement (line 108-110, 112-113).

      Figure 2: Did the authors use the liver mesenchymal tissue or mesenchymal cells? Figure 2 states that they used mesenchymal cells, different from liver mesenchymal tissue. Numbers within Figure 2b UMAP are not visible. Were the initial T and NK cells annotated as indicated in Figure S2 (CD3D, CD3E, CD3G)? If so, that does not include NK cells.

      (1) Liver mesenchymal cells were used for scRNA-seq. (2) It is possible that the image resolution was reduced by file compression in the submission system during the merging process. We confirm that the image resolution of all figures meets publishing requirements and that all characters on the figures are visible. You can download the figure files to view the details. (3) It was an oversight on our part that incomplete cell markers were shown in Figure S2. We have updated the markers (CD3D, CD3E, NKG7), references (Ref #53, #55, and #56), and related figures (Figure 2e and Figure S2c).

      Figure 3: The authors should change 'Case' to 'WD patients' both in the text and figures. DEGs in Figure 3C indicate a transcriptomic alteration in the B cell compartment, which the authors do not delineate. Also, the rationale and explanation for the CellChat analyses are minimal. Concluding that a change occurred within the TME with minimal data and explanations is unfair.

      Thank you for your comments. (1) We apologize for the confusion caused by the nomenclature and abbreviations used in the text and figures. In all scRNA-seq data analysis, presentation, and description, we used specific terms (CASE and CON) to refer to the groups of WD patients and controls, as well as to their cell populations. We have now unified the nomenclature throughout the text and defined the terms at first use (line 126-127), avoiding the lowercase forms to prevent confusion. (2) We have now compared the expression of key B cell genes between the two groups in the next section, “The dysfunction of main immune cells in WD patients” (line 230-235, Figure 4e, Figure S4e). (3) We have described the results of cellular communication in more detail (line 188-194). (4) We have modified the conclusion and all the related statements throughout the text (line 29-31, 82-84, 149, 194-195).

      Figure 4: This section deals with multiple cell types with minimal explanations. This section discusses various cell types, but it lacks focus. In particular, the T cell section should be separated and elaborated more in detail.

      (1) In this section, we intended to compare the functions of the main immune cells, which account for a considerable proportion of the population, instead of just showing differentially expressed genes that provide minimal information. Evaluating functional signatures, based on the integration of multiple gene expression levels, allows a direct understanding of the final outcome of the transcriptional changes. (2) Given that the main functions of T cells did not change significantly and that the changes in innate immunity were more pronounced, the T cell part is relatively short and unsuitable as a separate section.

      Figure 5: What are the distinct subsets of NK cells authors have found in the WD patients and controls? How do these subsets differ between the two groups in numbers and their transcriptomes? The presentation and labeling of Figure 5 and Supplementary Figure 5 need to be vastly improved. The pseudotime presentation in Figure 5b should be presented separately for the patients and the controls. Are the changes in gene expression presented in Figure 5a due to the change in the subset compositions? Figure 5c immuno-staining is not at all visible. A clear explanation should be given for the differences between Figure 5c and Figure 5e, where NKG2A expressions are shown. A better explanation for Figure 5d is required. Did the authors use all the antibodies with the same fluorochrome? If so, what color is that? Can the authors include the individual samples in the bar diagram in Figure 5e? Again, the data in Figure 5 is insufficient to conclude that NK cells are exhausted in WD patients. While the role of changes in the expression of T-BET and EOMES can be related to dysfunction and cellular exhaustion of NK cells, the statement made by the authors needs to be toned down as they do not test with independent experiments.

      (1) The NK cell subsets were clustered by gene expression profile and labeled by their characteristically expressed genes, using a standard algorithm in the routine procedure. They cannot be distinguished in clinical samples by one or several genes or by other sorting methods. Thus, we were not able to analyze these subsets in clinical samples. (2) We have supplemented the comparison of the numbers and transcriptomes of the three NK subtypes between the two groups (line 268-273). (3) We have checked the figures and confirmed that all characters on the figures are visible. (4) We have presented the plot separately in Figure S5d. (5) We compared the expression levels of the genes presented in Figure 5a between the two groups in the three NK subtypes and supplemented this part (line 264-268). The results were very consistent across the three subtypes, suggesting that the results in the total NK population were contributed by all three subtypes and not driven by a single subset. (6) KLRC1 is also known as NKG2A. We are sorry for not explaining this clearly, and we now use only KLRC1 throughout the text to avoid confusion. We have made a clearer and more detailed description of Figure 5c, 5d, and 5e (now labeled Figure 5b, 5c, and 5d), and have included the fluorochrome in Figure 5d (now labeled Figure 5c) and the individual values in Figure 5e (now labeled Figure 5d) (line 293-299). (7) In this section, we found upregulated expression of inhibitory receptors, downregulated expression of effector molecules, and impaired NK cell-mediated cytotoxicity in the NK cells of WD patients from scRNA-seq. We then validated the findings in clinical liver section samples and clinical blood samples by mIHC and flow cytometry, respectively. According to recent articles, exhausted NK cells are characterized by decreased production of effector cytokines (e.g., IFNγ) and impaired cytolytic activity, as well as by downregulated expression of certain activating receptors and upregulated expression of inhibitory receptors (e.g., 10.3389/fimmu.2017.00760, 10.1038/s41590-018-0132-0, 10.1038/s41467-019-09212-y, 10.1080/2162402X.2016.1264562). Therefore, we concluded that NK cells are exhausted in WD patients. (8) In the part about transcription factors, we kept the description of the objective data and deleted the statement about the contribution of transcription factors to NK exhaustion.

      Figure 6: Data presented in Figure 6 and the conclusion made in this manuscript are predictive. There is no direct testing of ATP7B in NK cells to show the functions of this gene. Extension of this to patient survival is purely speculative. As long as authors state these facts clearly in their text, it can be acceptable. However, they do not extend their conclusions to similar liver diseases.

      The ATP7B mutation is hepatocyte-specific and does not occur in any immune cells. The function of ATP7B in NK cells was not studied. We found NK exhaustion and a poor prognosis of cholecystitis in WD patients. Given that previous studies have shown that NK exhaustion is correlated with poor liver cancer prognosis, we hypothesized that NK exhaustion contributes to the poor prognosis of cholecystitis. Bioinformatics analyses confirmed our hypothesis and supported the extension of this result to other inflammatory diseases. We had no experimental data, but the result was robust in the bioinformatics analysis.

      (4) Discussion: While the authors analyzed multiple cell types, the discussion is primarily focused on NK cells. There is no clear link between copper utilization, NK cell function, and exhaustion that the authors articulate.

      Thank you for your comments. The focus of our study is NK cell exhaustion, which is experimentally proven, so we discussed this aspect. In our follow-up study, we prioritize the effect of intercellular communication and metabolic alteration on NK cell exhaustion. Excess copper is released into the circulation in some circumstances in WD patients, but they generally receive long-term de-coppering therapy to maintain intracellular copper at a non-lethal level. Thus, we tend not to consider copper a critical factor in this study. In the original manuscript, we mentioned cuproptosis and its potential as a novel target. This was likely to cause ambiguity and misunderstanding, so we deleted this part to state our point of view clearly.

      (5) Supplementary Figures: The presentation and labeling of these figures need to be changed.

      Thank you for your suggestions. We have modified the figures and confirmed that all characters on the figures are visible.

      Reviewer #2 (Recommendations For The Authors):

      It is better to test whether ATP7B mutation can directly affect immune functions.

      Thank you for your suggestions. Given that the ATP7B mutation is hepatocyte-specific, its effect on immune cells most probably occurs through intercellular communication. Thus, we prioritize the effect of intercellular communication on NK cell exhaustion, and we are actively pursuing this research.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer 1

      We would like to express our gratitude to Reviewer 1 for providing a thorough summary of our work and highlighting its strengths. With regard to the weaknesses, we are committed to improving the manuscript by making the necessary changes. First, we will specify the exact p-value in all cases.

      Regarding the discussion section, we acknowledge the feedback regarding its potential confusion. In line with the reviewer's suggestion, we will reduce the literature review and highlight our findings.

      Finally, for the preprint we did not include confounders such as HIV infection and ethnicity, as our study population did not exhibit viral infections and comprised only Hispanic individuals. We will provide a more thorough description of the study population and address these characteristics explicitly in both the methods section and the initial part of the results.

      Reviewer 2

      We thank reviewer 2 for the comments. Although it is true that several papers have described the role of the microbiome in COVID-19 severity, we firmly believe that our current work stands out. There is not much information about this association in Mediterranean countries, especially in the south of Spain. In addition, most studies describe microbiota composition only in stool or nasopharyngeal samples separately, without investigating any potential relationships between them as we do.

      (1) We agree with the reviewer that the sample size is limited. We faced the challenge of collecting the samples during the peak of the COVID-19 pandemic, when doctors and nurses were overwhelmed and not always available to recruit patients according to the inclusion criteria. Despite these constraints, we ensured that all included samples met our specified inclusion criteria and were from subjects with confirmed symptomatology.

      In addition, our main goal was to identify whether severity of the disease could be assessed through microbiota composition. Therefore we did not include a healthy group. Despite not having a large N, our results should be reproducible as they are supported by statistical analysis.

      (2) We thank the reviewer for this comment. Since our original sentence may have lacked clarity, we intend to modify it so that it conveys the intended meaning more effectively.

      Nonetheless, we remain confident in the significance of our findings. Not only have we found a correlation between microbiota and COVID severity, but we have also described how specific bacteria from each condition are associated with key biochemical parameters of clinical COVID infection.

      (3) We appreciate the feedback provided by the reviewer. In this case, we performed 16S analysis because of its cost-effectiveness compared to metagenomic approaches. Furthermore, 16S analysis has undergone refinements that ensure comprehensive coverage and depth, along with standardized analysis protocols. Unlike 16S analysis, metagenomic approaches lack software tools such as QIIME that facilitate standardization of analysis, which reduces the reproducibility of results.

      (4) We sincerely appreciate this insightful suggestion. Since simply listing associations between both microbiomes and COVID-19 severity may not be enough, we intend to discuss in our discussion how microbiota composition may be linked to the mechanisms underlying COVID-19 pathogenesis.

      (5) We are grateful for the constructive criticism and intend to rewrite our abstract to enhance clarity. Additionally, we will thoroughly review all figures and their descriptions to ensure accuracy and comprehensibility.

      Reviewer 3

      We acknowledge the annotations made by reviewer 3 and are committed to addressing all identified weaknesses to enhance the quality of our work. Our idea is to modify the methods section and figures to make them easier to understand.

      Specifically, in the case of Figure 1, we recognize an error in the description of the Bray-Curtis test. We appreciate the comment and will make the necessary changes. Moreover, there is another observation related to the description of Figure 1, which we will modify to improve its accuracy.

      For Figure 2, we are planning to add a supplementary table showing the abundance of the detected genera. In addition, we will update the manuscript text to clarify how we obtained this result.

      Regarding the clarification about "1% abundance," we want to emphasize that we are referring to relative abundance, where 1 represents 100%. To avoid confusion, we will explicitly state this in both the methods section and the figure descriptions. Besides, it is true that the statistical test employed for the analysis is not mentioned in the figure description, and we recognize that the image may be difficult to interpret. Therefore, we will modify the text and add a supplementary table displaying the abundance and p values.
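      The relative-abundance convention mentioned above can be illustrated with a minimal sketch. It assumes one reading of the response, namely that abundances are stored as fractions on a scale where 1.0 means 100%, so that a "1% abundance" cutoff corresponds to 0.01. The genus names and read counts are invented placeholders for illustration only and are not data from the study; the helper name `relative_abundance` is likewise hypothetical.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1.0."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalise a sample with zero reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Placeholder counts (NOT study data): 1000 reads split across three genera.
toy_counts = {"genus_A": 600, "genus_B": 395, "genus_C": 5}

fractions = relative_abundance(toy_counts)

# Assumed cutoff: 0.01 on the fractional scale, i.e. 1% of all reads.
THRESHOLD = 0.01
above_threshold = {t: f for t, f in fractions.items() if f >= THRESHOLD}

for taxon, frac in fractions.items():
    # Print both notations side by side to make the scale explicit.
    print(f"{taxon}: fraction={frac:.3f} ({frac * 100:.1f}%)")
```

      Reporting both the fraction and the percentage, as in the loop above, is one way to remove the ambiguity the reviewer pointed out.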

      Furthermore, we agree with the reviewer's suggestion to investigate whether the bacteria identified as potential biomarkers for each condition are specific to their respective severity index or if there is a threshold. Thus, we will reanalyze the data and include a supplementary table with the abundance of each biomarker for each condition. We will also place greater emphasis on these results in our discussion.

      Finally, in response to the reviewer's suggestion, we will revisit the nasopharyngeal-fecal axis section of the discussion. It is well described that COVID-19 induces dysbiosis in both microbiomes. Consequently, we understand that the ratio we have described could be an interesting tool for assessing the development of COVID severity, as it considers alterations in both environments. However, we acknowledge that there may be room for improvement in clarifying the significance of this intriguing finding and its implications.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      This manuscript from Schwintek and coworkers describes a system in which gas flow across a small channel (10^-4-10^-3 m scale) enables the accumulation of reactants and convective flow. The authors go on to show that this can be used to perform PCR as a model of prebiotic replication.

      Strengths:

      The manuscript nicely extends the authors' prior work in thermophoresis and convection to gas flows. The demonstration of nucleic acid replication is an exciting one, and an enzyme-catalyzed proof-of-concept is a great first step towards a novel geochemical scenario for prebiotic replication reactions and other prebiotic chemistry.

      The manuscript nicely combines theory and experiment, which generally agree well with one another, and it convincingly shows that accumulation can be achieved with gas flows and that it can also be utilized in the same system for what one hopes is a precursor to a model prebiotic reaction. This continues efforts from Braun and Mast over the last 10-15 years extending a phenomenon that was appreciated by physicists and perhaps underappreciated in prebiotic chemistry to increasingly chemically relevant systems and, here, a pilot experiment with a simple biochemical system as a prebiotic model.

      I think this is exciting work and will be of broad interest to the prebiotic chemistry community.

      Weaknesses:

      The manuscript states: "The micro scale gas-water evaporation interface consisted of a 1.5 mm wide and 250 µm thick channel that carried an upward pure water flow of 4 nl/s ≈ 10 µm/s perpendicular to an air flow of about 250 ml/min ≈ 10 m/s." This was a bit confusing on first read because Figure 2 appears to show a larger channel - based on the scale bar, it appears to be about 2 mm across on the short axis and 5 mm across on the long axis. From reading the methods, one understands the thickness is associated with the Teflon, but the 1.5 mm dimension is still a bit confusing (and what is the dimension in the long axis?) It is a little hard to tell which portion (perhaps all?) of the image is the channel. This is because discontinuities are present on the left and right sides of the experimental panels (consistent with the image showing material beyond the channel), but not the simulated panels. Based on the authors' description of the apparatus (sapphire/CNC machined Teflon/sapphire) it sounds like the geometry is well-known to them. Clarifying what is going on here (and perhaps supplying the source images for the machined Teflon) would be helpful.

      We understand. We will update the figures to better show the dimensions of the experimental chamber. We will also add a more complete figure in the supplementary information. Part of the complexity of the chamber, however, stems from the fact that the same design has also been used to create defined temperature gradients. These are not needed here, so the chamber is more complex than this experiment requires.

      We added the scheme of the whole PTFE chip to the top left corner of Figure 2, indicating the ROI shown in the fluorescence micrographs. Additionally, the channel walls are now clearly indicated by white dotted lines. The dimensions of the setup are now shown more clearly, with the total width of the channel, its height up to the gas flux channel, and its depth all indicated. We changed the figure caption accordingly; it now reads: “[…] The PTFE chip cutout in the top left corner shows the ROI used for the micrographs. The color scale is equal for both simulation and experiment and channel dimensions are 4 x 1.5 x 0.25 mm as indicated. Dotted lines visualize the location of the channel walls. […]”
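      As a rough cross-check of the flow figures quoted from the manuscript (4 nl/s ≈ 10 µm/s for water, 250 ml/min ≈ 10 m/s for air), the arithmetic can be sketched as follows. The sketch assumes that the water flows through the full 1.5 mm x 250 µm cross-section and that the mean velocity is simply volumetric flow divided by area. The gas channel cross-section is not stated in this response, so the last value is only the area implied by the quoted numbers, not a measured dimension.

```python
def mean_velocity(flow_m3_per_s, area_m2):
    """Mean flow velocity (m/s) = volumetric flow rate / cross-sectional area."""
    return flow_m3_per_s / area_m2


# Water channel cross-section from the quoted dimensions (assumed fully filled).
width_m = 1.5e-3        # 1.5 mm
thickness_m = 250e-6    # 250 µm
channel_area_m2 = width_m * thickness_m

# 4 nl/s expressed in m^3/s (1 nl = 1e-12 m^3).
water_flow_m3_s = 4e-12
water_velocity_um_s = mean_velocity(water_flow_m3_s, channel_area_m2) * 1e6

# 250 ml/min expressed in m^3/s (1 ml = 1e-6 m^3).
air_flow_m3_s = 250e-6 / 60
quoted_air_velocity_m_s = 10.0
# Cross-section that would reconcile the two quoted air-flow numbers.
implied_gas_area_mm2 = air_flow_m3_s / quoted_air_velocity_m_s * 1e6

print(f"water velocity: {water_velocity_um_s:.1f} um/s")
print(f"implied gas channel area: {implied_gas_area_mm2:.2f} mm^2")
```

      Under these assumptions the water velocity comes out close to the quoted 10 µm/s, which suggests the 1.5 mm x 0.25 mm cross-section is the one the quoted figure refers to.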

      The data shown in Figure 2d nicely shows nonrandom residuals (for experimental values vs. simulated) that are most pronounced at t~12 m and t~40-60m. It seems like this is (1) because some symmetry-breaking occurs that isn't accounted for by the model, and perhaps (2) because of the fact that these data are n=1. I think discussing what's going on with (1) would greatly improve the paper, and performing additional replicates to address (2) would be very informative and enhance the paper. Perhaps the negative and positive residuals would change sign in some, but not all, additional replicates?

      To address this, we will show two more replicates of the experiment and include them in Figure 2.

      We are seeing two effects when we compare fluorescence measurements of the experiments.

      Firstly, degassing of water causes the formation of air-bubbles, which are then transported upwards to the interface, disrupting fluorescence measurements. This, however, mostly occurs in experiments with elevated temperatures for PCR reactions, such as displayed in Figure 4.

      Secondly, due to the high surface tension of water, the interface is quite flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, leading to alterations in the circular flow fields below.

      Thus the conditions, while overall being in steady state, show some fluctuations. The strong dependence on interface shape is also seen in the simulation. However, modeling a dynamic interface shape is not so easy to accomplish, so we had to stick to one geometry setting. Again here, the added movies of two more experiments should clarify this issue.

      We performed three more replicates of the experiment and included the averaged data points, together with their respective standard deviations as error bars, in Figure 2d. Additionally, the videos of each individual repeat are now added to the supplementary files so that the reader can better understand where the strong fluctuations around half an hour come from. The figure caption was adjusted to “[…] The maximum relative concentration of DNA increased within an hour to ~30 X the initial concentration, with the trend following the simulation. Error bars are the standard deviation from four independent measurements. […]”.

      The main text was also changed to better explain how the fluctuations impact the measurements: “[…] Water continuously evaporated at the interface, but nucleic acids remained in the aqueous phase accumulating near the interface. They could only escape downward either by diffusion or by the vortex induced by the gas flowing across the interface, pushing the molecules back deeper into the bulk (See the flow lines in Fig. 2(b) taken from the simulation). As the gas flow continuously removed excess vapor, the evaporation rate remained constant. Thus, except for fluctuations, a stable interface shape should be expected. However, due to the high surface tension of water, the interface is very flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, likely in response to small fluctuations in gas pressure and spatial variations in water surface tension. This leads to alterations in the circular flow fields below (Supplementary Movie 2).

      As these fluctuations are difficult to simulate, we decided to stick with one interface shape, matching evaporation and inflow speeds. The evaporation rate at the interface was therefore set to be proportional to the vapor concentration gradient and varied spatially along the interface between 5 and 10.5 µm/s (See Suppl. Fig. VI.1(d)). Using the known diffusion coefficient of 95 µm²/s for the 63mer[9], the simulation closely matched the experimental results. In both cases, DNA accumulated in regions with circular flow patterns driven by the gas flux (Fig. 2(b), right panel).

      5 minutes after starting the experiment, the maximum DNA accumulation was 3-fold, while after one hour of evaporation, around 30-fold accumulation was observed. Due to molecules residing in very shallow volumes when directly at the interface, the fluorescence signal can vary drastically compared to measurements deeper in the bulk. This can be seen in the fluctuations between independent measurements (See Supplementary Movies 2a, 2b, 2c), especially around 0.5 h shown in Figure 2(d). The simulated maximum accumulation followed the experimental results and started saturating after about one hour (Fig. 2(d)). […]”
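      For a feel of the length scales implied by the quoted numbers, a back-of-the-envelope sketch may help. It assumes a strongly idealised one-dimensional picture in which molecules are advected toward an impermeable interface at speed v and diffuse back with coefficient D, so that the steady-state profile decays as exp(-v z / D) with decay length D / v. This is not the authors' simulation, which also includes the gas-driven vortices; it only uses the values quoted above (D = 95 µm²/s, v between 5 and 10.5 µm/s).

```python
import math

D_UM2_PER_S = 95.0  # diffusion coefficient of the 63mer quoted in the text


def decay_length_um(diffusion_um2_s, speed_um_s):
    """Decay length D / v of the idealised 1D advection-diffusion profile."""
    return diffusion_um2_s / speed_um_s


def relative_profile(z_um, diffusion_um2_s, speed_um_s):
    """Concentration at distance z from the interface, relative to z = 0."""
    return math.exp(-speed_um_s * z_um / diffusion_um2_s)


# Lower and upper evaporation speeds quoted for the simulated interface.
speeds_um_s = (5.0, 10.5)
lengths_um = {v: decay_length_um(D_UM2_PER_S, v) for v in speeds_um_s}

for v, length in lengths_um.items():
    print(f"v = {v} um/s -> decay length ~ {length:.1f} um")
```

      In this idealisation the enrichment zone is only about 9 to 19 µm thick, which is consistent with the remark that molecules at the interface reside in very shallow volumes.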

      The authors will most likely be familiar with the work of Victor Ugaz and colleagues, in which they demonstrated Rayleigh-Bénard-driven PCR in convection cells (10.1126/science.298.5594.793, 10.1002/anie.200700306). Not including some discussion of this work is an unfortunate oversight, and addressing it would significantly improve the manuscript and provide some valuable context to readers. Something of particular interest would be their observation that wide circular cells gave chaotic temperature profiles relative to narrow ones and that these improved PCR amplification (10.1002/anie.201004217). I think contextualizing the results shown here in light of this paper would be helpful.

      Thanks for pointing this out and reminding us. We apologize. We agree that the chaotic trajectories within Rayleigh-Bénard convection cells lead to temperature oscillations similar to the salt variations in our gas-flux system. Although the convection-driven PCR in Rayleigh-Bénard is not isothermal like our system, it provides a useful point of comparison and context for understanding environments that can support full replication cycles. We will add a section comparing approaches and giving some comparison into the history of convective PCR and how these relate to the new isothermal implementation.

      We added a main text paragraph after the last paragraph in section “Strand Separation Dynamics”: “[…] Rayleigh-Bénard convection cells generate similar patterns to those seen in Fig. 3(c). The oscillations in salt concentration resemble the temperature fluctuations observed in convection-based PCR reactions from earlier studies [32,33], which showed that chaotic temperature variations, compared to periodic ones, enhanced the efficiency of the PCR reaction. […]”

      Again, it appears n=1 is shown for Figure 4a-c - the source of the title claim of the paper - and showing some replicates and perhaps discussing them in the context of prior work would enhance the manuscript.

      We thank the reviewer for bringing this to our attention. We will now include the two additional repeats for the data shown in Figure 4c, while the repeats of the PAGE measurements are already displayed in Supplementary Fig. IX.2. Initially, we chose not to show the repeats in Figure 4c due to the dynamic and variable nature of the system. These variations are primarily caused by differences at the water-air interface, attributed to the high surface tension of water. Additionally, the stochastic formation of air bubbles in the inflow, despite our best efforts to avoid them, led to fluctuations in the fluorescence measurements across experiments. These bubbles cause a significant drop in fluorescence in a region of interest (ROI) until the area is refilled with the sample.

      Unlike our RNA-focused experiments, PCR requires high temperatures and degassing a PCR master mix effectively is challenging in this context. While we believe our chamber design is sufficiently gas-tight to prevent air from diffusing in, the high surface-to-volume ratio in microfluidics makes degassing highly effective, particularly at elevated temperatures. We anticipate that switching to RNA experiments at lower temperatures will mitigate this issue, which is also relevant in a prebiotic context.

      The reviewer’s comments are valid and prompt us to fully display these aspects of the system. We will now include these repeats in Figure 4c to give readers a deeper understanding of the experiment's dynamics. Additionally, we will provide videos of all three repeats, allowing readers to better grasp the nature of the fluctuations in SYBR Green fluorescence depicted in Figure 4c.

      The data from the triplicates are now added to Figure 4c, showing how air bubbles, forming through degassing at the high temperatures required for Taq polymerase, disrupt the measurement, as they momentarily dry off the channel and stop the reaction until the channel fills again. Figure caption has been adapted and now reads: “[…] Dotted lines show the data from independent repeats. Air bubbles formed through degassing can momentarily disrupt the reaction. […]”

      We additionally changed the main text to explain the experimental difficulties to the reader: “[…] In other repetitions of the reaction, this increase was sometimes even observed earlier, around the one-hour mark (dotted lines). However, air bubbles nucleated by degassing events rise and temporarily dry out the channel, interrupting the reaction until the liquid refills the channel (Supplementary Movies 4, 4b, 4c & 5). Despite our best efforts, we were unable to fully prevent this, especially given the high temperatures required for Taq polymerase activity. In an identical setting when the gas- and water flux were switched off, no fluorescence increase was found (See Fig. 4(c) red lines). Fluorescence variations are additionally caused by fluctuations in the position of the gas-water interface, as discussed earlier. […]”

      I think some caution is warranted in interpreting the PCR results because a primer-dimer would be of essentially the same length as the product. It appears as though the experiment has worked as described, but it's very difficult to be certain of this given this limitation. Doing the PCR with a significantly longer amplicon would be ideal, or alternately discussing this possible limitation would be helpful to the readers in managing expectations.

      This is a good point and should be discussed more in the manuscript. Our gel electrophoresis is capable of distinguishing between the replicated product and primer dimers. We know this because we optimized the primer and template sequences to minimize primer dimers, so that any dimers are distinguishable from the desired 61mer product. Moreover, none of the experiments performed without an added template strand showed any band in the vicinity of the product band after 4h of reaction, in contrast to the experiments with template, which is a strong argument against the presence of primer dimers.

      We added a main text section explaining this to the reader: “[…]Suppl. Fig. IX.2 shows all independent repeats of the corresponding experiments. No product was detected in any of these cases, ruling out reaction limitations such as primer dimer formation. Primer dimers would form even in the absence of a template strand and would be identifiable through gel electrophoresis. As Taq polymerase requires a significant overlap between the two dimers to bind, this would result in a shorter product compared to the 61mer used here.  […]”

      Reviewer #2 (Public review):

      Schwintek et al. investigated whether a geological setting of a rock pore with water inflow on one end and gas passing over the opening of the pore on the other end could create a non-equilibrium system that sustains nucleic acid reactions under mild conditions. The evaporation of water as the gas passes over it concentrates the solutes at the boundary of evaporation, while the gas flux induces momentum transfer that creates currents in the water that push the concentrated molecules back into the bulk solution. This leads to the creation of steady-state regions of differential salt and macromolecule concentrations that can be used to manipulate nucleic acids. First, the authors showed that fluorescent bead behavior in this system closely matched their fluid dynamic simulations. With that validation in hand, the authors next showed that fluorescently labeled DNA behaved according to their theory as well. Using these insights, the authors performed a FRET experiment that clearly demonstrated the hybridization of two DNA strands as they passed through the high Mg++ concentration zone, and, conversely, the dissociation of the strands as they passed through the low Mg++ concentration zone. This isothermal hybridization and dissociation of DNA strands allowed the authors to perform an isothermal DNA amplification using a DNA polymerase enzyme. Crucially, the isothermal DNA amplification required the presence of the gas flux and could not be recapitulated using a system that was at equilibrium. These experiments advance our understanding of the geological settings that could support nucleic acid reactions that were key to the origin of life.

      The presented data compellingly supports the conclusions made by the authors. To increase the relevance of the work for the origin of life field, the following experiments are suggested:

      (1) While the central premise of this work is that RNA degradation presents a risk for strand separation strategies relying on elevated temperatures, all of the work is performed using DNA as the nucleic acid model. I understand the convenience of using DNA, especially in the latter replication experiment, but I think that at least the FRET experiments could be performed using RNA instead of DNA.

      We only partially follow this request. The modification introduced by the two dye molecules that allow the FRET probe to report salt concentration through melting is of course much larger than the change of the backbone from RNA to DNA. For this reason we preferred the much more stable DNA construct, which can also be manufactured at lower cost and in much higher purity, including with the modifications. We think the melting temperature characteristics of RNA and DNA in this range are sufficiently well known that DNA can be used instead of RNA to probe the salt concentration in our flow cycling.

      Only under extreme conditions of pH and salt is RNA degradation through transesterification, especially under alkaline conditions, at least several orders of magnitude faster than the spontaneous degradative mechanisms acting upon DNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. The work presented in this article is however focussed on hybridization dynamics of nucleic acids. Here, RNA and DNA share similar properties regarding the formation of double strands and their respective melting temperatures. While RNA has been shown to form more stable duplex structures exhibiting higher melting temperatures compared to DNA [Dimitrov, R. A., & Zuker, M. (2004). Prediction of hybridization and melting for double-stranded nucleic acids. Biophysical Journal, 87(1), 215-226.], the general impact of changes in salt, temperature and pH [Mariani, A., Bonfio, C., Johnson, C. M., & Sutherland, J. D. (2018). pH-Driven RNA strand separation under prebiotically plausible conditions. Biochemistry, 57(45), 6382-6386.] on respective melting temperatures follows the same trend for both nucleic acid types. The diffusive properties of RNA and DNA are also very similar [Baaske, P., Weinert, F. M., Duhr, S., Lemke, K. H., Russell, M. J., & Braun, D. (2007). Extreme accumulation of nucleotides in simulated hydrothermal pore systems. Proceedings of the National Academy of Sciences, 104(22), 9346-9351.].

      Since this work is a proof of principle for the discussed environment being able to host nucleic acid replication, we aimed to avoid second order effects such as degradation by hydrolysis by using DNA as a proxy polymer. This enabled us to focus on the physical effects of the environment on local salt and nucleic acid concentration. The experiments performed with FRET are used to visualize local salt concentration changes and their impact on the melting temperature of dissolved nucleic acids.  While performing these experiments with RNA would without doubt cover a broader application within the field of origin of life, we aimed at a step-by-step / proof of principle approach, especially since the environmental phenomena studied here have not been previously investigated in the OOL context. Incorporating RNA-related complexity into this system should however be addressed in future studies. This will likely require modifications to the experimental boundary conditions, such as adjusting pH, temperature, and salt concentration, to account for the greater duplex stability of RNA. For instance, lowering the pH would reduce the RNA melting temperature [Ianeselli, A., Atienza, M., Kudella, P. W., Gerland, U., Mast, C. B., & Braun, D. (2022). Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA. Nature Physics, 18(5), 579-585.].

      (2) Additionally, showing that RNA does not degrade under the conditions employed by the authors (I am particularly worried about the high Mg++ zones created by the flux) would further strengthen the already very strong and compelling work.

      Based on literature values for hydrolysis rates of RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.], we estimate RNA to have a half-life of multiple months under the deployed conditions in the FRET experiment (High concentration zones contain <1mM of Mg2+). Additionally, dsRNA is multiple orders of magnitude more stable than ssRNA with regards to degradation through hydrolysis [Zhang, K., Hodge, J., Chatterjee, A., Moon, T. S., & Parker, K. M. (2021). Duplex structure of double-stranded RNA provides stability against hydrolysis relative to single-stranded RNA. Environmental Science & Technology, 55(12), 8045-8053.], improving RNA stability especially in zones of high FRET signal. Furthermore, at the neutral pH deployed in this work, RNA does not readily degrade. In previous work from our lab [Salditt, A., Karr, L., Salibi, E., Le Vay, K., Braun, D., & Mutschler, H. (2023). Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment. Nature Communications, 14(1), 1495.], we showed that the lifetime of RNA under conditions reaching 40mM Mg2+ at the air-water interface at 45°C was sufficient to support ribozymatically mediated ligation reactions in experiments lasting multiple hours.

      With that in mind, gaining insight into the median Mg2+ concentration across multiple averaged nucleic acid trajectories in our system (see Fig. 3c&d) and numerically convoluting this with hydrolysis dynamics from literature would be highly valuable. We anticipate that longer residence times in trajectories distant from the interface will improve RNA stability compared to a system with uniformly high Mg2+ concentrations.

      We added a new Supplementary section for this. We used the trace from Figure 3(c) and calculated the hydrolysis rate for each timestep using literature values for RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. We conclude that the conditions deployed for the experiment are not harsh on RNA, with hydrolysis rates in the range of 10⁻⁶ min⁻¹. The figure below (also now in the supplementary information) shows the hydrolysis of RNA deployed under the conditions of the experiment in Figure 3. RNA is not expected to hydrolyze under these conditions on the timescales over which a replication reaction would occur. With a half-life of around 83 days, even a prebiotically plausible – very slow – replication reaction would not be constrained by hydrolysis boundary conditions in this scenario.
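
      The bookkeeping behind such an estimate can be sketched as follows. This is a minimal illustration assuming first-order decay with a time-dependent rate constant k(t); the function names and the example trace are hypothetical placeholders, not the authors' data or code.

```python
import math

MIN_PER_DAY = 24 * 60


def surviving_fraction(rates_per_min, dt_min):
    """Fraction of intact RNA after following a trace of first-order rates.

    rates_per_min: hydrolysis rate constant at each timestep (1/min)
    dt_min: duration of one timestep (min)
    First-order decay with time-dependent rate: N/N0 = exp(-integral of k dt).
    """
    integral = sum(k * dt_min for k in rates_per_min)  # rectangle rule
    return math.exp(-integral)


def half_life_days(mean_rate_per_min):
    """Half-life ln(2)/k for a constant first-order rate, converted to days."""
    return math.log(2) / mean_rate_per_min / MIN_PER_DAY


if __name__ == "__main__":
    # Placeholder trace (NOT measured data): a strand alternating between
    # low-salt and high-salt zones, one value per minute for four hours.
    trace = [2e-6, 9.6e-6] * 120
    mean_rate = sum(trace) / len(trace)
    print(f"surviving fraction after 4 h: {surviving_fraction(trace, 1.0):.5f}")
    print(f"half-life at mean rate: {half_life_days(mean_rate):.0f} days")
```

      A time-averaged rate of about 5.8 × 10⁻⁶ min⁻¹ corresponds to the half-life of roughly 83 days quoted above, consistent with rates in the 10⁻⁶ min⁻¹ range.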

      We now refer to this supplementary section in the main text: […] In the experimental conditions used here, RNA would also not readily degrade, even if the strand enters the high salt regimes (See Suppl. Sec. IX). Using literature values for hydrolysis rates under the deployed conditions, we estimate dissolved RNA to have a half-life of around 83 days. […]

      (3) Finally, I am curious whether the authors have considered designing a simulation or experiment that uses the imidazole- or 2′,3′-cyclic phosphate-activated ribonucleotides. For instance, a fully paired RNA duplex and a fluorescently-labeled primer could be incubated in the presence of activated ribonucleotides +/- flux and subsequently analyzed by gel electrophoresis to determine how much primer extension has occurred. The reason for this suggestion is that, due to the slow kinetics of chemical primer extension, the reannealing of the fully complementary strands as they pass through the high Mg++ zone, which is required for primer extension, may outcompete the primer extension reaction. In the case of the DNA polymerase, the enzymatic catalysis likely outcompetes the reannealing, but this may not recapitulate the uncatalyzed chemical reaction.

      This is certainly on our to-do list for future experiments in this setting. Our current focus is on templated ligation rather than templated polymerization and we are working hard to implement RNA-only enzyme-free ligation chain reaction, based on more optimized parameters for the templated ligation from 2’3’-cyclic phosphate activation that was just published [High-Fidelity RNA Copying via 2′,3′-Cyclic Phosphate Ligation, Adriana C. Serrão, Sreekar Wunnava, Avinash V. Dass, Lennard Ufer, Philipp Schwintek, Christof B. Mast, and Dieter Braun, JACS doi.org/10.1021/jacs.3c10813 (2024)]. But we would first try this at an air-water interface which was shown to work with RNA in a temperature gradient [Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment, Annalena Salditt, Leonie Karr, Elia Salibi, Kristian Le Vay, Dieter Braun & Hannes Mutschler, Nature Communications doi.org/10.1038/s41467-023-37206-4 (2023)] before making the jump to the isothermal setting we describe here. We therefore understand the question, but in the past it has proven good practice to first get to know a setting with PCR and then move to RNA.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Could the authors comment on the likelihood of the geological environments where the water inflow velocity equals the evaporation velocity?

      This is an important point to mention in the manuscript; thank you for raising it. To produce a defined experiment, we pushed the water out with a syringe pump, regulated such that the flow rate matched the evaporation. We imagine that a real system would self-regulate the inflow of the water column through a more complex geometry of the gas flow, automatically matching the evaporation with the reflow of water. The interface would either recede or move closer to the gas flux, depending on whether the inflow exceeds or falls short of the evaporation rate. As the interface moves closer, evaporation speeds up, while moving away slows it down. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface in place.
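
      The stabilizing feedback described here can be illustrated with a one-variable toy model. The sketch assumes, purely for illustration, that the evaporation speed decays exponentially with the distance d between the interface and the gas flux, and that a surplus of evaporation over inflow makes the interface recede (d grows) while a surplus of inflow advances it (d shrinks). The functional form, parameter values and function names are hypothetical and are not taken from the experiments.

```python
import math


def evaporation_speed(distance_um, e_max=10.0, decay_um=50.0):
    """Hypothetical law: evaporation is fastest directly at the gas flux
    (distance 0) and falls off exponentially with distance (µm/s)."""
    return e_max * math.exp(-distance_um / decay_um)


def simulate_interface(inflow_um_s, d0_um, dt_s=0.01, steps=20000):
    """Explicit Euler integration of dd/dt = evaporation(d) - inflow.

    Returns the distance between interface and gas flux after steps*dt_s.
    """
    d = d0_um
    for _ in range(steps):
        # Net water loss moves the interface away from the gas flux,
        # net water gain moves it closer.
        d += (evaporation_speed(d) - inflow_um_s) * dt_s
        d = max(d, 0.0)  # the interface cannot pass the pore opening
    return d


if __name__ == "__main__":
    for start in (10.0, 100.0):
        end = simulate_interface(inflow_um_s=5.0, d0_um=start)
        print(f"start at {start:5.1f} µm -> settles at {end:.2f} µm")
```

      In this toy model both starting positions relax to the same distance, the one at which evaporation equals inflow, which is the self-regulation the response describes. Surface tension and the real pore geometry are not represented.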

      We have already seen hints of this dynamic in the experiments, but have so far not found a geometry within our 2-dimensional, constant-thickness design that sustains it for a longer time. A 3-dimensional water reservoir with lower frictional forces would very likely be able to do this, but it would require a full redesign towards multi-thickness microfluidics. The more we think about it, the more we envisage implementing the next version of the experiment with a real porous volcanic rock inside a humidity chamber that simulates a full 6h prebiotic day. We would then lose much of the reproducibility of the experiment, but would likely gain a mechanism by which recondensation of water as dew on a cold morning refills the water reservoirs in the rock. We apologize for digressing towards future experiments.

      We added a paragraph after the second paragraph in Results and Discussion.

      It now reads: […] For a real early Earth environment we envision a system that self-regulates the water column's inflow by automatically balancing evaporation with capillary flows. The interface adjusts its position relative to the gas flux, moving closer if the inflow is less than the evaporation rate, or receding if it exceeds it. When the interface nears the gas flux, evaporation accelerates, while moving it away slows evaporation. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface's position. […]

      (2) Could the authors speculate on using gases other than ambient air to provide the flux and possibly even chemical energy? For example, using carbonyl sulfide or vaporized methyl isocyanide could drive amino acid and nucleotide activation, respectively, at the gas-water interface.

      This is an interesting prospect for future work with this system. We thought also about introducing ammonia for pH control and possible reactions. We were amazed in the past that having CO2 instead of air had a profound impact on the replication and the strand separation [Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA, Alan Ianeselli, Miguel Atienza, Patrick Kudella, Ulrich Gerland, Christof Mast & Dieter Braun, Nature Physics doi.org/10.1038/s41567-022-01516-z (2022)]. So going more in this direction absolutely makes sense and as it acts mostly on the length-selectively accumulated molecules at the interface, only the selected molecules will be affected, which adds to the selection pressure of early evolutionary scenarios.

      Of course, in the manuscript, we use ambient air as a proxy for any gas, focusing primarily on the energy introduced through momentum transfer and evaporation. We speculate that soluble gases could establish chemical gradients, such as pH or redox potential, from the bulk solution to the interface, similar to the Mg2+ accumulation shown in Figure 3c. The nature of these gradients would depend on each gas's solubility and diffusivity. We have already observed such effects in thermal gradients [Keil, L. M., Möller, F. M., Kieß, M., Kudella, P. W., & Mast, C. B. (2017). Proton gradients and pH oscillations emerge from heat flow at the microscale. Nature Communications, 8(1), 1897.] and finding similar behavior in an isothermal environment would be a significant discovery.

      Added a paragraph in the Conclusion to showcase this: […] Furthermore we expect that other gases, such as CO2, could establish chemical gradients in this environment. Such gradients have been observed in thermal gradients before [23] and finding similar behaviour in an isothermal environment would be a significant discovery. […]

      (3) Line 162: Instead of "risk," I suggest using "rate".

      Thanks for pointing this out! Will be changed.

      Fixed.

      (4) Using FRET of a DNA duplex as an indicator of salt concentration is a decent proxy, but a more direct measurement of salt concentration would provide further merit to the explicit statement that it is the salt concentration that is changing in the system and not another hidden parameter.

      Directly observing salt concentration using microscopy is a difficult task. While there are dyes that change their fluorescence depending on the local Na+ or Mg2+ concentration, they do not operate differentially, i.e. by taking a ratio between two color channels. Only a differential readout would avoid artifacts from the dye molecules themselves being accumulated by the non-equilibrium setting. We were able to do this for pH in the past, but did not find comparable optical salt sensors. This is the reason we ended up with a FRET pair, with the advantage that we directly probe the strand separation that we are interested in anyway. Using such a differential salt dye in future work would however without a doubt enhance the understanding of not only this system, but also our thermal gradient environments.

      (5) Figure 3a: Could the authors add information on "Dried DNA" to the caption? I am assuming this is the DNA that dried off on the sides of the vessel but cannot be sure.

      Thanks to the reviewer for pointing this out. This is correct and we will describe this better in the revised manuscript.

      Added a sentence in the caption to address this: […] Fluctuations in interface position can dry and redissolve DNA repeatedly (see “Dried DNA” in right panel). […]

      (6) Figure 4b and c: How reproducible is this data? Have the authors performed this reaction multiple independent times? If so, this data should be added to the manuscript.

      The gel electrophoresis was performed in triplicate and is shown in full in the supplementary information. The data in panel c are hard to reproduce, as the interface is not static and ROI measurements are therefore difficult to average over repeats. Including the data from the independent repeats will however give the reader insight into some of the experimental difficulties, such as air bubbles that form from degassing as the liquid heats up and travel upwards to the interface, disrupting the ongoing fluorescence measurements.

      This was also pointed out by reviewer 1 and addressed there.

      (7) Line 256: "shielding from harmful UV" statement only applies to RNA oligomers as UV light may actually be beneficial for earlier steps during ribonucleoside synthesis. I suggest rephrasing to "shielding nucleic acid oligomers from UV damage.".

      Will be adjusted as mentioned.

      Fixed.

      (8) The final paragraph in the Results and Discussion section would flow better if placed in the Conclusion section.

      This is a good point and we will merge results and discussion closer together.

      Fixed.

      (9) Line 262, "...of early Life" is slightly overstating the conclusions of the study. I suggest rephrasing to "...of nucleic acids that could have supported early life."

      This is a fair comment. We thank the reviewer for their detailed analysis of the manuscript!

      Changed the phrase to: […]In this work we investigated a prebiotically plausible and abundant geological environment to support the replication of nucleic acids. […]

      (10) In references, some of the journal names are in sentence case while others are in title case (see references 23 and 26 for example).

      Thanks - this will be fixed.

      Fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study provides compelling evidence that RAR, rather than its obligate dimerization partner RXR, is functionally limiting for chromatin binding. This manuscript provides a paradigm for how to dissect the complicated regulatory networks formed by dimerizing transcription factor families.

      Dahal and colleagues use advanced SMT techniques to revisit the role of RXR in DNA-binding of the type-2 nuclear receptor (T2NR) RAR. The dominant consensus model for regulated DNA binding of T2NRs posits that they compete for a limited pool of RXR to form an obligate T2NR-RXR dimer. Using advanced SMT and proximity-assisted photoactivation technologies, Dahal et al. now test the effect of manipulating the endogenous pool size of RAR and RXR on heterodimerization and DNA-binding in live U2OS cells. Surprisingly, it turns out that RAR, rather than RXR, is functionally limiting for heterodimerization and chromatin binding. By inference, the relative pool size of various T2NRs expressed in a given cell, rather than RXR, is likely to determine chromatin binding and transcriptional output.

      The conclusions of this study are well supported by the experimental results and provide unexpected novel insights into the functioning of the clinically important class of T2NR TFs. Moreover, the presented results show how the use of novel technologies can put long-standing theories on how transcription factors work upside down. This manuscript provides a paradigm for how to further dissect the complicated regulatory networks formed by T2NRs or other dimerizing TFs. I found this to be a complete story that does not require additional experimental work. However, I do have some suggestions for the authors to consider.

      Reviewer #1 (Recommendations For The Authors):

      (1) Does the increased chromatin binding measured when the RAR levels are increased reflect a higher occupancy of a similar set of loci, or are additional loci bound? The authors could discuss this issue in the context of the published literature. Obviously, this could be addressed experimentally by ChIP-seq or a similar analysis, but this would extend beyond the main topic of this manuscript.

      We attempted to explore this experimentally using ChIP-seq with multiple RAR- and RXR-specific antibodies. Unfortunately, our results were inconclusive, as the antibody enrichment relative to the IgG control was insufficient for reliable interpretation. Specifically, our ChIP-seq enrichment levels were only around 1.5-fold, while the accepted standard for meaningful ChIP enrichment is typically at least 2-fold. Due to these technical limitations, we decided to defer these experiments for now.

      However, we agree with the reviewer that understanding whether the increased chromatin binding of RAR reflects higher occupancy at the same set of loci or binding to additional loci is a key question. In similar experiments involving the transcription factor TFEB (Esbin et al., 2024, Genes Dev, doi: 10.1101/gad.351633.124), where an increase in the SMT bound fraction occurred, both scenarios (higher occupancy at known loci and binding to additional loci) were observed in ChIP-seq. Addressing this intriguing possibility in future studies focused on RAR and RXR would therefore be interesting.

      (2) The results presented suggest convincingly that endogenous RXR is normally in excess to its binding partners (in U2OS cells). This point could be strengthened further by reducing RXR levels, e.g., by knocking out 1 allele or the use of shRNAs (although the latter method might be too hard to control). Overexpression of another T2NR might also help determine the buffer capacity of RXR.

      We appreciate the reviewer’s acknowledgment that our results convincingly demonstrate that endogenous RXR is typically in excess relative to its binding partners in U2OS cells. We agree that this conclusion could be further reinforced by experiments such as overexpression of another T2NR to test RXR's buffering capacity. We are actively pursuing follow-up experiments involving overexpression of additional T2NRs to address this question in more detail. These studies are ongoing, and we plan to explore the buffer capacity of RXR more extensively in a future manuscript.

      (3) The ~10% difference in fbound of RAR and RXR (in Figs 1 and 2), while they should be 1:1 dimers, is explained by invoking the expression of RXR isoforms. Can the authors be more specific concerning the nature of these isoforms?

      We have provided detailed information about the different T2NRs expressed in U2OS cells according to the Expression Atlas and the Human Protein Atlas Database in Supplementary Table S1. Table S1 specifically shows that both the RXRα and RXRβ isoforms are expressed in U2OS cells. Additionally, the caption of Table S1 explicitly notes the presence of isoform RXRβ in U2OS cells. In the main text, we reference Table S1 when discussing the 10% difference in fbound between RARα and RXRα, and we now suggest that the expression of RXRβ likely accounts for the observed discrepancy.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Surprising Features of Nuclear Receptor Interaction Networks Revealed by Live Cell Single Molecule Imaging", Dahal et al combine fast single molecule tracking (SMT) with proximity-assisted photoactivation (PAPA) to study the interaction between RARa and RXRa. The prevalent model in the nuclear receptor field suggests that type II nuclear receptors compete for a limiting pool of their partner RXRa. Contrary to this, the authors find that over-expression of RARa but not RXRa increases the fraction of RXRa molecules bound to chromatin, which leads them to conclude that the limiting factor is the abundance of RARa and not RXRa. The authors also perform experiments with a known RARa agonist, all trans retinoic acid (atRA) which has little effect on the bound fraction. Using PAPA, they show that chromatin binding increases upon dimerization of RARa and RXRa.

      Strengths:

      In my view, the biggest strength of this study is the use of endogenously tagged RARa and RXRa cell lines. As the authors point out, most previous studies used either in vitro assays or over-expression. I commend the authors on the generation of single-cell clones of knock-in RARa-Halo and Halo-RXRa. The authors then carefully measure the abundance of each protein using FACS, which is very helpful when comparing across conditions. The manuscript is generally well written and figures are easy to follow. The consistent color-scheme used throughout the manuscript is very helpful.

      Weaknesses:

      (1) Agonist treatment:

      The authors test the effect of all trans retinoic acid (atRA) on the bound fraction of RARa and RXRa and find that "These results are consistent with the classic model in which dimerization and chromatin binding of T2NRs are ligand independent." However, all the agonist treatments are done in media containing FBS. FBS is not chemically defined and has been found to have between 10 and 50 nM atRA (see references in PMID 32359651 for example). The addition of 1 nM or 100 nM atRA is unlikely to result in a strong effect since the medium already contains comparable or higher levels of agonist. To test their hypothesis of ligand-independent dimerization, the authors should deplete the media of atRA by growing the cells in a medium containing charcoal-stripped FBS for at least 24 hours before adding agonist.

      We acknowledge the reviewer's concern regarding the presence of atRA in FBS and agree that it may introduce baseline levels of agonist. However, in our experiments, both the 1 nM and 100 nM atRA treatments resulted in observable changes in RAR expression levels (Figure S3C). Additionally, the luciferase assays demonstrated that 100 nM atRA significantly increased retinoic acid-responsive promoter activity (Figure S1C). Given these clear responses to atRA, we believe the observed lack of effect on the chromatin-bound fraction cannot be attributed to the presence of comparable or higher levels of atRA in the FBS, as the reviewer suggests. Moreover, since our results align with the established literature and do not impact the core findings of our study, we decided not to pursue the suggested experiments with charcoal-stripped FBS in this manuscript.  

      (2) Photobleaching and its effect on bound fraction measurements:

      The authors discard the first 500 to 1000 frames due to the high localization density in the initial frames. This will preferentially discard bound molecules that will bleach in the initial frames of the movie and lead to an over-estimation of the unbound fraction.

      For experiments with over-expression of RAR-Halo and Halo-RXR, the authors state that the cells were pre-bleached and that these frames were used to calculate the mean intensity of the nuclei. When pre-bleaching, bound molecules will preferentially bleach before the diffusing population. This will again lead to an over-representation of the unbound fraction since this is the population that will remain relatively unaffected by the pre-bleaching. Indeed, the bound fraction for over-expressed RARa and RXRa is significantly lower than that for the corresponding knock in lines. To confirm whether this is a biological result, I suggest that the authors either reduce the amount of dye they use so that this pre-bleaching is not necessary or use the direct reactivation strategy they use for their PAPA experiments to eliminate the pre-bleaching step.

      As for the measurement of the nuclear intensity, since the authors have access to multiple HaloTag dyes, they can saturate the HaloTagged proteins with a high concentration of JF646 or JFX650 to measure the mean intensity of the protein while still using the PA-JFX549 for SMT. Together, these will eliminate the need to prebleach or discard any frames.

      The Janelia Fluor dyes used in our experiments are known for their high photostability (Grimm et al., 2021, JACS Au, doi: 10.1021/jacsau.1c00006). During the initial 80 ms imaging to calculate the mean nuclear intensity, the laser power was kept very low (~3%) for a brief duration (~10 seconds), in contrast to the high power (~100%) used during the tracking experiments, which span around 3 minutes. This low-power illumination does not induce significant photobleaching but merely puts the dyes in a temporary dark state. Therefore, this pre-bleaching step closely resembles the direct reactivation strategy employed in our PAPA experiments.

      To further address the reviewer's concern, we performed a frame cut-off analysis for our SMT movies of endogenous RARα-Halo and over-expressed RARα-Halo (Figure S9B). The analysis shows no significant change in the bound fraction of either endogenous or over-expressed RARα-Halo when discarding the initial 1000 frames. Based on these results, we conclude that the pre-bleaching does not lead to an overestimation of the unbound fraction, and that our experimental approach is robust.

      (3) Heterogeneous expression of the SNAP fusion proteins:

      The cell lines expressing SNAP tagged transgenes shown in Fig S6 have very heterogeneous expression of the SNAP proteins. While the bulk measurements done by Western blotting are useful, while doing single-cell experiments (especially with small numbers - ~20 - of cells), it is important to control for expression levels. Since these transgenic stable lines were not FACS sorted, it would be helpful for the reader to know the spread in the distribution of mean intensities of the SNAP proteins for the cells that the SMT data are presented for. This step is crucial while claiming the absence of an effect upon over-expression and can easily be done with a SNAPTag ligand such as SF650 using the procedure outlined for the over-expressed HaloTag proteins.

      We agree with the reviewer that there is heterogeneity in SNAP protein expression across the transgenic lines. In response to the reviewer’s suggestion, we performed the proposed experiment to assess the distribution of mean intensities for two key experimental conditions: Halo-RXRα with overexpressed RARα-SNAP and Halo-RXRα with overexpressed RARαRR-SNAP. These results again confirm that the increase in chromatin-bound fraction of Halo-RXRα is observed only in the presence of RARα capable of heterodimerizing with RXRα, supporting our main conclusion (Figure S9).

      For these experiments, we followed the same labelling procedure described in the methods section for tracking endogenous Halo-tagged proteins alongside transgenic SNAP proteins. As shown in Figure S9, for ~ 70 cell nuclei, the distribution of mean intensities is similar for both conditions, with the bound fraction of Halo-RXRα significantly increasing in the presence of RARα-SNAP compared to RARαRR-SNAP. This analysis underscores that the observed effects are indeed due to the functional differences between the two RARα variants rather than variability in expression levels.

      (4) Definition of bound molecules:

      The authors state that molecules with a diffusion coefficient less than 0.15 µm²/s are considered bound and those between 1-15 µm²/s are considered unbound. Clarification is needed on how this threshold was determined. In previous publications using saSPT, the authors have used a cutoff of 0.1 µm²/s (for example, PMID 36066004, 36322456). Do the results rely on a specific cutoff? A diffusion coefficient by itself is only a useful measure of normal diffusion. Bound molecules are unlikely to be undergoing Brownian motion, but the state array method implemented here does not seem to account for non-normal diffusive modes. How valid is this assumption here?

      We acknowledge the inconsistency in the diffusion coefficient thresholds for defining the chromatin-bound fraction used across our group’s publications. The choice of threshold or cutoff (0.1 µm²/s vs 0.15 µm²/s) is largely arbitrary and does not significantly impact the results. To validate this, we tested the effect of different cutoffs on fbound (%) for endogenously expressed Halo-tagged RARα and RXRα (Figure S10). As shown in Figure S10, there was no substantial difference in fbound (%) calculated using a 0.1 µm²/s versus 0.15 µm²/s cutoff (e.g., RARα clone c156: 47±1% vs 49±1%; RXRα clone D6: 34±1% vs 35±1%). 

      Since we have consistently applied the 0.15 µm²/s cutoff throughout this manuscript across all experimental conditions, the comparative analysis of fbound (%) remains valid. While we agree that a Brownian diffusion model may not fully capture the motion of bound molecules, our state array model accounts for localization error, which likely incorporates some of the chromatin motion features. Moreover, the distinction between bound (<0.15 µm²/s) and unbound (1-15 µm²/s) populations is sufficiently large that using a normal diffusion model is reasonable for our analysis.
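      The cutoff comparison described above can be illustrated with a minimal sketch. It assumes the state array output is available as a list of diffusion coefficients with matching occupancies; the function name `bound_fraction` and the toy spectrum are hypothetical and are taken neither from the saSPT package nor from our data.

```python
def bound_fraction(diff_coefs, occupancies, cutoff):
    """Fraction of total occupancy at diffusion coefficients below `cutoff` (um^2/s)."""
    if len(diff_coefs) != len(occupancies):
        raise ValueError("diff_coefs and occupancies must have the same length")
    total = sum(occupancies)
    if total <= 0:
        raise ValueError("occupancies must sum to a positive value")
    bound = sum(o for d, o in zip(diff_coefs, occupancies) if d < cutoff)
    return bound / total


# Toy diffusion spectrum with made-up numbers, for illustration only.
toy_d = [0.01, 0.05, 0.12, 0.5, 2.0, 8.0]          # diffusion coefficients, um^2/s
toy_occ = [0.20, 0.20, 0.05, 0.05, 0.30, 0.20]     # occupancy of each state

for cutoff in (0.10, 0.15):
    f = bound_fraction(toy_d, toy_occ, cutoff)
    print(f"cutoff {cutoff:.2f} um^2/s -> fbound = {f:.2f}")
```

      In this toy spectrum only the occupancy lying between the two cutoffs changes the result, which is why a sparsely populated 0.1-0.15 µm²/s range gives nearly identical fbound values.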

      (5) Movies:

      Since this is an imaging manuscript, I request the authors to provide representative movies for all the presented conditions. This is an essential component for a reader to evaluate the data and for them to benchmark their own images if they are to try to reproduce these findings.

      We have now included representative movies for all the SMT experimental conditions presented in the manuscript. Please see data availability section of the manuscript.

      (6) Definition of an ROI:

      The authors state that "ROI of random size but with maximum possible area was selected to fit into the interior of the nuclei" while imaging. However, the readout speed of the Andor iXon Ultra 897 depends on the size of the defined ROI. If the ROI was variable for every movie, how do the authors ensure the same sampling rate?

      We used the frame transfer mode on the Andor iXon Ultra 897 camera for our acquisitions, which allows for fast frame rate measurements without altering the exposure time between frames. Additionally, we verified the metadata of all our movies to ensure a consistent frame interval of 7.4 ms across all conditions. This confirms that the sampling rate was maintained uniformly, despite the variability in ROI size. 

      Reviewer #2 (Recommendations For The Authors):

      (1) 'Hoechst' is mis-spelled.

      We have now corrected this typo in the manuscript.

      (2) Cos7 appears in several places throughout the text. I assume this is a typo. If so, please correct it. If not, please explain if some experiments were done in Cos7 cells and kindly provide a justification for that.

      The use of Cos7 cells is intentional and not a typo. Cos7 cells have been previously utilized in studies investigating the interaction between T2NRs (Kliewer et al., 1992, Nature, doi: 10.1038/355446a0). In our study, due to technical issues with antibodies for coIP in U2OS cells, we initially used Cos7 cells for control experiments to verify that Halo-tagging of RARα and RXRα did not disrupt their interaction, by transiently expressing the constructs in Cos7 cells. Following these control experiments, we confirmed the direct interaction of endogenously expressed RAR and RXR in U2OS cells with their respective binding partners using the SMT-PAPA assay. Since these results confirmed that Halo-tagging did not interfere with RAR-RXR interactions, we chose not to repeat the coIP experiments in U2OS cells.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to investigate the stoichiometric effect between core factors and partners forming the heterodimeric transcription factor network in living cells at endogenous expression levels. Using state-of-the-art single-molecule analysis techniques, the authors tracked individual RARα and RXRα molecules labeled by HALO-tag knock-in. They discovered an asymmetric response to the overexpression of counter-partners. Specifically, the fact that an increase in RARα did not lead to an increase in RXRα chromatin binding is incompatible with the previous competitive core model. Furthermore, by using a technique that visualizes only molecules proximal to partners, they directly linked transcription factor heterodimerization to chromatin binding.

      Strengths:

      The carefully designed experiments, from knock-in cell constructions to single-molecule imaging analysis, strengthen the evidence of the stoichiometric perturbation response of endogenous proteins. The novel finding that RXR, previously thought to be a target of competition among partners, is in excess provides new insight into key factors in dimerization network regulation. By combining the cutting-edge single-molecule imaging analysis with the technique for detecting interactions developed by the authors' group, they have directly illustrated the relationship between the physical interactions of dimeric transcription factors and chromatin binding. This has enabled interaction analysis in live cells that was challenging in single-molecule imaging, proving it is a powerful tool for studying endogenous proteins.

      Weaknesses:

      As the authors have mentioned, they have not investigated the effects of other T2NRs or RXR isoforms. These invisible factors leave room for interpretation regarding the origin of chromatin binding of endogenous proteins (Recommendations 4). In the PAPA experiments, overexpressed factors are visualized, but changes in chromatin binding of endogenous proteins due to interactions with the overexpressed proteins have not been investigated. This might be tested by reversing the fluorescent ligands for the Sender and Receiver. Additionally, the PAPA experiments are likely to be strengthened by control experiments (Recommendations 5).

      We agree that this would be an interesting experiment. However, there are three technical challenges that complicate its implementation: First, as demonstrated in our original PAPA paper, dark state formation is less efficient when dyes are conjugated to Halo compared to SNAPf, making the reverse configuration less optimal. Second, SNAPf-tagged proteins have slower labeling kinetics than Halo-tagged proteins, often resulting in under-labeling of SNAPf. Third, our SNAPf transgenes were integrated polyclonally. Since background PAPA scales with the concentration of the sender-labeled protein, variable concentrations of the sender-labeled SNAPf proteins would introduce significant variability, complicating the interpretation of the background PAPA signal. Due to these concerns, we believe that performing reciprocal measurements with reversed fluorescent ligands may not yield reliable results.

      Reviewer #3 (Recommendations For The Authors):

      (1) The term "Surprising features" in the title is ambiguous and may force readers to search for what it specifically refers to. Including a word that evokes specific features might be helpful.

      Our findings contradict previous work, which suggested that chromatin binding of T2NRs is regulated by competition for a limited pool of RXR. In contrast, we found that RAR expression can limit RXR chromatin binding, but not the other way around, which challenges the existing model. This unexpected result is what we refer to as a "surprising feature" in our title, and we believe it accurately reflects the novel insights our study provides. We also think that this is clearly conveyed in our manuscript abstract, supporting the use of "Surprising features" in the title. 

      (2) p.3, line 11 - The threshold of 0.15 μm² s⁻¹ seems to be a crucial value directly linked to the value of fbound. What is the rationale for choosing this specific value? If consistent conclusions can be obtained using threshold values that are similar but different, it would strengthen the robustness of the results.

      Please refer to our response to Reviewer #2’s Public Review point 4. The threshold choice is arbitrary and doesn’t affect the overall conclusions. To test this, we compared fbound (%) values calculated using both 0.1 μm² s⁻¹ and 0.15 μm² s⁻¹ cutoffs. For example, with endogenously expressed Halo-tagged RARα (clone c156), we observed fbound values of 47±1% vs 49±1%, and for RXRα (clone D6), 34±1% vs 35±1%, respectively (Figure S10). Since we have consistently applied the 0.15 μm² s⁻¹ cutoff across all experimental conditions in this manuscript, the comparisons of fbound (%) between different conditions are robust and valid.

      (3) p.4, line 13 - "the fbound of endogenous RARα-Halo (47±1%) was largely unchanged upon expression of SNAP (47±1%)" part of the sentence is not surprising. It would make more sense if it were expressed as "the fbound of endogenous RARα-Halo (47±1%) was largely unchanged upon expression of RXRα-SNAP (49±1%), consistent with the control SNAP (47±1%).".

      We understand how the original phrasing may be confusing to the readers and have restructured the sentence as suggested by the reviewer for clarity.

      (4) p.6, line 26 - The discussion that "most chromatin binding of endogenous RXRα in U2OS cells depends on heterodimerization partners other than RARα" seems to contradict the top right figure in Figure 4. If that's the case, the binding partner for the bound red molecule might be yellow rather than blue. Given a decrease in the number of RARα molecules with an unchanged binding ratio, the total number of binding molecules has decreased. Could it be interpreted that the potential reduction in RXRα chromatin binding, accompanying the decrease in binding RARα, is compensated for by other partners?

      We agree with the reviewer that both the yellow and blue molecules in Figure 4 represent T2NRs that can heterodimerize with RXR. For simplicity, we chose to omit the depiction of RXR dimerization with other T2NRs (represented in yellow) in Figure 4. We have now included a note in the figure caption to clarify this. We plan to follow up on the buffer capacity of RXR with other T2NRs in a separate manuscript and will discuss this aspect in more detail once we have data from those experiments.

      (5) Fig. 3 - I expected that DR localizations always appear more frequently than PAPA localizations by the difference in the number of distal molecules. Why does the linear line for SNAP-RXRα in Fig. 3 B have a slope exceeding 1? Also, although the sublinearity is attributed to binding saturation, is there any possibility that this sublinearity originates from the PAPA system like the saturation of PAPA reactivation? Control samples like Halo-SNAPf-3xNLS might address these concerns.

      The number of DR and PAPA localizations depends on the arbitrarily chosen intensity and duration of green and violet light pulses. For any given protein pair, different experimental settings can result in PAPA localizations being greater than, less than, or equal to the number of DR localizations. Therefore, the informative metric is not the absolute number of DR and PAPA localizations, but rather how the ratio of PAPA to DR localizations changes between different conditions—such as between interacting pairs and non-interacting controls.

      Regarding the sublinearity, we agree that it is essential to consider whether the observed sublinearity might stem from saturation of the PAPA signal. We know of two ways in which this could occur:

      First, PAPA can be saturated as the duration of the green light pulse increases and dark-state complexes are depleted. However, this cannot explain the nonlinearity that we observe, because the duration of the green light pulse is constant, and thus the probability that a given complex is reactivated by PAPA is also constant. Likewise, holding the violet pulse duration constant yields a constant probability that a given molecule is reactivated by DR. PAPA localizations are expected to scale linearly with the number of complexes, while DR localizations are expected to scale linearly with the total number of molecules. Sublinear scaling of PAPA localizations with DR localizations thus implies that the number of complexes scales sublinearly with the total concentration of the protein.

      Second, saturation could occur if PAPA localizations are undercounted compared to DR localizations. While this is a valid concern, we consider it unlikely in this case because 1) our localization density is below the level at which our tracking algorithm typically undercounts localizations, and 2) we observe sublinearity for RXR → RAR PAPA even though the number of PAPA localizations is lower than the DR localizations; undercounting due to excessive localization density would be expected to introduce the opposite bias in this case.
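      The scaling argument above can be illustrated numerically. The sketch below is not the analysis used in the manuscript: it assumes a simple 1:1 binding equilibrium (A + B ⇌ AB), and the per-complex and per-molecule reactivation probabilities (`P_PAPA`, `P_DR`), concentrations, and Kd are hypothetical values chosen only to show how a constant reactivation probability combined with saturable binding produces a log-log slope below 1.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] for A + B <-> AB, given total concentrations and Kd."""
    s = a_total + b_total + kd
    # Numerically stable form of the smaller root of C^2 - s*C + a_total*b_total = 0.
    return 2.0 * a_total * b_total / (s + math.sqrt(s * s - 4.0 * a_total * b_total))


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


# Hypothetical constants: fixed pulse durations give fixed reactivation probabilities.
P_PAPA = 0.05   # probability that a given complex is reactivated by PAPA
P_DR = 0.02     # probability that a given molecule is reactivated by DR

a_totals = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]   # labeled protein, arbitrary units
dr = [P_DR * a for a in a_totals]                  # DR scales with total molecules

# Limiting partner: complexes saturate as the labeled protein increases.
papa_limited = [P_PAPA * complex_concentration(a, 1.0, 0.1) for a in a_totals]
# Partner in large excess: complexes scale almost linearly with the labeled protein.
papa_excess = [P_PAPA * complex_concentration(a, 1000.0, 0.1) for a in a_totals]

slope_limited = loglog_slope(dr, papa_limited)
slope_excess = loglog_slope(dr, papa_excess)
print(f"slope with limiting partner: {slope_limited:.2f}")
print(f"slope with excess partner:   {slope_excess:.2f}")
```

      Because both reactivation probabilities are constants, they shift the curves without changing their slopes; in this toy model any departure from a slope of 1 comes from the binding equilibrium alone.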

      (6) Fig. 4 - The differences between A, B, and C on the right side of the model are subtle, making it difficult to discern where to see. Emphasizing the difference in molecule numbers or grouping free molecules at the top might help clarify these distinctions.

      We appreciate the reviewer’s feedback. In response, we have revised Figure 4 by grouping the free molecules on the top right side for panels A, B and C, as suggested.

      (7) While the main results are obtained through single-molecule imaging, no single-molecule fluorescence images or trajectory plots are provided. Even just for representative conditions, these could serve as a guide for readers trying to reproduce the experiments with different custom-built microscope setups. Also, considering data availability, depositing the source data might be necessary, at least for the diffusion spectra.

      We have now included representative movies for all the presented SMT conditions as source data. Please see data availability section of the manuscript.

      (8) Tick lines are not visible on many of the graph axes. 

      We have revised the figures to ensure that the tick lines are now clearly visible on all graph axes.

      (9) Inconsistencies in the formatting are present in the methods, such as "hrs" vs. "hours", spacing between numbers and units, and "MgCl2". "u" should be "μ" and "x" should be "×". 

      We have corrected the formatting errors.

      (10) Table S4, rows 16 and 17 - Are "RAR"s typos for "RXR"s? 

      We have corrected this in the manuscript.

      (11) p.10~12 - Are three "Hoestch"s typos for "Hoechst"s? 

      This is now corrected in the manuscript.

      (12) p.11, line 17 - According to the referenced paper, the abbreviation should be "HILO" in all capital letters, not "HiLO". 

      This is now corrected in the manuscript.

      (13) "%" on p.3, line 18, and "." on p.6, line 27 are missing. 

      The missing “%” and “.” have now been added.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Yao S. and colleagues aims to monitor the potential autosomal regulatory role of the master regulator of X chromosome inactivation, the Xist long non-coding RNA. It has recently become apparent that in the human system, Xist RNA can not only spread in cis on the future inactive X chromosome but also reach some autosomal regions where it recruits transcriptional repression and Polycomb marking. Previous work has also reported that Xist RNA can show a diffused signal in some biological contexts in FISH experiments.

      In this study, the authors investigate whether Xist represses autosomal loci in differentiating female mouse embryonic stem cells (ESCs) and somatic mouse embryonic fibroblasts (MEFs). They perform a time course of ESC differentiation followed by Capture Hybridization of Associated RNA Targets (CHART) on both female and male ESCs, as well as pulldowns with sense oligos for Xist. The authors also examine transcriptional activity through RNA-seq and integrate this data with prior ChIP-seq experiments. Additional experiments were conducted in MEFs and in Xist-ΔB repeat mutants, the latter of which fail to recruit Polycomb repressors.

      Based on this experimental design, the authors make several bold claims:

      (1) Xist binds to about a hundred specific autosomal regions.

      (2) This binding is specific to promoter regions rather than broad spreading.

      (3) Xist autosomal signal is inversely correlated with PRC1/2 marks but positively correlated with transcription.

      (4) Xist targeting results in the attenuation of transcription at autosomal regions.

      (5) The B-repeat region is important for autosomal Xist binding and gene repression.

      (6) Xist binding to autosomal regions also occurs in somatic cells but does not lead to gene repression.

      Together, these claims suggest that Xist might play a role in modulating the expression of autosomal genes in specific developmental and cellular contexts in mice.

      Strengths:

      This paper deals with an interesting hypothesis that Xist ncRNA can also function at autosomal loci.

      Weaknesses:

      The claims reported in this paper are largely unsubstantiated by the data, with multiple misinterpretations, lacking controls, and inadequate statistics. Fundamental flaws in the experimental design/analysis preclude the validity of the findings. Major concerns are listed below:

      (1) The entire paper is based on the CHART observation that Xist is specifically targeted to autosomal promoters. Overall, the data analysis is flawed and does not support such conclusions. Importantly the sense WT and the 0h controls are not used, nor are the biological replicates.

      We respectfully disagree with Rev1 but nevertheless thank the reviewer for making some suggestions that helped to strengthen our manuscript.  We have provided new experiments and analyses in the revised manuscript. Please see responses below.

      Rev1 seems to have missed or misunderstood some key experiments. In fact, the sense WT and 0h controls were shown. Furthermore, we included at least two biological replicates for each experiment.

      We used both male ES cells (which do not express Xist) and sense probes as key negative controls, as outlined in Figure S1. Crucially, we only analyzed peaks that were reproducible between biological replicates. The Xist CHART peaks in differentiating female ES cells were significantly enriched above the “background” defined by the sense probe and male controls. Specifically, in comparison to undifferentiated female ES cells (day 0) where both X chromosomes are active and Xist is not induced, Xist CHART robustly pulled down the X chromosome during cell differentiation (day 4, day 7, and day 14). In contrast, male ES cells showed no significant pull-down of the X chromosome, and the sense group also exhibited markedly reduced binding (new Figure S1B). In addition, Principal Component Analysis (PCA) of CHART-seq reads (day 4 as an example), including Xist, sense, and input samples from WT and ΔRepB female cells, confirmed that the sense probe CHART was clearly distinguishable from Xist CHART signals (revised Figure S1C). Together, these findings underscore the specificity and robustness of our CHART results.

      Data is typically visualized without quantification, and when quantified, control loci/gene sets are erroneously selected. Firstly, CHART validation on the X in FigS1 is misleading and not based on any quantifications (e.g., see the scale on Kdm6a (0-190) compared to Cdkl5 (0-40)). If scaled appropriately, there is Xist signal on the escapee. 

      Rev1 may have misread the presented data. In the example raised by Rev1, Fig. S1 is inherently quantitative: e.g., a ratio is a number in Fig. S1A (now Fig. S1B) and all gene tracks in Fig. 1B-E are shown with scales. We showed X-linked genes in Fig. S1 (now Fig. S2) as a control to demonstrate that the CHART worked and that Xist accumulated over time from day 0 to day 14. Our new Figure 1B demonstrates the Xist accumulation in graph format. 

      Our paper focuses on Xist autosomal binding sites. Thus, the X-linked examples were placed in the supplement. Escapee genes do in fact accumulate Xist at their promoter regions and this finding is consistent with data published by Simon et al. (2013, Nature). It was therefore not desirable in this paper to reanalyze X-linked genes, including escapees. Nevertheless, to address the reviewer’s concerns, we present new data in new Figure S3A. Here we analyzed the density of Xist binding across X-linked genes, including both active and inactive genes, as well as escapee genes. From this quantitative analysis, it should be clear that escapees do bind Xist. However, from the metagene plots in Figure S3B, we confirm the previous conclusion that escapees bind Xist at high levels just upstream of the promoter and that there is a depletion of Xist in the escapee gene body, consistent with a barrier preventing Xist from moving into the active gene. 

      All X-linked loci should have been quantified and classified based on escape status; sense control should also be quantified, and biological replicates should be shown separately. 

      Please see above response.

      Additionally, in the revised manuscript, we examined the Irreproducible Discovery Rate (IDR) to validate the reproducibility of peaks between the two replicates, and we included a representative example from female WT ES cells at day 4 (revised Figure S4A). The results showed a strong correlation between the replicates, with an IDR threshold of 0.05 (red points > 0.05). As described in the Methods section, to ensure reliable and robust peak identification, we performed peak calling (MACS2) separately on each replicate, and then used bedtools intersect to identify peaks that overlapped between the two replicates. This stringent process, including strict q-value settings in MACS2, ensures the reliability and reproducibility of the peaks presented in this study.
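      The replicate-overlap step described above can be sketched in a few lines. This is a simplified stand-in for the interval-overlap logic, not a reimplementation of bedtools or of our pipeline; the function name `overlapping_peaks`, the tuple layout, and the toy coordinates are hypothetical.

```python
def overlapping_peaks(rep1, rep2):
    """Return the peaks in rep1 that overlap at least one peak in rep2.

    Peaks are (chrom, start, end) tuples in half-open, BED-style coordinates.
    """
    by_chrom = {}
    for chrom, start, end in rep2:
        by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for chrom, start, end in rep1:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < e2 and s2 < end for s2, e2 in by_chrom.get(chrom, [])):
            kept.append((chrom, start, end))
    return kept


# Toy peak lists with made-up coordinates.
replicate_1 = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
replicate_2 = [("chr1", 250, 400), ("chr2", 150, 300), ("chr3", 10, 90)]

reproducible = overlapping_peaks(replicate_1, replicate_2)
print(reproducible)
```

      The quadratic scan is adequate for a short toy list; a genome-scale peak set would need sorted intervals or an interval tree, which dedicated tools provide.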

      Secondly, and most importantly, Figure 1 does not convincingly show specific Xist autosomal binding. Panel A quantification is on extremely variable y-scales and actually shows that Xist is recruited globally to nearly all autosomal genes, likely indicating an unspecific signal. Again, the sense and 0h controls should have been quantified along with biological replicates. 

      Figure 1 shows heatmaps and corresponding metagenes for d0, d4, d7, and d14 female ES cells. Two biological replicates are analyzed. In our revised manuscript, we have used Pearson and Spearman correlation coefficients to measure the strength and direction of the relationship between the two biological replicates and shown that the two replicates have high reproducibility (new Figure S1A). On d0, the Xist coverage on autosomes and X chromosome is low, but there is a clear increase on d4, d7, and d14, particularly at the TSS of autosomal genes, as shown by the metagene plots in Figure 1A-B and the CHART density maps in new Figure 1E-F. We also show relative depletion of Xist signals in the male and sense negative controls.
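      For readers unfamiliar with the two replicate-agreement measures mentioned above, the sketch below computes both using only the Python standard library. It is illustrative only: the binned coverage values are made up, and the helper names (`pearson`, `average_ranks`, `spearman`) are hypothetical and are not the code used for Figure S1A.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-bin coverage for two replicates (made-up values).
rep1_cov = [1.0, 2.0, 3.0, 4.0, 5.0]
rep2_cov = [1.0, 4.0, 9.0, 16.0, 25.0]   # monotonic but nonlinear in rep1_cov

print(f"Pearson:  {pearson(rep1_cov, rep2_cov):.3f}")
print(f"Spearman: {spearman(rep1_cov, rep2_cov):.3f}")
```

      Reporting both coefficients is informative because Pearson is sensitive to linear agreement in signal magnitude, whereas Spearman depends only on whether the two replicates rank the bins in the same order.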

      Upon inspecting genome browser tracks of all regions reported in the manuscript (Rbm14, Srp9, Brf1, Cand2, Thra, Kmt2c, Kmt2e, Stau2, and Bcl7b), the signal is unspecific on all sites with the possible exception of Kmt2e. On all other loci, there is either a strong signal in the 0h ESC controls or more signal in some of the sense controls. This implies that peak calling is picking up false positive regions. How many peaks would have been picked up if the sense or the 0h controls were used for peak calling? It is likely that there would be a lot since there are also possible "peaks" (e.g., Fzd9) in control tracks. 

      The analysis cannot be performed by visual inspection. A statistical analysis must be performed to call signal above noise. This is why we performed peak-calling on two biological replicates and identified overlapping peaks using bedtools intersect to improve reliability. Significant peaks are noted as black bars under each track. As mentioned above, for our analysis, we focused on the top 100 peaks based on peak scores to ensure robustness. Xist has significantly higher signal compared to the sense probe in the Xist-autosomal peak regions (revised Figure 1E-F). Additionally, we conducted peak calling on undifferentiated ES cells (d0) and detected a significantly higher number of peaks (~600) compared to the differentiated states (d4 or d7) (~100).

      Single-cell sequencing studies have shown that about 2% of undifferentiated mESCs express detectable Xist (Pacini et al., Nat Commun, 2021). The Xist peaks in “day 0” cells may be due to the differentiating population.

      Further inspection of the data was not possible as the authors did not provide access to the raw fastq files. When inspecting results from past published experiments (Engreitz et al., 2013), the reported regions were not bound by Xist.

      On the contrary, we deposited the raw data files to GEO prior to the submission of the paper and included the reviewer link to access them. As of August 24, 2024, GEO publicly released these files, allowing for full inspection of the data. 

      Regarding the Engreitz publication, it is not recommended to compare our current study to their analysis for the crucial reason that the Engreitz study was not conducted under physiological conditions. The authors overexpressed the Xist gene in male ES cells. Because Xist RNA can silence genes in male cells as well, this ectopic overexpression normally leads to cell death — thus forcing examination of effects in a narrow time window before Xist can fully spread and act across the genome. Comparing our experiments (endogenous Xist expression in female ES cells) to the ectopic overexpression in male ES cells of Engreitz et al. should therefore not be undertaken.

      Thirdly, contrary to the authors' claim, deleting the B repeat does not lead to a loss of autosomal signal. Indeed, comparing Fig1A and Fig2B side by side clearly shows no difference in the autosomal signal, likely because the autosomal signal is CHART background. Properly quantifying the signal with separate replicates as well as the sense and 0h controls is vital. Overall current data together with published results indicate that CHART peak calling on autosomes is due to technical noise or artefacts.

      In our revised manuscript, we have included the quantitative results as mentioned above in the main and supplementary figure (new Figure 1E-F, Figure 2E-F, and S3A). The data clearly show an enrichment in the Xist CHART samples in differentiating female ES cells.

      We believe the reviewer may be comparing the original Figure 1A and Figure 2A (not Figure 2B). As mentioned above, the analysis cannot be performed by visual inspection. Please see new Figure 2E and 2F. From these data, it should be clear that deleting RepB causes a decrease in Xist targeting to autosomal loci.

      (2) The RNA-seq analysis is also flawed and precludes strong statements. Firstly, the analysis frequently lacks statistical analysis (Fig3B, FigS2B-C) and is often based on visualizations (Fig 3D-G) without quantifications. Day 4 B-repeat deletion does not lead to a significant change in the expression of genes close to Xist signal (Fig3H, d14 does not fully show). 

      Please see new revised Figure 3B and Figures S2B-C (now revised as Figures S6A and S6B). 

      Secondly, for all transcriptional analysis, it is important to show autosomal non-target genes, which is not always done. 

      In the revised manuscript, we included non-target genes for each analysis (new Figure 4E-F, 5D and 5F, 7C and 7E, S7F, S8).

      Indeed, both males and B repeat deletion will lead to transcriptional changes on autosomes as a secondary effect from different X inactivation status. The control set, if used, is inappropriate as it compares one randomly selected set of ~100 genes. This introduces sampling error and compares different classes of genes. Since Xist signal targets more active genes, it is important to always compare autosomal target genes to all other autosomal genes with similar basal expression patterns.

      Please see new Figure S8. We included 100 randomly selected non-target sites on autosomes for this comparative analysis. For consistency, we applied the same flanking regions (10 kb) in the analysis of both target and non-target genes. We believe that this selection method for non-targets is appropriate for two reasons: first, it allows us to control for Xist binding and non-binding; second, it ensures a similar number of genes in both groups, providing a robust foundation for statistical analysis.

      (3) The ChIP-seq analysis also has some problems. The authors claim that there is no positive correlation between genes close to Xist autosomal binding (10kb) compared to those 50kb away (Fig 3C, S2D); however, this analysis is based entirely on metagene visualization. Signal within the Xist binding sites should be quantified (not genes close by) and compared to other types of genomic loci and promoters. Focusing on the 50kb group only as controls is misleading.

      We believe the reviewer may have misunderstood our conclusions. As stated in the paper, we observed lower coverage of the histone marks H3K27me3 and H2AK119ub, associated with PRC2 and PRC1, respectively. Our conclusions regarding PRC1/2 support the RNA-seq results, indicating that Xist tends to bind to actively expressed genes. In other words, these genes exhibit lower levels of PRC-mediated silencing signals. This observation underscores the relationship between Xist binding and gene activity, highlighting that Xist preferentially associates with regions that are less subject to silencing by polycomb repressive complexes.

      Secondly, the authors only look at PRC mark signal upon differentiation; what about the 0h timepoint, i.e., is there pre-marking? 

      Day 0 is not an appropriate timepoint for this analysis because Xist is not yet induced. There is also a small fraction of cells (<5%) that spontaneously differentiate and start to undergo XCI. For these reasons, the day 0 timepoint is considered somewhat heterogeneous, and it would be difficult to draw conclusions regarding Xist peaks in these samples.

      Most worryingly, the data analysis is not consistent between figures (see Fig3C vs 5H-I). In Fig5, the group of Xist targets was chosen as those within 100kb of Xist binding, which would encompass all the control regions from Fig3C. In this analysis, the authors report that there is Xist-dependent H3K27me3 deposition, and in fact, here the Xist autosomal targets have more of it than the controls. Overall, all of this analysis is misleading, and clear conclusions cannot be made.

      We believe that the reviewer may have also misunderstood the analysis in Figure 5. Figure 5 shows the effect of the Xist inhibitor, X1, on H3K27me3 and gene expression. X1 reduces PRC2 targeting and gene silencing, consistent with X1’s effect on RepA as published in Aguilar et al. 2022.

      All in all, because the fundamental observation is not robust (see point 1), all subsequent analyses are also affected. There are also multiple other inconsistencies within the analysis; however, they have not been included here for brevity.

      We again respectfully disagree with Rev1 but thank the reviewer for making suggestions that helped to strengthen our manuscript.  We believe that the revised manuscript with new analyses is improved in part because of the reviewer’s critical comments.

      Reviewer #2 (Public review):

      Summary:

      To follow up on recent reports of Xist-autosome interaction, the authors examine female (and male transgenic) mESCs and MEFs by CHARTseq. Upon finding that only 10% of reads map to X, they sought to identify reproducible alternative sites of Xist binding, and identify ~100 autosomal Xist-binding sites and show a transient impact on expression.

      Strengths:

      The authors address a topical and interesting question with a series of models including developmental timepoints and utilize unbiased approaches (CHARTseq, RNAseq). For the CHARTseq they have controls of both sense probes and male cells; and indeed do detect considerable background with their controls. The use of deletions emphasizes that intact functional Xist is involved. The use of 'metagene' plots provides a visual summation of genic impact.

      Reviewer 2 has made some excellent suggestions. We have revised the manuscript accordingly and are grateful to the reviewer for the recommendations.

      Weaknesses:

      Overall, the result presentation has many 'sample' gene presentations (in contrast to the stronger 'metagene' summation of all genes). The manuscript often relies on discussion of prior X chromosomal studies, while the data generated would allow assessment of the X within this study to confirm concordance with prior results using the current methodology/cell lines. 

      Many of the 'follow-up' analyses are in fact reprocessing and comparison of published datasets. The figure legends are limited, and sample size and/or source of control is not always clear. While similar numbers of autosomal Xist-binding sites were often observed, the presented data did not clarify how many were consistent across time-points/cell types. While there were multiple time points/lines assessed, only 2 replicates were generally done.

      We apologize for the deficiencies in the legend.  The revised manuscript has corrected them.

      We generated many new datasets with deep sequencing, with at least two biological replicates for each. Such experiments are extremely expensive by nature. Thus, two biological replicates are typically considered acceptable.

      Additionally, we performed reanalysis of published datasets to test whether, in the hands of other investigators, cell lines expressing Xist also supported autosomal targeting. Figure 4 is a case in point. Here we examined Tg1 and Tg2, which respond to doxycycline to overexpress Xist from an ectopic site. Transcriptomic analysis showed significant downregulation of autosomal Xist targets, as exemplified by Rbm14 and Bcl7b (new Figure 4C, S9B). In contrast, non-targets of Xist such as Stau1 did not demonstrate significant changes in gene expression (new Figure 4E and 4G). Looking across all autosomal target genes, we observed a significant decrease in mean expression in the Xist-overexpressing cell lines (new Figure 4D). The fact that the autosomal changes were also observed in datasets generated by other investigators greatly strengthens our conclusions.

      Aim achievement:

      The authors do identify autosomal sites with enrichment of chromatin marks and evidence of silencing. More details regarding sample size and controls (both treatment, and most importantly choice of 'non-targets' - discussed in comments to authors) are required to determine if the results support the conclusions.

      Specific scenarios for which I am concerned about the strength of evidence underlying the conclusion:

      I found the conclusion "Thus, RepB is required not only for Xist to localize to the X chromosome but also for its localization to the ~100 autosomal genes" (p5) to be in contrast to the statement 2 lines prior: "A similar number of Xist peaks across autosomes in ΔRepB cells was observed and the autosomal targets remained similar". Some quantitative statistics would assist in determining impact, both on autosomes and also X; perhaps similar to the quintile analysis done for expression.

      We have added the Xist coverage panels for days 4 and 7 in the identified Xist-autosomal peak regions (new Figure 1E-F, Figure 2E-F), as mentioned above. The results clearly demonstrate that the deletion of RepB decreases Xist binding to autosomes. We also show that ΔRepB increased X-linked gene expression in our revised Figure 3D.

      It is stated that there is a significant suppression of X-linked genes with the autosomal transgenes; however, only an example is shown in Figure 4B. To support this statement, a full X chromosomal geneset should be shown in panels F and G, which should also list the number of replicates. 

      Please see new Figure 4B.

      As these are hybrid cells, perhaps allelic suppression could be monitored? Is Med14 usually subject to X inactivation in the Ctrl cells, and is the expression reduced from both X chromosomes or preferentially the active (or inactive) X chromosome?

      If Rev2 is referring to Figure 4, the dataset used in Figure 4 comes from another research group and was previously published (Loda, A. et al. Nat Commun, 2017).

      If Rev2 is referring to our ES cells, they are N2 cell lines.  The X chromosomes are fully hybridized (Cas/Mus), but the autosomes are not fully hybridized (Ogawa et al., Science, 2008). Med14 is subject to XCI and is expressed from the Xa, silenced on the Xi. 

      The expression change for autosomes after transgene induction is barely significant; and it was not clear what was used as the Ctrl? This is a critical comparator as doxycycline alone can change expression patterns.

      We agree that there was a modest change in expression after transgene induction, but it is a significant change. Again, the dataset is from a published study where the authors generated doxycycline-responsive Xist transgenes (see above). The control in this case is Dox-treated wildtype cells. We now clarify these points.

      In the discussion there is the statement. "Genetic analysis coupled to transcriptomic analysis showed that Xist down-regulates the target autosomal genes without silencing them. This effect leads to clear sex difference - where female cells express the ~100 or so autosomal genes at a lower level than male cells (Figure 7H)." This sweeping statement fails to include that in MEFs there is no significant expression difference, in transgenics only borderline significance, and at d14 no significant expression difference. The down-regulation overall seems to be transient during development while targeting is ongoing?

      Indeed, the Xist effects on autosomes seem to occur during cell differentiation in ES cells. While there is no apparent effect in MEFs, we cannot exclude effects on other somatic cells. Regardless of whether the effects are in early development or throughout life, the sex differences may have life-long effects in mammals. The study conducted in human cells by the Plath lab also concluded that the differences primarily affect stem cells.

      Finally, I would have liked to see discussion of the consistency of the identified genes to support the conclusion that the autosomal sites are not merely the results of Xist diffusion.

      We address this in the third paragraph of the Discussion. Our main argument is that if autosomal binding were caused by diffusion, then RepB deletion or X1 treatment would have led to increased binding at autosomal sites, as Xist would bind less to the X chromosome. However, as demonstrated in our study, both treatments resulted in reduced Xist binding on both the X chromosome and autosomes. This finding suggests that the binding is specific and reliant on Xist's RepA and RepB domains, rather than being a passive diffusion process.

      To examine overlap between the conditions (days of differentiation and WT/RepB cells), we generated Venn Diagrams as now shown in Figure S4E.

      The impact of Xist on autosomes is important for consideration of impact of changes in Xist expression with disease (notably cancers). Knowing the targets (if consistent) would enable assessment of such impact.

      We thank Rev2 for the very helpful review and for the forward-looking experiments. Indeed, the physiological changes brought on by autosomal targeting will be of future interest.

      Reviewer #3 (Public review):

      Summary:

      Yao et al use CHART to identify chromatin associated with Xist in female mouse ESCs, and, as control, male ESCs at various timepoints of differentiation. Besides binding of Xist to X chromosome regions they found significant binding to autosomes, concentrating mostly on promoter regions of around 100 autosomal genes, as elucidated by MACS. The authors went on to show that the RepB repeat is mostly responsible for these autosomal interactions using a female ESC line in which RepB is deleted. Evidence is provided that Xist interacts with active autosomal genes containing lower coverage of repressive marks H3K27me3 and H2AK119ub and that RepB dependent Xist binding leads to dampening of expression, but not silencing of autosomal genes. These results were confirmed by overexpression studies using transgenic ESCs with doxycycline-inducible Xist as well as via a small molecule inhibitor of Xist (X1), inducing/inhibiting the dampening of autosomal genes, respectively. Finally, using MEFs and Xist mutants RepB or RepE the authors provide evidence that Xist is bound to autosomal genes in cells after the XCI process but appears not to affect gene expression. The data presented appear generally clear and consistent and indicate some differences between human and mouse autosomal regulation by Xist. Thus, these results are timely and should be published.

      We thank Rev3 for the positive remarks and great suggestions.  We have amended the manuscript per below. 

      Strengths:

      Regulation of autosomal gene expression by Xist is a "big deal" as misregulation of this lncRNA causes developmental defects and human disease. Moreover, this finding may explain sex-specific developmental differences. The results in this manuscript identify specific mouse autosomal genes bound by Xist and decipher critical Xist regions that mediate this binding and gene dampening. The methods used in this study are appropriate, and the overall data presented appear convincing and are consistent, indicating some differences between human and mouse autosomal regulation by Xist.

      Weaknesses:

      (1) The figure legends and/or descriptions of data are often very short lacking detail, and this unnecessarily impedes the reading of the manuscript, in particular the figures would benefit not only from more detailed descriptions/explanations of what has been done but also what is shown. 

      We have included more detailed descriptions in the figure legends and throughout the manuscript.

      This will facilitate the reading and overall comprehension by the reader. One out of many examples: In Fig S1B in the CHART data at d4 and d7 there is not only signal in female WT Xist antisense but also in female sense control. For a reader that is not an expert in XCI it would be helpful to point out in the legend that this signal corresponds to the lncRNA Tsix (I suppose), that is transcribed on the other strand.

      We thank the reviewer for this excellent point.  We have amended the Results section accordingly.

      (2) Different scales are used in the lower panels of Figures 1A and 2A, which makes it difficult to directly compare signals between the different differentiation stages.

      We have included a figure combining all timepoints — d0, d4, d7, and d14 WT female Xist CHART signals  — on the X chromosome and autosomes to support our thesis. Please see new Figure 1B.

      (3) In this study some of the findings on mouse cells contrast previously published results in human ESCs: 1) Xist binding occurs preferentially to promoters in mice, not in human. 2) Binding of Xist is mostly detected in polycomb-depleted regions in mice but there is a positive correlation between Xist RNA and PRC2 marks in human ESCs. These differences are surprising but may be very interesting and relevant. While I am aware that this might be a difficult task, it would be helpful to experimentally address this issue in order to distinguish whether species specific and/or methodological differences between the studies are responsible for these differences.

      Indeed, our findings in mouse cells contrast with those observed in humans. As discussed in the manuscript, this discrepancy may be attributed to factors such as cell type, differentiation methods, and the Xist pull-down technique employed (our CHART method utilizes a 20 nt oligo library, whereas RAP uses long oligos). We agree that future work should investigate the underlying causes of these differences between mouse and human systems.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      For Figure 2: labelling ∆B on the panel A timeline (e.g. d0-∆B) would make the results clearer for the audience. Panel B makes most sense beside panel E of Figure 1, so combine here and skip in Figure 1?

      We have modified Figure 2A and thank Rev2 for this suggestion. As for the embedded tables: since we performed peak calling for WT and ∆B separately, we believe that showing both the peak numbers and their corresponding peak patterns provides a clearer representation of the data.

      I agree that at day 7 there appears to be a difference in X; but by day 14 this looks much more minimal - is it just time-shifted rather than altered? Perhaps this could be discussed. Autosomal binding sites show no change in number.

      Day 7 exhibits the strongest Xist binding on the X chromosome, consistent with the de novo establishment phase of XCI when Xist is expressed at the highest levels (300 copies/cell during de novo XCI versus ~100 copies/cell during maintenance [Sunwoo et al., 2015 as cited]). Per our RNA-seq analysis here, we also observed the highest Xist expression on day 7 and reduced levels on day 14 (Fig. S5A). This expression difference explains the reduced Xist CHART levels on day 14 compared to day 7.

      While the X has previously been examined, it would seem beneficial to conduct the same expression analyses (Figure 3) for the X (perhaps supplemental), as the authors have the data 'in hand'. I feel comparison to X in the main figure for panels A and B would fit, while a similar analysis for the X for panel C could be supplemental, presumably supporting the published data to which this data is currently compared. 

      This is a good suggestion. Please find the new data in Figures 2E-F and 3D, which demonstrate that the RepB deletion inhibits Xist binding on the X chromosome, resulting in increased X-linked gene expression, as previously mentioned. Since Xist binds across the X chromosome, we did not perform peak calling as we did for the autosomes. Therefore, applying a similar analysis as in Figures 3A-B may not be appropriate in this case.

      Such a direct comparison to X-data from the same study would be important. For panel H: How many replicates (2)? This should be in the legend. What is the change in median expression? Again, a supplemental figure showing impact on X-linked targets would be useful. Do male and female ESCs show an expression difference prior to differentiation (ie d0)? The data underlying this Figure should be in one of the supplementary tables, showing the full statistical tests and average change. The supplementary tables 8-12 list the WT target genes, not expression differences with the deletion. Again, given that the difference appears transient, might the ∆B cells be altered in rate of differentiation?

      Panel H (revised Figure 3G) includes two replicates, and this has been added to the legend. We have provided a supplementary figure demonstrating that RepB deletion increases the expression levels of X-linked genes on days 4, 7, and 14 (revised Figure 3D). Male and female ESCs show differences in the expression of X-linked genes, as both X chromosomes are active in females at this stage prior to differentiation (revised Figure S5C).

      A supplementary table with statistical tests and average change information has been included in our revised version (Table S11).

      On the other hand, these Xist-autosomal target genes displayed no significant differences between WT male, female, or ∆B female cells on day 0 — prior to onset of XCI and Xist expression. Please see new Figure 3H. 

      As for whether ∆B cells are altered in their rate of differentiation, the analysis by Colognori et al. 2019 indicates that ∆B cells differentiate similarly to WT cells. (In Figure 6 of Colognori et al. 2019, autosomal genes were expressed similarly in WT and ∆B cells, whereas XCI was affected only in ∆B cells.)

      We have also modified the legends for our supplementary tables.

      Why were the transgene lines examined upon neuronal differentiation rather than the same approach as in Figures 1-3? I would have thought neuronal differentiation might be more similar to d14, where limited changes remain? Could the authors clarify and discuss?

      We apologize for the confusion. The Tg lines in Figure 4 came from a previously published study. We performed reanalysis of published datasets because we wanted to test whether, in the hands of other investigators, cell lines expressing Xist also supported autosomal targeting. Here we examined Tg1 and Tg2, which respond to doxycycline to overexpress Xist from an ectopic site. Transcriptomic analysis showed significant downregulation of autosomal Xist targets, as exemplified by Bcl7b and Rbm14 (Figure 4C and S9B). In contrast, non-targets of Xist such as Stau1 did not demonstrate significant changes in gene expression (Figure 4E and 4F). Looking across all autosomal target genes, we observed a significant decrease in mean expression in the Xist-overexpressing cell lines (Figure 4D). The fact that the autosomal changes were also observed in datasets generated by other investigators greatly strengthens our conclusions. We have clarified this in the Results section.

      Figure 5 - the legend should specify the number of replicates and clarify the blue/green (intuitive, but not specified). Are the 'target' / 'non-target' genes from d4 Chart (but the RNA from d5)? How are 'non-targets' defined - do they match the 'targets' in certain criteria (expression level, chromatin features, GC content)? Do they change per differentiation protocol?

      We have modified the legends to clarify that the 'target' and 'non-target' genes are derived from the day 4 CHART-seq data, while the RNA data is from day 5, as that study sequenced day 5 and not day 4. Non-targets were randomly chosen based on (i) the absence of Xist binding and (ii) similar expression levels. Please see revised Figure S8.
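
      To make the idea of expression-matched controls concrete, the following is a minimal sketch of one way such non-targets could be drawn. It is not the procedure used in the study: the function name, the 20% relative tolerance, the fixed random seed, and the gene labels are all assumptions chosen for illustration.

```python
import random
from typing import Dict, List, Set


def matched_non_targets(
    expression: Dict[str, float],
    targets: Set[str],
    tolerance: float = 0.2,
    seed: int = 0,
) -> List[str]:
    """For each target gene, draw one unused non-target gene whose expression
    lies within +/- tolerance (relative) of the target's expression.

    Assumes expression values are positive (e.g. TPM). Targets with no
    eligible control are skipped.
    """
    rng = random.Random(seed)
    pool = {g for g in expression if g not in targets}
    chosen: List[str] = []
    for target in sorted(targets):  # sorted for reproducibility
        level = expression[target]
        candidates = sorted(
            g for g in pool if abs(expression[g] - level) <= tolerance * level
        )
        if candidates:
            pick = rng.choice(candidates)
            chosen.append(pick)
            pool.remove(pick)  # sample without replacement
    return chosen


# Toy expression table with hypothetical gene labels.
expression = {"T1": 10.0, "T2": 100.0, "A": 9.0, "B": 11.5, "C": 95.0, "D": 500.0}
controls = matched_non_targets(expression, targets={"T1", "T2"})
```

      Matching each target to a control of similar expression keeps the two groups the same size while avoiding a comparison between highly expressed targets and arbitrary, mostly low-expressed genes.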

      It would be helpful to compare Xist expression levels across the various models, and the MEF model could be better described - are they polyploid as often happens?

      We have included the Xist expression levels of ES cells and MEF cells in the revised version (revised Figure S5A, 6D). The transformed MEFs are indeed tetraploid, as is typical.

      For 6A to be informative, one needs to know % mapping to X in ES timeline, which is in supplemental, so perhaps 6A should also be supplemental?

      We have moved 6A to the supplemental figure.

      It is odd that ∆B seems to have had more impact in MEFs, and I would like more discussion - but I also think I am missing something: "We observed that Xist signals were more substantially reduced on both the Xi and autosomal regions in ΔRepE MEFs compared to ΔRepB cells", yet in lower panel 6 G it looks like ∆B is LOWER than ∆E? Am I misinterpreting?

      We apologize for the confusing writing.  The revised text now reads:  “To investigate, we utilized a deletion of Xist’s Repeat E (∆RepE), which was previously demonstrated to severely abrogate localization of Xist to the Xi 41,42. We reasoned that the severe loss of Xist binding might unmask a transcriptomic difference. As expected, we observed that Xist signals were somewhat more reduced on the Xi in ΔRepE MEFs compared to ΔRepB cells (Figure 6E-6F). Despite this reduction, peak coverages in autosomal target genes did not increase in ΔRepE MEFs (Figure 6E-6F). However, there was an overall decrease in the number of significant autosomal peaks in ∆RepE MEFs relative to WT cells (Figure 6A). Regardless, we observed no significant transcriptomic differences in ∆RepE MEFs relative to WT MEFs (Figure 7A-7E). Additionally, further examination of RNA sequencing data from male and female MEF cells in two published studies 43,44 corroborated that the expression levels of these autosomal Xist targets did not exhibit significant changes (Figure 7F and 7G). Altogether, the analysis in MEFs demonstrates that Xist continues to bind autosomal genes in post-XCI somatic cells. However, autosomal binding of Xist in post-XCI cells does not overtly impact expression of the associated autosomal genes. Nonetheless, we cannot exclude more subtle changes that do not meet the significance cut-off.”

      Overall, I would like to see how consistent these autosomal peaks are - I shudder to suggest Venn diagrams, but something to show whether there are day/lineage specific peaks and/or ∆repeat B/E resistant peaks. 

      We now present Venn diagrams comparing MEF, ES_d4, and ES_d7, showing approximately 50% overlap between MEF and ES cells (revised Figure S10B). This may be expected, as each timepoint is a different developmental stage of XCI, with expected gene expression differences.

      Very minor comments:

      It would be easier if the supplemental tables were tabs in 1 file!

      We will defer to the editor on how best to format the supplemental tables.

      Similar to the text, could gene names be included in the supplemental?

      We have provided gene names in the supplemental files.

      Figure 3 legend: should 'representing' be representative?

      We have modified it.

      "Xist patterns identified in human cells" p 5; it is challenging to follow human versus mouse, so specify or ensure correct use of XIST/Xist.

      Indeed, we edited the manuscript accordingly.

      Gene names should be italicized.

      We have italicized gene names in our manuscript.

      Ref. 38 lacks details (...).

      We have updated the reference.

      Peak-like characters - perhaps characteristics? P8

      We have modified this.

      Reviewer #3 (Recommendations for the authors):

      On page 6, the 6th sentence in the first paragraph needs correction. "Consistent with Xist's behavior on the X chromosome."

      We have modified the sentence. Thank you.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Longhurst et al. investigates the mechanisms of chemoresistance and chemosensitivity towards three compounds that inhibit cell cycle progression: camptothecin, colchicine, and palbociclib. Genome-wide genetic screens were conducted using the HAP1 Cas9 cell line, revealing compound-specific and shared pathways of resistance and sensitivity. The researchers then focused on novel mechanisms that confer resistance to palbociclib, identifying PRC2.1. Genetic and pharmacological disruption of PRC2.1 function, but not related PRC2.2, leads to resistance to palbociclib. The researchers then show that disruption of PRC2.1 function (for example, by MTF2 deletion), results in locus-specific changes in H3K27 methylation and increases in D-type cyclin expression. It is suggested that increased expression of D-type cyclins results in palbociclib resistance.

      Strengths:

      The results of this study are interesting and contribute insights into the molecular mechanisms of CDK4/6 inhibitors. Importantly, while CDK4/6 inhibitors are effective in the clinic, tumour recurrence is very high due to acquired resistance.

      Weaknesses:

      A key resistance mechanism is Rb loss, so it is important to understand if resistance conferred by PRC2.1 loss is mediated by Rb, and whether restoration of PRC2.1 function in Rb-deplete cells results in renewed palbociclib sensitivity. It is also important to understand the clinical implications of the results presented. The inclusion of these data would significantly improve the paper. However, besides some presentation issues and typos as described below, it is my opinion that the results are robust and of broad interest.

      Major questions:

      (1) Is the resistance to CDK4/6 inhibition conferred by mutation of MTF2 mediated by Rb?

      (2) Are mutations in PRC2.1 found in genetic analyses of tumour samples in patients with acquired resistance?

      We thank the reviewer for their editing and experimental suggestions, and have integrated their suggestions into our re-submitted manuscript.

      We also agree that understanding the role of RB1 in the proposed palbociclib resistance mechanism is of particular interest. However, as there are three RB proteins expressed in human cells, this is a technically difficult question to probe genetically. Despite this technical challenge, we have provided multiple lines of evidence in our resubmitted manuscript that the resistance to palbociclib observed in our PRC2.1-deficient cells is mediated through the canonical CDK4/6-RB1 pathway. First, disruption of RB1 in HAP1 cells results in palbociclib resistance to a level comparable to PRC2.1 disruption (Fig. 4E). Second, inactivation of SUZ12 or MTF2 increases the number of cells entering S-phase in palbociclib treatment (Fig. 4G) with no increase in basal rates of apoptosis (Fig. S2D), suggesting that any proliferation advantage observed in PRC2.1-defective cells is due to resistance to palbociclib-induced cell cycle arrest. Third, we show that overexpression of CCND1 and CCND2 is sufficient to drive resistance to palbociclib in wild-type HAP1 cells (Fig. S5F). Finally, the increased levels of CCND1 and CCND2 observed in cells lacking PRC2.1 activity result in higher CDK4/6 activity as measured by RB1 phosphorylation, despite palbociclib blockade (Fig. 6F). All these lines of evidence strongly suggest that MTF2-containing PRC2.1 regulates G1 progression through the canonical CDK4/6-RB1 pathway by repressing CCND1 and CCND2 expression.

      Whether or not MTF2 deletion leads to palbociclib resistance in clinical samples is also a question of particular interest. Currently, we are unaware of any reports that specifically mention MTF2 deletion as leading to palbociclib resistance, and we were unable to find another example in our own cancer database review. However, we have included references to other examples of MTF2 mutation resulting in chemotherapeutic resistance in our discussion. Additionally, although MTF2 is rarely observed to be mutated in cancers (Ngubo et al. 2023), it is highly differentially expressed, and investigating decreased MTF2 transcription in palbociclib-resistant tumors, though challenging, might prove fruitful. As mechanisms of palbociclib resistance are an area of active investigation, we speculate that future studies might uncover additional examples of MTF2 mediating resistance to this clinically important chemotherapeutic.

      Reviewer #2 (Public Review):

      Summary:

      Longhurst et al. assessed cell cycle regulators using a chemogenetic CRISPR-Cas9 screen in the haploid human cell line HAP1. Besides known cell cycle regulators, they identified the PRC2.1 subcomplex to be specifically involved in G1 progression, given that the absence of members of the complex makes the cells resistant to Palbociclib. They further showed that in HAP1 cells the PRC2.1, but not the PRC2.2, complex is important to repress the cyclins CCND1 and CCND2. This can explain the enhanced resistance to Palbociclib, a CDK4/6 inhibitor, after PRC2.1 deletion.

      Strengths:

      The initial CRISPR screen is very interesting because it uses three distinct chemicals that disturb the cell cycle at various stages. This screen mostly identified known cell cycle regulators, which demonstrates the validity of the approach. The results can be used as a resource for future research.

      The most interesting outcome of the experiment is the finding that knockouts of the PRC2.1 complex make the cell resistant to Palbociclib. In a further experiment, the authors focused on MTF2 and JARID2 as the main components of PRC2.1 and PRC2.2, respectively. Via extensive analyses, including genome-wide experiments, they confirmed that MTF2 is particularly important to repress the cyclins CCND1 and CCND2. The absence of MTF2 therefore leads to increased expression of these genes, sufficient to make the cell resistant to palbociclib. This result will likely be of wide interest to the community.

      Weaknesses:

      The main weakness of the manuscript is that the experiments were performed in only one cell line. To draw more general conclusions, it would be essential to confirm some of the results in other cell lines.

      In addition, some of the findings, such as the results from the CRISPR screen as well as the stronger impact of the MTF2 KO on H3K27me3 and gene expression (compared to JARID2 KO), are not unexpected, given that similar results were already obtained before by other labs.

      We thank the reviewer for their suggestions, and we believe that we have addressed their main concern about the generality of the MTF2 regulation of D-type cyclin expression in our resubmitted manuscript. We have now shown through shRNA knockdown that MTF2 represses CCND1 in two additional cell lines, the breast cancer line MDA-MB-231 and the immortalized monkey COS7 cell line (Fig. 6E). However, it is important to note that MTF2 did not control CCND1 expression in every cell line tested (Fig. 6D), underscoring the context-dependent nature of this regulation. Future studies will illuminate the cell or tumor types in which this regulation is observed.

      Additionally, while MTF2 has previously been shown to exert a greater effect on H3K27me3 levels in some circumstances (Loh et al. 2021, Rothberg et al. 2018), a number of notable reports in ES cell lines have concluded that PRC2 localization and H3K27me3 at the majority of genomic sites are dependent on both PRC2.1 and PRC2.2 activity (Healy et al. 2019, Højfeldt et al. 2019, Perino et al. 2020, Oksuz et al. 2018). Therefore, we think it is important to highlight the greater dependence on MTF2 for promoter proximal H3K27me3 levels in our transformed cell line context.  

      Reviewer #3 (Public Review):

      This study begins with a chemogenetic screen to discover previously unrecognized regulators of the cell cycle. Using a CRISPR-Cas9 library in HAP1 cells and an assay that scores cell fitness, the authors identify genes that sensitize or desensitize cells to the presence of palbociclib, colchicine, and camptothecin. These three drugs inhibit proliferation through different mechanisms, and with each treatment, expected and unexpected pathways were found to affect drug sensitivity. The authors focus the rest of the experiments and analysis on the polycomb complex PRC2, as the deletion of several of its subunits in the screen conferred palbociclib resistance. The authors find that PRC2, specifically a complex dependent on the MTF2 subunit, methylates histone 3 lysine 27 (H3K27) in promoters of genes associated with various processes including cell-cycle control. Further experiments demonstrate that Cyclin D expression increases upon loss of PRC2 subunits, providing a potential mechanism for palbociclib resistance.

      The strengths of the paper are the design and execution of the chemogenetic screen, which provides a wealth of potentially useful information. The data convincingly demonstrate in the HAP1 cell line that the MTF2-PRC2 complex sustains the effects of palbociclib (Figure 4), methylates H3K27 in CpG-rich promoters (Figure 5), and represses Cyclin D expression (Figure 6). These results could be of great interest to those studying cell-cycle control, resistance mechanisms to therapeutic cell-cycle inhibitors, and chromatin regulation and gene expression.

      There are several weaknesses that limit the overall quality and potential impact of the study. First, none of the results from the colchicine and camptothecin screens (Figures 1 and 2) are experimentally validated, which lessens the rigor of those data and conclusions. Second, all experiments validating and further exploring results from the palbociclib screen are restricted to the Hap1 cell line, so the reproducibility and generality of the results are not established. While it is reasonable to perform the initial screen to generate hypotheses in the Hap1 line, other cancer and non-transformed lines should be used to test further the validity of conclusions from data in Figures 4-6. Third, conclusions drawn from data in Figures 3D and 4D are not fully supported by the experimental design or results. Finally, there have been other similar chemogenetic screens performed with palbociclib, most notably the study described by Chaikovsky et al. (PMID: 33854239). Results here should be compared and contrasted to other similar studies.

      We thank the reviewer for their suggestions regarding our manuscript. While the genes recovered as mediating cellular responses to camptothecin and colchicine were never confirmed following our chemogenetic screens, we felt our primary findings were in the area of palbociclib resistance and decided to focus our follow-up investigations on those genes. We included the results of the camptothecin and colchicine chemogenetic screens as confirmation of the specificity of PRC2 mutation resulting in resistance to palbociclib (Fig. 4C) and for others in the community to use as a resource for future investigations. We have also clarified our results for Figure 3D and 4D in our revised manuscript, as well as included additional plots of these results (Fig. S1D-S1F). And, with our resubmitted manuscript, we believe we have addressed their concern about the generality of our results by demonstrating our primary finding that MTF2 regulates D-type cyclins in additional cell lines other than HAP1. We feel these results indicate that, while not “general”, there are additional cellular contexts in which our main result holds true. In line with this, and to address how our chemogenetic screens fit into the landscape of previous studies, including Chaikovsky et al., we have added the following lines to our discussion:  “Additionally, other chemogenetic screens utilizing palbociclib and have not identified that inactivation of PRC2 components as either enhancing or reducing palbociclib-induced proliferation defects, suggesting that PRC2 mutation is neutral in the cell lines studied. These observations not only underscore the context-dependent ramifications of mutation of these PRC2 complex members, but also may help inform the context in which CDK4/6 inhibitors are most efficacious.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) "We found that only thirteen and twenty genes resulted in sensitivity or resistance, respectively, in every conditions tested and were deemed non-specific and excluded from any further analysis (see Table S2)." It's unclear to me why these genes were deemed 'nonspecific'. Are these genes functionally important for the general exclusion of xenobiotic molecules?

      By this, we simply meant that these effects were not specific to one condition. Such genes could affect drug half-life or a general stress response, but are less likely to have functions directly tied to the pathway targeted by a drug than are genes whose loss affects only one condition.  

      (2) "Given that increased CCND1 levels is sufficient to drive increased CDK4/6 kinase activity, upregulation of these D-type cyclins is likely to be a significant contributor to the palbociclib resistance in MTF2∆ cells." It's unclear to me what is the basis for this statement. This is only true if there is free CDK4/6. If CDK4/6 is already fully occupied by D-type cyclins, then increased CCND1 levels would not be expected to have an effect. 

      While we anticipated that increased levels of CCND1 would result in more CDK4/6-D-type cyclin association, we now demonstrate in the new Figure S5F that there is more CCND1 in complex with CDK6 in both SUZ12∆ and MTF2∆ cell lines. Furthermore, we are able to show in Figure S5G that overexpression of D-type cyclins results in resistance to palbociclib-induced proliferation defects in HAP1 cells.

      (3) The description of the results is very confusing in places, especially regarding "resistance" versus "sensitivity" genes. For example: "CCNE1, CDK6, CDK2, CCND2 and CCND1, all of which are integral to promoting the G1/S phase transition, ranked as the 2nd, 24th, 27th, 29th and 46th most important genes for palbociclib resistance, respectively (Figures 1F and 1G). CCND1 and CCND2 bind either CDK4 or CDK6, the molecular targets of palbociclib, whereas CDK2 and CCNE1 form a related CDK kinase that promotes the G1/S transition.

      Similarly, cells with sgRNAs targeting RB1, whose phosphorylation by CDK4/6 is a critical step in G1 progression, displayed substantial resistance to palbociclib." My reading of this paragraph suggests that disruption of the CDK6 locus is associated with palbociclib resistance - surely this is a typo and instead should have been sensitivity? Please explain.

      We thank the reviewer for pointing this out and have corrected this typo.

      (4) "Sensitivity to palbociclib was enhanced in cells expressing sgRNAs targeting H4 acetylation, positive regulators of Pol II transcription, and regulators of the DNA Damage Response pathway (Figures 3A and 3B), although this sensitivity was much weaker than that seen with DNA damaging agents. This observation is consistent with long-term treatment with palbociclib inducing DNA damage, as has been suggested by a number of recent publications 65,66." This is also consistent with recent work on Cdk7 inhibitors (Wilson et al. Mol Cell 2023), as Cdk7 inhibition is expected to affect both CDK1/2/4/6 activities and Pol II transcription.

      We thank the reviewer for bringing this observation to our attention and we have added this citation to this passage in our manuscript.

      (5) Figure 3D - would it not make sense to plot the data such that palbo concentration is on the x-axis? It is also difficult to interpret since the data are normalized to starting "% proliferation" at the indicated palbo treatment, when it is likely that % proliferation changes significantly with palbo concentration. Indeed, this is the graphing format used for a later figure (Figure 4D). The data with rotenone suggests palbo antagonizes rotenone-mediated reduction in proliferation. But it's unclear to me whether the graph shows the converse - that rotenone treatment modulates palbo-induced cell cycle arrest.

      The reviewer is correct that increasing doses of palbociclib in the absence of oxidative phosphorylation inhibitors do indeed have an effect on proliferation. However, it is helpful to normalize proliferation values to each initial dose of palbociclib and then compare this to the different oxidative phosphorylation inhibitor treatment combinations. To illustrate that the oxidative phosphorylation inhibitors do indeed antagonize palbociclib-induced proliferation defects, we have now included the data graphed as each oxidative phosphorylation inhibitor vs palbociclib as Supplemental Figures S1D-S1F.

      • The highest concentration of GSK126 tested (5µM) does not appear to confer resistance, but perhaps this is due to off-target effects or cytotoxicity?

      We agree with the reviewer that the highest dose of GSK126 does not appear to confer resistance at low doses of palbociclib. However, it does appear to have this effect at higher doses of palbociclib. We have included a statement in our results section to address this reviewer’s observations.

      • Disruption of Emi1 leads to resistance (Figure 1F, FZR1), yet overexpression induces resistance (Mouery et al. bioRxiv 2023). Explain.

      We do not understand why EMI1 responds in this way, and therefore we cannot comment on this in the text. 

      Typos/stylistic comments:

      • Typo "However, the net result of these opposing effects on cell cycle progression, and the contribution of the individual subcomplexes to this regulation, rained unclear."

      We thank the reviewer for pointing this out, and we have corrected it.  

      • Use of the word "growth" - I think the authors should be more precise. Is "proliferation" meant here?

      We thank the reviewer for pointing this out, and we have corrected it.

      • In Figure 4G, two of the panels have 8.42%. Is this correct, or may it be a copy/paste error?

      This was an error, but it is no longer relevant as we have repeated and reanalyzed this experiment.

      Reviewer #2 (Recommendations For The Authors):

      Major Points

      (1) Some of the conclusions should be confirmed in additional cell lines. I would suggest testing the resistance to Palbociclib in several additional cell lines, where MTF2 and JARID2 are deleted. If the conclusion can be generalized, one would expect that the differential role of MTF2 versus JARID2 can be confirmed in more cell lines.

      While the PRC2.1-dependent repression of D-type cyclins does not appear to be general, we have now demonstrated in Figures S5E and 6F that there are multiple different cellular contexts in which our observations are consistent. Specifically, we demonstrate that GSK126 causes upregulation of CCND1 in both immortalized nontumor cells (COS7 cells) and in the breast cancer cell line MDA-MB-231. Moreover, in both cases we showed that this effect is PRC2.1-dependent, as shRNA knockdown of MTF2 increases expression of CCND1.

      (2) In addition, it may be attractive to make use of publicly available RNA-seq data of MTF2 and JARID2 knockout/down cells, to investigate the generality of the finding that PRC2.1 regulates CCND1 and CCND2.

      While it would be useful to address this issue, Figure S5E demonstrates that the repression of D-type cyclin expression by PRC2.1 is context dependent. Furthermore, prior to identifying the lines shown in Figures 6F and S5E, we were not aware of which lines to focus our investigations on. However, we have now demonstrated a few cellular contexts in which either chemical inhibition of PRC2 or knockdown of MTF2 results in de-repression of CCND1 expression.

      (3) At a bare minimum the authors should strongly discuss the limitations of the study, and tone down the conclusions.

      We would agree with this based upon the data in the originally submitted manuscript; however, now that we have shown that this effect is more general, this is less critical. That said, we do not see this effect in all cell lines, and we have made this apparent in the final version of the manuscript.

      Minor point

      (1) In my view, Figures 1-3 should be shortened to the most essential points, and some data/figures should be moved to the supplementary figures. Especially the STING gene-network graphs are in my view not particularly meaningful.

      While we understand the opinion of this reviewer, we feel that these data will be of significant interest to some readers.  

      (2) Figure 6E and 6F/G appear to be largely redundant. This can perhaps be made more concise.

      This has been addressed in the new version of Figure 6.

      (3) Figure 5D should be enlarged. 

      We thank the reviewer for this suggestion and have enlarged the image.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript could be edited to improve clarity. In several places, the scientific logic motivating an experiment is confusing, and there are several hypotheses and conclusions that seem opposite from what the data are suggesting. Some aspects of the figures were also unclear. Specific examples include the following:

      (1) Last sentence of abstract : "Our results demonstrate a role for PRC2.1, but not PRC2.2, in promoting G1 progression." Data show that knockout of PRC2.1 components promotes G1 progression through upregulation of CycD, so the conclusion here is the opposite.

      We thank the reviewer for catching this error. We have now changed this to “in antagonizing G1 progression”.

      (2) In the second paragraph of the results, CCNE1, CDK2, etc are described as scoring high for palbociclib resistance, but those genes scored as sensitizing. Also, in that paragraph, it is described that a drug is sensitizing cells to loss of a gene, which seems like incorrect logic. It should be clarified that knock-out of a gene either sensitizes or desensitizes cells to the drug.

      We thank the reviewer for catching this error. We have now corrected it.  

      (3) In the motivation for the experiment in Figure 3D, it is written: "we asked whether chemical inhibition of oxidative phosphorylation could rescue sensitivity to palbociclib". Considering that knock-out of genes that mediate oxidative phosphorylation confer resistance to palbociclib, it is confusing why it was expected that chemical inhibitors would restore sensitivity.

      We are sorry if the original wording was confusing. We have now changed this to “combined inhibition of oxidative phosphorylation and CDK4/6 activity mutually rescue the proliferation defect imposed by agents targeting the other process”.  

      (4) If the intention of Figure 3D is to test the hypothesis that chemical inhibition of oxidative phosphorylation modulates sensitivity to palbociclib, the clarity of Figure 3D would be improved if data were shown such that palbociclib concentration is on the x-axis and the different curves are different drug concentrations.

      It appears that there is some mutual suppression, in which inhibition of each process partly rescues cells from inhibition of the other. In fact, with these drugs the stronger of the two effects is the rescue from mitochondrial poisons by palbociclib. We have now discussed this in the text.

      (5) The authors should check the units on the x-axis in Figure 4D, should they be log[uM Palbo] or log [nM Palbo]?

      We thank the reviewer for catching this error. We have now corrected it.

      (6) It should be clarified which data are summarized in the graph to the right in Figure 4G, are these experiments with palbociclib?

      This is currently included in the figure legends.

      (7) The text suggests that the control CCNE1 knockout is shown in Figure 4E, but those data are missing.

      This has been corrected in Figure 4E.

      Several conclusions are not well supported by the data and should be revised or more data and analysis should be added.

      (1) The titular conclusion that the "PRC2.1 Subcomplex Opposes G1 Progression through Regulation of CCND1 and CCND2" has only been demonstrated in the context of a Cdk4/6 inhibitor in HAP1 cells. There is little evidence supporting this claim that is broadly applicable. For example, data in Figure 4G show small and not demonstrable significant differences in G1 and S phase populations in the mock experiments. Also, experiments in other cells are needed to support the rigor and generality of the conclusion.

      Our chemogenetic screen and competitive proliferation assay data in Figures 4A, 4C and 4E support the conclusion that PRC2.1 and PRC2.2 play opposing roles in G1 progression. Furthermore, we have repeated the initial BrdU incorporation experiments shown in Figure 4G and have been able to demonstrate that JARID2∆ cells do indeed display a significant decrease in cells entering S-phase when treated with palbociclib. Most importantly, in Figures 6D and 6E we show additional cell lines where this is the case. Therefore, we feel that this title is valid in the current version of the manuscript, where we have shown it to be the case in multiple tumor-derived human cell lines as well as immortalized non-human primate cells.

      (2) It is unclear how the data in Figure 3D support the conclusion that the administered inhibitors of oxidative phosphorylation influence response to palbociclib.

      As noted in the response to point 4, we have now discussed this mutual rescue more thoroughly in the text.  

      (3) In Figure 4D, the IC50 values should be calculated and statistical significance based on biological replicates should be determined. Also, the conclusion that "increasing doses of GSK126 withstood palbociclib-induced growth suppression" is overstated, as ultimately all drug conditions succumb to palbociclib suppression of proliferation, although there may be differences in sensitivity.

      We have now included a statistical analysis of each data point in Figure 4D.

      Editorial comments:

      (1) The title does not seem to optimally capture the content of the paper. Please consider changing it, e.g. focusing on palbociclib resistance. 

      While we used this particular drug to make the original observation, we feel it is more general to discuss the underlying biology (cyclin gene control) than the pharmacological methodology. Moreover, we have now extended our findings about the regulation of D-type cyclins by PRC2.1 to several cell lines, derived from both cancers and primary cells, reinforcing the fact that this effect is observed more broadly.

      (2) Please indicate the biological system (haploid human HAP1 cells) in either title or abstract.

      The abstract now indicates that we have observed this in CML, breast cancer and immortalized primary cells.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      The authors aim to investigate the relationship between low estrogen levels, postmenopausal hypertension, and the potential role of the molecule L-AABA as a biomarker for hypertension. By employing metabolomic analysis and various statistical methods, the study seeks to understand how estrogen deficiency affects blood pressure and identify key metabolites involved in this process, with a particular focus on L-AABA.

      Strengths:

      The study addresses a relevant and understudied area: the role of estrogen and metabolites in postmenopausal hypertension. It presents a novel hypothesis that L-AABA may serve as a protective factor against hypertension, which could have significant clinical implications if proven.

      We appreciate the acknowledgment of our study’s focus on an important and understudied area. Our hypothesis regarding L-AABA’s role as a possible protective factor against hypertension indeed holds promise for advancing clinical implications.

      Weaknesses:

      The evidence linking L-AABA to hypertension is largely correlative, lacking experimental validation or mechanistic proof. Key limitations, such as the inadequacy of the ovariectomy model in replicating human menopause, are acknowledged but not addressed with alternative approaches. In summary, while the study offers an intriguing hypothesis, its conclusions are premature and require further experimental validation and human data to substantiate the claims.

      We recognize the limitations regarding the correlative nature of our findings and the inadequacy of the OVX model in replicating human menopause. Future research will prioritize experimental validation and incorporate human studies to solidify our conclusions.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Dr. Yao Li et al. documented the metabolomic profile of the aorta from OVX rats and that from OVX plus E2. These conditions mimic post-menopause hypertension and hormonal replacement therapy.

      Strengths:

      The authors state that this is probably the first study to examine the metabolic changes in the aorta of post-menopause hypertension.

      As pointed out by the reviewer, our study may be the first to investigate changes in aortic metabolism in postmenopausal hypertension. As an exploratory study, our goal is to depict the overall characteristics and explore possible research directions.

      Weaknesses:

      There are several weaknesses, and a few of them are quite serious.

      (1) The aorta is not a resistant artery and has little to do with hypertension. The authors should have used resistant arteries for this study. The expression of several adrenergic receptors and cholinergic receptors in the aorta and resistant arteries are different. It is unknown whether the aorta metabolomic profile has any relevance to BP and whether they are similar to that of the resistant arteries. I understand the logistics issue of obtaining enough tissues from resistant arteries. At least, once some leads are discovered in the aorta, the authors should validate it in resistant arteries. This should be feasible.

      We acknowledge the limitation of using the aorta and will aim to include studies on resistant arteries to validate our metabolomic findings.

      (2) The aorta and all the arteries have three layers. It is critically important to know whether the metabolic changes occur in the intima or in the media, while the adventitia probably has little to do with vasoconstriction and hypertension. If the authors want to use the aorta to conduct the preliminary study, they should completely remove the adventitia and then use samples with and without their endothelium stripped and then assess their metabolomic profiles. After the leads are obtained from this preliminary profiling, they should be validated in endothelium and smooth muscles of the resistant artery. The current experiments are not appropriately designed.

      Future studies will involve detailed profiling of specific arterial layers, focusing on the intima and media to enhance the relevance of our findings related to hypertension.

      (3) The tail-cuff BP measurement is a technique of the last century. The current gold standard of BP measurement is by telemetry. The tail-cuff method is particularly problematic in this study because the 1-2 h restraint of the rats for more than 10 BP measurements will cause significant stress in the animals, and their stress hormone secretion might bias the metabolomic profiles in the OVX versus sham-operated animals. The problem can be totally avoided by using telemetry.

      We appreciate the suggestion and will consider telemetry for more accurate blood pressure measurements in future experiments to minimize stress-related bias.

      (4) Although the L-AABA showed a high p-value (10^-4) of a decrease in the OVX rats, the fold change is small (2-3 folds). Such a small change should be validated using a different method to be convincing.

      We plan to employ additional methods to validate the observed changes in L-AABA levels in the following research, ensuring robustness of our findings.

      (5) The authors claim (or hypothesize) that the reduced AABA level in OVX can cause vascular remodeling. This can be easily validated by the histology of the OVX-resistant artery, and they should do that during the revision. The authors should also examine the M1 macrophage function from the OVX mice to validate their claimed link of AABA to M1.

      We intend to conduct histological analyses and examine M1 macrophage function in OVX-resistant arteries to validate our hypothesis in the following research.

      (6) As mentioned above, the authors need to pinpoint the changes of AABA to target cells, i.e., endothelial cells, SMC, or M1, and then use in vitro or in vivo cell biology approaches to assess whether these cells in the OVX rat indeed have an abnormality in function and, indeed, such functional changes are responsible for the BP phenotype.

      Addressing these points, we aim to pinpoint specific cell types affected by AABA variations and conduct in vitro and in vivo studies to examine their physiological impacts in the following research.

      (7) The results of the current study can be condensed into 1 or 2 figures that can serve as a base or a starting point for a deeper scientific study.

      Thank you for your suggestion. As this is an omics study, our research approach may differ from that of traditional mechanistic studies.

      Summary

      The experimental design of this manuscript is inappropriate, and the methods are not up to the current standards. The whole study is descriptive and rudimentary. It lacks validation and mechanism. The data from this manuscript might be of some value and can serve as the first step for more investigation of the mechanism of post-menopause hypertension.

      Reviewer #3 (Public review):

      Summary:

      The decrease in estrogen levels is strongly associated with postmenopausal hypertension. Dr. Yao Li and colleagues aimed to investigate the metabolomic mechanisms underlying postmenopausal hypertension using OVX and OVX+E2 rat models. They successfully established a correlation between reduced estrogen levels and the development of hypertension in rats. They identified L-alpha-aminobutyric acid (AABA) as a potential marker for postmenopausal hypertension. The research explored the metabolic alterations in aortic tissues and proposed several potential mechanisms contributing to postmenopausal hypertension.

      Strengths:

      The group performed a comprehensive enrichment analysis and various statistical analyses of the metabolomics data.

      As summarized by the reviewer, our current study conducted a comprehensive analysis of metabolomics data. It is also a reliable foundation for further mechanism research.

      Weaknesses:

      (1) The manuscript is descriptive in nature, although they mentioned their primary objective is to explore the potential mechanisms linking low estrogen levels with postmenopausal hypertension. No mechanism insights have been interrogated in this study, which has been mentioned by the authors in the discussion. The connection between E2, AABA, and macrophage needs to be validated in endothelial cells, vascular smooth muscle cells, and other aortic tissue cells. Without such verification, the manuscript predominantly raises hypotheses only based on metabolomic data.

      We have proposed research hypotheses based on detailed omics data. Further research on the mechanisms involving endothelial and vascular smooth muscle cells to validate the pathway connections between E2, AABA, and macrophages is undoubtedly the future direction of this study.

      (2) The serum contains three forms of estrogen: Estradiol, Estrone, and Estriol. The authors used the Rat E2 ELISA kit. Ideally, all three forms of estrogen should be measured.

      Future assays will aim to measure Estradiol, Estrone, and Estriol to capture a more comprehensive picture of estrogen’s role in postmenopausal hypertension.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This useful study reports on the discovery of an antimicrobial agent that kills Neisseria gonorrhoeae. Sensitivity is attributed to a combination of DedA-assisted uptake of oxydifficidin into the cytoplasm and the presence of an oxydifficidin-sensitive RplL ribosomal protein. Due to the narrow scope, the broader antibacterial spectrum remains unclear and therefore the evidence supporting the conclusions is incomplete with key methods and data lacking. This work will be of interest to microbiologists and synthetic biologists.

      General comment about narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The main focus of this study is on its previously unreported potent anti-gonococcal activity and mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kan et al. report the serendipitous discovery of a Bacillus amyloliquefaciens strain that kills N. gonorrhoeae. They use TnSeq to identify that the anti-gonococcal agent is oxydifficidin and show that it acts at the ribosome and that one of the dedA gene products in N. gonorrhoeae MS11 is important for moving the oxydifficidin across the membrane.

      Strengths:

      This is an impressive amount of work, moving from a serendipitous observation through TnSeq to characterize the mechanism by which Oxydifficidin works.

      Weaknesses:

      (1) There are important gaps in the manuscript's methods.

      The requested additions to the methods section describing bacterial sequencing and anti-gonococcal activity screening will be made. However, we do not think the absence of these generic methods reduces the significance of our findings.

      (2) The work should evaluate antibiotics relevant to N. gonorrhoeae.

      (1) It is not clear to us why reevaluating the activity of well-characterized antibiotics against known N. gonorrhoeae clinical strains would add value to this manuscript. The activity of clinically relevant antibiotics against antibiotic-resistant N. gonorrhoeae clinical isolates is well described in the literature. Our use of antibiotics in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      (2) If the reviewer insists, we would be happy to include MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone).

      (3) The genetic diversity of dedA and rplL in N. gonorrhoeae is not clear, neither is it clear whether oxydifficidin is active against more relevant strains and species than tested so far.

      (1) We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      (2) While the usefulness of screening more clinically relevant antibiotics against clinical isolates as suggested in comment 2 was not clear to us, we agree that screening these strains for oxydifficidin activity would be beneficial. We have ordered Neisseria gonorrhoeae strain AR1280, AR1281 (CDC), and Neisseria meningitidis ATCC 13090. They will be tested when they arrive.

      Reviewer #2 (Public Review):

      Summary:

      Kan et al. present the discovery of oxydifficidin as a potential antimicrobial against N. gonorrhoeae, including multi-drug resistant strains. The authors show the role of DedA flippase-assisted uptake and the specificity of RplL in the mechanism of action for oxydifficidin. This novel mode of action could potentially offer a new therapeutic avenue, providing a critical addition to the limited arsenal of antibiotics effective against gonorrhea.

      Strengths:

      This study underscores the potential of revisiting natural products for antibiotic discovery of modern-day-concerning pathogens and highlights a new target mechanism that could inform future drug development. Indeed there is a recent growing body of research utilizing AI and predictive computational informatics to revisit potential antimicrobial agents and metabolites from cultured bacterial species. The discovery of oxydifficidin interaction with RplL and its DedA-assisted uptake mechanism opens new research directions in understanding and combating antibiotic-resistant N. gonorrhoeae. Methodologically, the study is rigorous employing various experimental techniques such as genome sequencing, bioassay-guided fractionation, LCMS, NMR, and Tn-mutagenesis.

      Weaknesses:

      The scope is somewhat narrow, focusing primarily on N. gonorrhoeae. This limits the generalizability of the findings and leaves questions about its broader antibacterial spectrum. Moreover, while the study demonstrates the in vitro effectiveness of oxydifficidin, there is a lack of in vivo validation (i.e., animal models) for assessing pre-clinical potential of oxydifficidin. Potential SNPs within dedA or RplL raise concerns about how quickly resistance could emerge in clinical settings.

      (1) Spectrum/narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The focus of this study is on its previously unreported potent anti-gonococcal activity and its mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      (2) Animal models: We acknowledge the reviewer’s insight regarding the importance of in vivo validation to enhance oxydifficidin’s pre-clinical potential. However, due to the labor-intensive process needed to isolate oxydifficidin, obtaining a sufficient quantity for animal studies is beyond the scope of this study. Our future work will focus on optimizing the yield of oxydifficidin and developing a topical mouse model for subsequent investigations.

      (3) Potential SNPs: Please see our response to Reviewer #1’s comment 3. We acknowledge that potential SNPs within dedA and rplL raise concerns regarding clinical resistance, which is a common issue for protein-targeting antibiotics. Yet, as pointed out in the manuscript, obtaining mutants in the lab was a very low yield endeavor.

      Reviewer #3 (Public Review):

      Summary:

      The authors have shown that oxydifficidin is a potent inhibitor of Neisseria gonorrhoeae. They were able to identify the target of action as rplL and showed that resistance could occur via mutation in the DedA flippase and RplL.

      Strengths:

      This was a very thorough and clearly argued set of experiments that supported their conclusions.

      Weaknesses:

      There was no obvious weakness in the experimental design. Although it is promising that the DedA mutations resulted in attenuation of fitness, it remains an open question whether secondary rounds of mutation could overcome this selective disadvantage which was untried in this study.

      We thank the reviewer for the positive comment. We agree that investigating factors that could compensate for the fitness attenuation caused by DedA mutation would enhance our understanding of the role of DedA.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The term "N. gonorrhoeae wildtype" should not be used. It is uninformative, as the species contains a large amount of diversity. Instead, please name the strain. From Figure 1, it looks like the authors used MS11. Since MS11 is a longstanding lab strain and likely does not reflect circulating N. gonorrhoeae, and since H041 is no longer in circulation, the authors should ideally test the compound against more representative strains of N. gonorrhoeae. This includes panels of isolates available through the CDC, for example (https://www.cdc.gov/drugresistance/resistance-bank/index.html). I encourage the authors to include FC428 or another recently identified isolate with the penA 60 allele to demonstrate oxydifficidin's activity against contemporary concerning isolates/lineages.

      (1) “N. gonorrhoeae MS11” is now used instead of “N. gonorrhoeae WT” in this manuscript.

      (2) In our revised manuscript, we have added MIC data for recently identified Neisseria gonorrhoeae isolates AR#1280 and AR#1281 which contain the penA 60 allele (Table 1). The data shows oxydifficidin maintains its potent activity against these multidrug-resistant strains. We also added a description of this data to the results section as shown below.

      Original text: “Oxydifficidin was more potent against N. gonorrhoeae MS11 than almost all other antibiotics we tested. In fact, it was only slightly less active than the highly optimized third-generation cephalosporin, ceftazidime.([18]) However, unlike third-generation cephalosporins, oxydifficidin retained activity against the multidrug resistant H041 clinical isolate (Table 1).([4]) H041 is resistant to the “standard of care” cephalosporin ceftriaxone (2 µg/mL) as well as a number of other antibiotics that are normally active against N. gonorrhoeae (penicillin G, 4 µg/mL; cefixime, 8 µg/mL; levofloxacin, 32 µg/mL).”

      Changed to: “Oxydifficidin was more potent against N. gonorrhoeae MS11 than most other antibiotics we tested. Notably, unlike clinically used antibiotics such as ceftriaxone, azithromycin, and ciprofloxacin, oxydifficidin retained activity against all multidrug-resistant clinical isolates we examined (Table 1).” (Line 77-79)

      (2) Does oxydifficidin have activity against N. meningitidis? It is the species most closely related to N. gonorrhoeae and the other pathogenic Neisseria.

      Oxydifficidin has potent activity against N. meningitidis ATCC 13090. In our revised manuscript, we have included its MIC data in Figure 1c.

      (3) Given claims that oxydifficidin activity in N. gonorrhoeae as compared to other Neisseria reflects N. gonorrhoeae's dedA and sensitive rplL, it would be good to assess the allelic diversity of these genes in N. gonorrhoeae. There are over 20,000 genomes from clinical isolates of N. gonorrhoeae in databases. It should be straightforward to check whether dedA and rplL allelic variants already exist in the population. Should variants be observed, oxydifficidin should be tested against the associated strains of N. gonorrhoeae.

      Response: We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      New text: “A survey of 220 N. gonorrhoeae strains with high-quality assemblies in NCBI found no mutations in the DedA protein.” (Line 104-105)

      “These two mutations were not found in the survey of the same collection of N. gonorrhoeae strains used to look for DedA mutations.” (Line 143-144)
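
      The survey described above (comparing each strain's DedA or RplL protein against the MS11 reference and checking whether any change falls at R76 or K84) can be sketched as follows. This is only an illustration under stated assumptions: the sequences are toy placeholders rather than real RplL sequences, the inputs are assumed to be already aligned and of equal length, and the helper names `mutate`, `substitutions`, and `hits_resistance_site` are hypothetical and not taken from the authors' workflow.

```python
# Positions in RplL where the response reports resistance-conferring changes.
RESISTANCE_POSITIONS = {76, 84}

def mutate(seq, pos, aa):
    """Return seq with the residue at 1-based position pos replaced by aa."""
    return seq[:pos - 1] + aa + seq[pos:]

def substitutions(ref, seq):
    """List (position, ref_residue, alt_residue) for every mismatch.

    Assumes ref and seq are pre-aligned, gap-free, and of equal length.
    Positions are 1-based, matching the residue numbering in the text.
    """
    if len(ref) != len(seq):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, r, s)
            for i, (r, s) in enumerate(zip(ref, seq), start=1)
            if r != s]

def hits_resistance_site(subs):
    """True if any substitution falls on a known resistance position."""
    return any(pos in RESISTANCE_POSITIONS for pos, _, _ in subs)

# Toy placeholder reference (NOT a real protein sequence).
TOY_REF = "A" * 90

# A change away from the resistance sites, like the A64 change in the survey.
variant_64 = mutate(TOY_REF, 64, "V")
# A hypothetical change at position 76, for contrast.
variant_76 = mutate(TOY_REF, 76, "H")

print(substitutions(TOY_REF, variant_64), hits_resistance_site(substitutions(TOY_REF, variant_64)))
print(substitutions(TOY_REF, variant_76), hits_resistance_site(substitutions(TOY_REF, variant_76)))
```

      A real survey would first need a proper alignment step to handle insertions and deletions; the comparison above only covers the substitution check.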

      (4) Clinically relevant antibiotics for N. gonorrhoeae are penicillin, tetracycline, spectinomycin, gentamicin, ciprofloxacin, azithromycin, ceftriaxone; moreover, zoliflodacin and gepotidacin have reportedly successfully completed phase 3 trials. The authors should redo their MIC testing with these antibiotics (e.g., for Figures 1 and 2 and Tables 1 and 2), both because this will enable direct comparison with the many clinical isolates that have undergone testing and because these are the drugs most pertinent to clinical practice. Ampicillin, ceftazidime, chloramphenicol, bacitracin, and daptomycin are not relevant. Could the authors explain why they tested vancomycin, polymyxin B, irgasan, melittin, avilamycin, and thiostrepton?

      Our use of antibiotics with diverse modes of action (e.g. vancomycin, polymyxin B, irgasan, melittin, avilamycin, and thiostrepton) in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      To address the reviewer’s concern, in our revised manuscript, we have added MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone) to Table 1.

      (5) Please describe the characteristics of the transposon library (finding four transposons in a single strain does seem unexpected, given how most transposon libraries aim for one transposon insertion per strain).

      We understand that one transposon insertion per strain is ideal for transposon libraries. This Bacillus strain proved to be recalcitrant to genetic manipulation. In the rare cases where we obtained resistant colonies upon electroporation with the transposon, all colonies contained multiple (≥ 4) transposon insertions. This made it impractical to build a library with one transposon insertion per library member.

      We assumed that the anti-N. gonorrhoeae activity most likely originated from a natural product BGC, which typically range from 10-100 kb in size.

      Based on an average of 50 kb per BGC, ~80 transposon insertions would be required to fully search the 4.2 Mb genome of Bacillus amyloliquefaciens BK for a BGC. At 4 insertions per transformant, 1x coverage of the genome would require only ~20 library members.
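
      The coverage estimate above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch rather than the authors' calculation: it assumes insertions are spaced evenly across the genome (random insertion would need more mutants for the same coverage), and the function names are hypothetical.

```python
import math

GENOME_BP = 4_200_000        # 4.2 Mb genome of B. amyloliquefaciens BK
MEAN_BGC_BP = 50_000         # assumed average BGC size (10-100 kb range)
INSERTIONS_PER_MUTANT = 4    # minimum insertions observed per transformant
LIBRARY_SIZE = 50            # number of mutants in the final library

def insertions_for_1x(genome_bp, window_bp):
    """Insertions needed to place one in every window of the genome."""
    return math.ceil(genome_bp / window_bp)

def mutants_needed(insertions, per_mutant):
    """Library members needed to carry that many insertions."""
    return math.ceil(insertions / per_mutant)

insertions = insertions_for_1x(GENOME_BP, MEAN_BGC_BP)
mutants = mutants_needed(insertions, INSERTIONS_PER_MUTANT)
# Lower bound on coverage, since each mutant carries at least 4 insertions.
library_coverage = LIBRARY_SIZE * INSERTIONS_PER_MUTANT / insertions

print(insertions, mutants, round(library_coverage, 2))
```

      The exact values (84 insertions, 21 mutants) round to the ~80 and ~20 quoted in the response, and a 50-member library corresponds to at least roughly 2.4x coverage under the same even-spacing assumption.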

      After extensive electroporation of transposon into Bacillus amyloliquefaciens BK, we were able to obtain a library of 50 members, including one mutant (Tn5-3) that lacked anti-N. gonorrhoeae activity.

      New text added to the methods section:

      “A library containing 50 transposon mutants was obtained. In the mutants examined, each strain contained ≥4 transposon insertions” (Line 337-339)

      (6) Please describe in the methods how you sequenced and annotated the genome of Bacillus amyloliquefaciens BK.

      The sequencing method is now described in “Genomic Sequencing and annotation of Bacillus amyloliquefaciens” section. The genome of Bacillus amyloliquefaciens BK was not fully annotated. Mutations were identified as described in the updated methods section below.

      New text:

      “Genomic Sequencing and annotation of Bacillus amyloliquefaciens

      Genomic DNA from Bacillus amyloliquefaciens BK WT and transposon mutant Tn5-3 was isolated using PureLink Microbiome DNA purification kit (Invitrogen) according to the manufacturer’s instructions.

      The Bacillus amyloliquefaciens BK WT genome was assembled by mapping its sequencing data onto the annotated genome of Bacillus amyloliquefaciens FZB42 using Geneious Prime. Differences in the mutant strain Tn5-3 were identified by mapping its sequencing data onto the assembled Bacillus amyloliquefaciens BK WT genome. The mutated genes were then annotated using NCBI BLAST. The oxydifficidin BGC was annotated using the antiSMASH online server.” (Line 253-260)

      (7) Please describe in the methods how you screened the library for strains that lacked anti-gonococcal activity.

      The method is added to our revised manuscript as section “Screening of Bacillus Strains Lacking Anti-N. gonorrhoeae Activity”.

      New text:

      “Screening of Bacillus Strains Lacking Anti-N. gonorrhoeae Activity

      The transposon mutants of Bacillus amyloliquefaciens BK were grown overnight in LB medium at 30 °C. Each overnight culture was then diluted 1:5000, and 1 μl of the diluted culture was spotted onto a GCB agar plate swabbed with N. gonorrhoeae cells. The plate was then incubated overnight at 37 °C with 5% CO2. The mutant strain (Tn5-3) lacking anti-N. gonorrhoeae activity was identified due to its failure to produce a zone of growth inhibition in the resulting N. gonorrhoeae lawn.” (Line 341-346)

      (8) Was only one strain found that was a 'non-producer' of anti-N. gonorrhoeae activity? Line 68 suggests that this was only one of multiple non-producers. Is that correct? If so, did you work up the others, and did they also have disruptions in the same biosynthetic gene cluster?

      Only one strain was identified as a “non-producer” of anti-N. gonorrhoeae activity. We have modified the text to clarify this point.

      Original text: “The sequencing of one non-producer strain revealed that it surprisingly contained four transposon insertions and one frame shift mutation.”

      Changed to: “The sequencing of the non-producer strain revealed that it surprisingly contained four transposon insertions and one frame shift mutation.” (Line 53-54)

      (9) All sequences (including Bacillus amyloliquefaciens BK) must be deposited in a public database (e.g., NCBI) and the accession numbers reported in the manuscript.

      Genomic sequence data of Bacillus amyloliquefaciens BK has been deposited in GenBank, and its accession number (GCA_019093835.1) now appears in figure legend of Figure S1a.

      Figure S1a legend:

      “Genome-based phylogenetic tree containing Bacillus amyloliquefaciens BK and closely related Bacillus spp. The tree was built by Genome Clustering of MicroScope using neighbor-joining method. The NCBI accession numbers of Bacillus strains used in the tree are GCA_000196735.1, GCA_000204275.1, GCA_000015785.2, GCA_019093835.1, GCA_000009045.1, GCA_000011645.1, GCA_000172815.1, GCA_000008005.1, and GCA_000007845.1 (from top to bottom).”

      Minor

      (10) Statements in the article would benefit from fact-checking. For example:

      - gonorrhea is not the second most prevalent sexually transmitted infection worldwide; it is the second most reported bacterial sexually transmitted infection.

      - Treatment is ceftriaxone 500mg IM x1 in the US, but 1g IM x1 in the UK and Europe. The UK guidelines also permit ciprofloxacin, should sequencing indicate gyrA 91S. I suggest reviewing / specifying which treatment guidelines you're referring to.

      We appreciate the reviewer’s corrections. The word “prevalent” is now changed to “reported”.

      Original text: “Gonorrhea, which is caused by Neisseria gonorrhoeae, is the second most prevalent sexually transmitted infection worldwide.”

      Changed to: “Gonorrhea, which is caused by Neisseria gonorrhoeae, is the second most reported sexually transmitted infection worldwide.” (Line 2-3)

      Original text: “Gonorrhea is the second most prevalent sexually transmitted infection worldwide, its causative agent is the bacterium Neisseria gonorrhoeae.”

      Changed to: “Gonorrhea is the second most reported sexually transmitted infection worldwide, its causative agent is the bacterium Neisseria gonorrhoeae.” (Line 18-19)

      “In the USA” is now added to the sentence stating gonorrhea treatment.

      Original text: “The high dose (500 mg) of the cephalosporin ceftriaxone is currently the only recommended therapy for treating gonorrhea infections.”

      Changed to: “The high dose (500 mg) of the cephalosporin ceftriaxone is currently the only recommended therapy for treating gonorrhea infections in the USA.” (Line 20-22)

      (11) Please make sure all results are in the results section. The report of cell morphology, for example, should be in the results, not the discussion.

      In our revised manuscript, we have included the cell morphology data in the results section with the text changes below.

      Original text: “Interestingly, not only was dedA deficient N. gonorrhoeae less susceptible to oxydifficidin, oxydifficidin also kills this mutant more slowly (Figure 2b) than WT N. gonorrhoeae MS11.”

      Changed to: “Interestingly, not only was dedA deficient N. gonorrhoeae less susceptible to oxydifficidin, oxydifficidin also kills this mutant more slowly (Figure 2b) than WT N. gonorrhoeae MS11. The dedA deletion mutant also showed an altered cell morphology with reduced membrane integrity and lower formation of micro-colonies (Figure S4).” (Line 100-104)

      Original text: “The dedA deletion mutant also showed an altered cell morphology with reduced membrane integrity and lower formation of micro-colonies (Figure S4), indicating that it should show reduced pathogenesis and fitness, and, as a result, not accumulate in a clinical setting, which adds to the therapeutic appeal of oxydifficidin.”

      Changed to: “The dedA deletion mutant exhibited altered cell morphology, characterized by diminished membrane integrity and reduced micro-colony formation, indicating that it should show reduced pathogenesis and fitness, and, as a result, not accumulate in a clinical setting, which adds to the therapeutic appeal of oxydifficidin.” (Line 206-210)

      (12) Tables 1 and 2 should be combined and should address the most relevant antibiotics

      The MIC data of additional relevant antibiotics are now included in Table 1. However, we still believe that keeping Tables 1 and 2 separate enhances the clarity of the manuscript. Table 2 specifically focuses on diverse ribosomal targeting antibiotics, which highlights the unique binding site of oxydifficidin.

      (13) Supplemental Figure 1a. The tree could be better resolved, and there are four entries with the identical listing of "Bacillus amyloliquefaciens subsp. plantarum" on different branches. In the methods or the legend, please indicate the accession numbers for these genomes. Also please specify how this tree was made. Is it a maximum likelihood tree? Something else?

      The tree is now better resolved and includes new entries. The requested information regarding accession numbers and tree construction method has been included in the figure legend.

      New supplemental Figure 1a legend:

      “a. Genome-based phylogenetic tree containing Bacillus amyloliquefaciens BK and closely related Bacillus spp. The tree was built by Genome Clustering of MicroScope using neighbor-joining method. The NCBI accession numbers of Bacillus strains used in the tree are GCA_000196735.1, GCA_000204275.1, GCA_000015785.2, GCA_019093835.1, GCA_000009045.1, GCA_000011645.1, GCA_000172815.1, GCA_000008005.1, and GCA_000007845.1 (from top to bottom).”

      Reviewer #2 (Recommendations For The Authors):

      The conclusions drawn in the manuscript are well-supported by the experimental data presented.

      I have the below minor comments:

      (1) "serendipitously identified" - I feel this wording should be avoided throughout the manuscript. The point of a research paper is to communicate methodology and experimental detail, and this language portrays the opposite.

      While we agree that methodology and experimental procedures are paramount in scientific reporting, we believe it is equally important to convey, particularly to younger generations, that a part of the scientific process is often unplanned and can benefit from chance observations. Therefore, we would like to keep this wording.

      (2) The introduction should include the biological roles/function of DedA proteins in bacteria.

      DedA proteins perform a wide array of biological roles and functions in bacteria. In the results section (Line 107-116), we have described the most well-established of these functions, particularly the flippase activity, which appears to be directly related to oxydifficidin sensitivity. We believe that introducing this information in the results section enhances the manuscript’s clarity and flow.

      (3) "When we screened this contaminant for antibacterial activity against lawns of other Gram-negative bacteria it did not produce a zone of growth of inhibition against any of the bacteria we tested (e.g., Escherichia coli, Vibrio cholerae, Caulobacter crescentus)." Can these data Figures be included in the Supplements?

      This result was recorded in the lead author’s notebook, but no image was saved.

      (4) Line 52: Were any basic analyses performed on the Tn-mutants, e.g., how many insertion sites? Depth of mutants? Was a library constructed in this study or previously? Why were only BGCs assessed?

      Please see our response to Reviewer #1’s comment (5). We focused on BGCs because we believed the anti-N. gonorrhoeae activity most likely resulted from a molecule encoded by a natural product BGC.

      (5) Line 98: Do the other 2 predicted DedA-like proteins also have a role in uptake of oxydifficidin? Is there some redundancy in uptake?

      We generated knockout mutants for two other predicted DedA-like proteins in N. gonorrhoeae MS11, and the MIC of oxydifficidin for these mutants remained the same as for the N. gonorrhoeae MS11 wild type strain. Therefore, we believe that the DedA protein discussed in this manuscript is the primary transporter of oxydifficidin. However, we cannot completely rule out the possibility of redundancy in oxydifficidin uptake by other DedA-like proteins.

      New text: “We also generated deletion mutants for two other predicted dedA-like genes, and the MIC of oxydifficidin for these mutants remained the same as for the N. gonorrhoeae MS11 wild type strain.” (Line 98-100)

      Reviewer #3 (Recommendations For The Authors):

      This is a well presented manuscript and I could not immediately see any issues with it.

      We appreciate the reviewer’s positive feedback.

    1. Author response:

      We are submitting a revised manuscript with major additions that address the main concerns in the initial reviews. At the highest level, this revision provides i) orthogonal biochemical measurements that yield concrete evidence of lysosomal protein aggregates, and ii) a plausible mechanism linking lysosomal lipid handling and protein aggregation through disruption of ESCRT function. We believe these additions significantly improve the completeness of this study and the conclusions that can be drawn from the data.

      Below are more specific highlights of the additions in this revision:

      - We included orthogonal techniques (thioflavin-T staining and Lyso-IP followed by differential extraction) and confirmed the accumulation of RIPA-insoluble protein aggregates at the lysosomes in cells under lipid perturbation (Figure 3).

      - We performed TMT-Proteomics and identified accumulation of insoluble ESCRT components at the lysosomes under lipid perturbation (Figure 4). Two new authors involved in this effort are added onto the manuscript.

      - The ESCRT result prompted us to revisit lysosomal membrane integrity. With improved imaging conditions and analysis we were able to see increased membrane permeabilization under lipid perturbation. VPS4A overexpression partially rescued this phenotype, suggesting that lipid accumulation impairs ESCRT disassembly (Figure 5).

      - Together, the results suggest that lipid perturbation impairs ESCRT function, compromising both lysosomal membrane repair and microautophagy, resulting in the accumulation of endogenous protein aggregates at the lysosomes (Graphical Abstract).

      Reviewer #1 (Recommendations For The Authors):

      (1) Perhaps the most prominent limitation of this work is the unilateral focus on native cells (i.e. cells under no endogenous or exogenous stress) as the model for protein aggregate formation. Furthermore, although the ProteoStat stain has been utilized by many investigators before, the sole reliance on this stain as the read-out for their assays is concerning. To compound the concern, the ProteoStat-positive puncta co-localize with lysosomal markers, which was surprising even to the authors. All in all, it behooves the authors to test proteostasis in multiple parallel ways to actually define what they are studying. How is it possible that protein aggregates under native conditions are only co-localized with lysosomes? Are we really studying protein aggregates which should predominantly be cytoplasmic insoluble aggregates?

      (a) They need to get away from a simple stain like ProteoStat and conduct co-stainings with other markers such as poly-ubiquitin antibodies and other chaperones to define what and where else exactly are these aggregates.

      Co-staining with poly-ubiquitin was included in the original manuscript. We added orthogonal staining with another widely used amyloid dye, Thioflavin-T, and provided fine-grained quantification of lysosomal vs cytosolic localization of various signals (Figures S4A-C & 3A-B).

      (b) They need to do Immunoblots with and without triton insolubility to see if these aggregates are insoluble as most would predict. They can do lysosomal isolation vs cytoplasmic to see if the insoluble aggregates are really lysosomal.

      We performed Lyso-IP followed by differential detergent extraction to confirm the accumulation of insoluble proteins at the lysosomes (Figure 3C). Proteomic analysis identified some of these insoluble proteins as ESCRT subunits (Figure 4).

      (c) They should compare aggregate formation in the native state versus cells with lysosomal inhibition via Bafilomycin or chloroquine versus cells with proteasomal inhibition. The lysosomal inhibition experiments are particularly informative given the lysosomal relevance they have uncovered.

      We included other small-molecule inhibitors at different time points to compare the effects of different modes of proteostasis challenge (Figure S4A-D). Together with the ESCRT finding, our results suggest a role for microautophagy in our system, and provide a model of how ProteoStat- and/or ubiquitin-positive substrates become partitioned between the cytoplasm and lysosomes under different perturbations.

      (d) Many protein aggregates which are too bulky for proteasome degradation will traditionally be dealt with by aggrephagy. Why is this not observed?

      Knockdown of core macroautophagy components did not impact Proteostat intensity in our CRISPRi screen, suggesting that basal macroautophagy plays a negligible role in clearing endogenous amyloid-like structures in our experimental system. We provide an alternative model that these aggregates instead arrive at the lysosomes via microautophagy.

      (2) After addressing #1, they can validate if the genes they identified by CRISPR screens are also important in modulation of protein aggregate burden in other systems. For example, if they inhibit lysosomes by Bafilo or Chloroquine to obtain protein aggregates and then Knockdown the identified genes in the CRISPR screens, will they get the same results?

      We addressed the effect of different modes of proteostasis challenge as recommended above. Deacidifying the lysosomes alone causes intense protein aggregation (Figure S4A-D) and eventually cell death, and was thus not combined with other perturbations.

      (3) They identify lysosomal lipid metabolism genes/pathways as the culprit for inducing proteostasis. In particular, sphingolipid and cholesteryl ester species appear to be operational here. However, there is no specific lipid species or specific lipid metabolism gene that is causative. Rather, you have to knock down entire processes to have an effect. This suggests that the focus on lysosome health (i.e. permeability, proteolysis, etc.) is rudimentary. When you have to knock down entire classes of lipids, this would indicate more broad effects on cellular lipids (including membrane lipids beyond the lysosome) and related cellular health?

      We included data on the effect of knocking down MYLIP, PSAP, and as a comparison PSMD2 on the growth rate of K562 cells (Figure S5A). MYLIP and PSAP KDs, which cause predominantly an accumulation of lipids, do not impede cell growth. Increasing lipid uptake by MYLIP KD increases cell proliferation under our culture conditions, suggesting a general negative impact on cell health was not required for the association between lipid levels and protein aggregates.

      (a) They conduct a superficial methyl-beta-cyclodextrin experiment with equivocal results. The use of MBCD for different time-courses to deplete various membrane cholesterol pools including the plasma membrane pool is important to ascertain what aspect of the cellular cholesterol is affecting proteostasis. MBCD +/- cholesterol reintroduction time-courses for rescue will also be key to determine the culprit cellular cholesterol pool.

      The MBCD / Filipin experiment helped us determine that ProteoStat doesn’t directly stain cholesterol, nor any major plasma membrane components. Free cholesterol was implicated in neither the screen nor the lipidomics and was not the subject of targeted experiments.

      (b) The same concept can be applied to sphingolipids. There are sphingolipids in abundance in multiple membrane compartments. Which ones are causal here? More nuanced evaluation of this with sphingolipid staining/tracking can be conducted.

      We attempted experiments where sphingolipids were added back to cells grown in FBS-depleted media. Nevertheless, we were not able to consistently deliver these lipid species, and doing so while ensuring the correct subcellular localization at physiologically relevant levels would require substantial methods development.

      (c) As part of this, are lipid rafts and/or caveolae being affected by the perturbations in cholesterol and sphingolipids? Lipid rafts are highly enriched in these 2 lipids which could link to their proteostasis observation.

      Indeed, ceramides released from SM hydrolysis are proposed to self-assemble into microdomains with negative curvature that can promote the formation of intralumenal vesicles (Alonso and Goñi, 2018; Niekamp et al., 2022). We propose that SM accumulation may hinder this process by counteracting the negative membrane curvature, thereby impeding microautophagy.

      (d) How about ER membrane lipids? The UPR and subsequent effects on proteostasis are intricately involved with ER lipid bilayer composition.

      We did not perform lipidomics on ER membranes in this study, though we note that at steady state, sphingolipids and cholesterol esters are not expected to be enriched at the ER (Ikonen and Zhou, 2021). We checked whether lipid-related genetic perturbations induced the UPR in published perturb-seq data in K562 cells. Neither MYLIP nor PSAP knockdown induced a UPR.

      In conclusion, the manuscript is interesting but the excitement over a link between lysosome-related lipid metabolism and proteostasis needs to be tamped until a more robust experimental approach is employed to generate supportive and corroborating results.

      Reviewer #2 (Recommendations For The Authors):

      - The paper has a number of grammatically awkward sentences. Editing these would enhance clarity.

      - It is important to show the co-localization of aggregates with the lysosome. This is shown in supplements but should be in a main figure. Here the authors cite previous work indicating that ProteoStat puncta co-localize with ubiquitinated proteins and state that they do not see this, then essentially just move on. Is there an explanation for this discrepancy and can it be resolved? What do they think is really going on? What happens to levels of ubiquitinated proteins when lipid metabolism is perturbed as in these experiments?

      We have included the lipid-induced lysosomal protein aggregation data in the main text (Figure 3A-B), and provided fine-grained quantification of the cytosolic-vs-lysosomal ProteoStat / Ub / ThT signals under different aggregate-inducing conditions (Figure S4A-D). We discuss these results in the main text and propose a model involving ESCRT-mediated microautophagy. This is supported further by the LysoIP-proteomics and LMP analysis.

      - Please add an indicator of amino acid numbers to Fig. 3C.

      These annotations are now included (now Figure S3C).

      - The legend for 3D is mislabelled.

      We have corrected the legend (now Figure S3D).

      Reviewer #3 (Recommendations For The Authors):

      Protein homeostasis and lipid homeostasis are both important for maintaining cellular functions. However, the crosstalk remains largely unknown. The manuscript entitled "Impairment of lipid homoeostasis causes accumulation of protein aggregates in the lysosome" deals with this interesting topic. An important link between lysosomal protein aggregation and sphingolipid/cholesterol ester metabolism was discovered. The topic, belonging to the Cell Biology domain, also falls into the aims and scope of eLife. Here are the revisions I recommend:

      (1) From lipidomics analysis, a remarkable correlation between levels of sphingomyelin and cholesterol ester and ProteoStat staining was found. Could the authors explain how sphingomyelin and cholesterol ester are quantified? The two lipids are not included as internal standards from the lipidomics experiment.

      Sphingomyelin and cholesterol ester internal standards are included in the Avanti 330707 SPLASH® LIPIDOMIX® Mass Spec Standard, which was supplied at 3% v/v to the MeOH/H2O cell lysis buffer. We have amended the Methods section to clarify this.

      (2) Could the authors perhaps delete Figure 1B and show it in Figure 2A only? There is no need to show the same figure twice. The thresholds for both False Discovery Rate and Median Enrichment need to be added. In Figure 2A, the lysosomal hydrolases (GBA, LIPA, GALC) seem to be located in the statistically insignificant region. Based on previous studies, GBA could have an effect on sphingolipid levels, so how can it be explained that sphingomyelin was highly correlated with ProteoStat staining?

      We have combined the two volcano plots into a single figure (now Figure 1D), and added a line to help visualize the gene effects while considering the combined contribution of FDR and enrichment. Individual lysosomal hydrolases indeed have insignificant effects on ProteoStat and this is discussed in the main text as having relatively constrained impacts on the general lipidome. For example, while GBA and GALC KDs can lead to accumulation of their immediate substrates (glucosylceramide and galactosylceramide, respectively), they do not directly impinge on sphingomyelin.

      (3) The authors show the correlation between ProteoStat staining and different lipids/lipid classes in Figure 3B and Figure S3A. It is not necessary to show the correlation with individual lipids (such as sphingomyelin(d18:1/24:0) and cholesterol ester(18:2)). The correlation with the full collection of lipid classes would be more representative, which is only listed in Figure 3B and Figure S3A. It is suggested to add the information on how many individual lipids in each class are used for the correlation analysis. It is also suggested to replace Figure 3A with Figure S3A and to move Figure 3A to the supplementary figures.

      We decided to retain the correlation of two individual lipids (a sphingomyelin and a cholesterol ester species) with ProteoStat as examples to illustrate with clarity how we obtained the class-wide comparison. The number of individual lipids included in each class for correlation analysis is now included in Figures 2F and S3A.

      (4) The authors state that lipid uptake and metabolism modulate proteostasis. However, only cholesterol and LDL were tested. It would be more precise to state as cholesterol uptake and metabolism modulate proteostasis. In addition, sphingolipids and cholesterol esters accumulate with increased lysosomal protein aggregation. It would be interesting to see the effects of sphingolipids uptake, since sphingolipids are correlated with proteostasis better than cholesterol.

      We attempted to add back specific sphingolipids to assess sufficiency. However, we found it challenging to ensure that these lipids were distributed to the correct subcellular locations at physiologically relevant levels. Without this crucial information, it was difficult to draw any conclusions about the sufficiency of the sphingolipids we tested to impair proteostasis.

      Alonso A, Goñi FM. 2018. The Physical Properties of Ceramides in Membranes. Annu Rev Biophys 47:633–654. doi:10.1146/annurev-biophys-070317-033309

      Ikonen E, Zhou X. 2021. Cholesterol transport between cellular membranes: A balancing act between interconnected lipid fluxes. Dev Cell 56:1430–1436. doi:10.1016/j.devcel.2021.04.025

      Niekamp P, Scharte F, Sokoya T, Vittadello L, Kim Y, Deng Y, Südhoff E, Hilderink A, Imlau M, Clarke CJ, Hensel M, Burd CG, Holthuis JCM. 2022. Ca2+-activated sphingomyelin scrambling and turnover mediate ESCRT-independent lysosomal repair. Nat Commun 13:1875. doi:10.1038/s41467-022-29481-4

    1. Author response:

      We thank the editors and reviewers for their thorough evaluation of our manuscript. We appreciate the constructive feedback and insights provided. 

      We acknowledge that some of our conclusions would benefit from more measured statements and additional computational controls. We will revise the manuscript to better reflect the scope and limitations of our analytical approach. While we cannot add new experimental validations at this stage, we will strengthen our computational analyses and clarify our methodology.

      Below, we outline our planned revisions to address the major points raised in the public reviews:

      Clarification of Terms and Definitions:

      (1) We will make clearer in our manuscript that we reuse the same raw datasets from our previous study, as described in Calendrilli et al, 2023, and that there is no modification to the experimental methods or data.

      (2) We will provide clear definitions for:

      - "Non-differentially expressed" genes

      - "Ctrl specific" RNA sets

      - The composition of control populations in different analyses

      (3) We will revise the use of "non-diffusive RNA-chromatin interactome" and “RNase-resistant” terminology to better reflect our actual findings.

      (4) We will also improve clarity regarding:

      - The rationale for focusing on specific genomic regions

      - The interpretation of evolutionary conservation data

      (5) We will provide additional rationale on the exclusion of short-range interactions.

      Figure Revisions:

      (1) Figure 3a: We will correct any discrepancy between text references and figure content.

      (2) Figure 4: We will standardize the color scheme between control and RNase-treated samples.

      (3) We will follow the reviewer's suggestion to move figure 1g to the supplementary file. 

      Additional Computational Analyses:

      (1) We will consider adding controls for RNA length effects and integrating any existing knowledge on how the extent of protection varies across different RBPs.

      Discussions:

      (1) We will carefully rephrase our conclusions to more accurately reflect the scope and limitations of our computational findings, ensuring we do not overstate the implications.

      (2) We will expand the discussion of limitations, including:

      - The focus on RNase-resistant interactions only

      - The cell-type specificity of our findings

      - The lack of functional validation

      - The limited ability to discern and study the transient or weak RNA-chromatin interactions using the current dataset

      (3) Regarding the recent papers from Jenner and Davidovich groups about RNase treatment effects on chromatin solubility:

      - We will discuss these findings in our revised manuscript

      - We will address potential limitations this may impose on our interpretations

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work examines the binding of several phosphonate compounds to a membrane-bound pyrophosphatase using several different approaches, including crystallography, electron paramagnetic resonance spectroscopy, and functional measurements of ion pumping and pyrophosphatase activity. The work attempts to synthesize these different approaches into a model of inhibition by phosphonates in which the two subunits of the functional dimer interact differently with the phosphonate.

      Strengths:

      This study integrates a variety of approaches, including structural biology, spectroscopic measurements of protein dynamics, and functional measurements. Overall, data analysis was thoughtful, with careful analysis of the substrate binding sites (for example calculation of POLDER omit maps).

      Weaknesses:

      Unfortunately, the protein did not crystallize with the more potent phosphonate inhibitors. Instead, structures were solved with two compounds with weak inhibitory constants >200 micromolar, which limits the molecular insight into compounds that could possibly be developed into small molecule inhibitors. Likewise, the authors choose to focus the spectroscopy experiments on these weaker binders, missing an opportunity to provide insight into the interaction between more potent binders and the protein.

      We acknowledge the reviewer's concern regarding the choice of weaker inhibitors. We attempted co-crystallization with all available inhibitors, including those with higher potency. However, despite numerous efforts, these potent inhibitors yielded low-resolution crystals, making them unsuitable for detailed structural analysis. Therefore, we chose to focus on the weaker binders, as we were able to obtain high-quality crystal structures for these compounds. This allowed us to perform DEER spectroscopy with the added advantage of accurately analyzing the data against structural models derived from X-ray crystallography. Using these weaker inhibitors enabled a more precise interpretation of the DEER data, thus providing reliable insights into the conformational dynamics and inhibition mechanism. However, as suggested by the reviewer, in the revised version, we will perform DEER analysis on the more potent inhibitors to provide additional insight into their interactions.

      In general, the manuscript falls short of providing any major new insight into membrane-bound pyrophosphatases, which are a very well-studied system. Subtle changes in the structures and ensemble distance distributions suggest that the molecular conformations might change a little bit under different conditions, but this isn't a very surprising outcome. It's not clear whether these changes are functionally important, or just part of the normal experimental/protein ensemble variation.

      We respectfully disagree with the reviewer. The scale of motions seen in this study corresponds to that seen in the full panoply of crystal structures of mPPases. Some proteins undergo very large conformational changes during catalysis, such as the rotary ATPase. This one doesn’t, meaning that the precise motions we describe are likely to be relevant. Conformational changes in the ensemble, whether large or small, represent essential protein motions which underlie key mPPase catalytic function. Our DEER spectroscopy data demonstrate the sensitivity and resolution necessary to monitor these subtle changes in equilibria, even if these are only a few Angstroms. For several of the conditions we investigated by DEER in solution, corresponding x-ray structures have been solved, with the derived distances agreeing well with the DEER distributions. This further validates the biological relevance of the structures, including serial time-resolved ones that indicate asymmetry.

      The ZLD-bound crystal structure doesn't predict the DEER distances, and the conformation of Na+ binding site sidechains in the ZLD structure doesn't predict whether sodium currents occur. This might suggest that the ZLD structure captures a conformation that does not recapitulate what is happening in solution or in a membrane.

      We agree with the reviewer that the ZLD-bound crystal structure does not predict the DEER distances. However, we believe this discrepancy arises from the bulkiness of the ZLD inhibitor, which prevents the closure of the hydrolytic centre. Additionally, the absence of Na+ at the ion gate in the ZLD-bound structure suggests that Na+ transport does not occur, a conclusion further supported by our electrometric measurements. We agree with the reviewer that the distances observed in the DEER experiments might represent a potential new conformation in solution, which may not be captured by the static X-ray structure, thereby offering insights into the dynamic nature of the protein under physiological conditions. Finally, the static x-ray structures have not captured the asymmetric conformations that must exist to explain half-of-the-sites reactivity.

      Reviewer #2 (Public review):

      Summary:

      Crystallographic analysis revealed the asymmetric conformation of the dimer in the inhibitor-bound state. Based on this result, which is consistent with previous time-resolved analysis, the authors examined the dynamics of, and distances between, introduced spin labels by DEER spectroscopy in solution and predicted possible patterns of the asymmetric dimer.

      Strengths:

      Crystal structures with inhibitor bound provide detailed coordination in the binding pocket thus useful information for the PPase field and maybe for drug development.

      Weaknesses:

      The distance information measured by DEER is advantageous for verifying the dynamics and structure of membrane protein in solution. However, regarding T211 data, which, as the authors themselves stated, lacks measurement precision, it is unclear for readers how confident one can judge the conclusion leading from these data for the cytoplasmic side.

      We thank the reviewer for acknowledging the advantageous use of the DEER methodology for identifying dynamic states of membrane proteins in solution. We used two sites in our analysis: S525 (periplasm) and T211 (cytoplasm). As we clearly stated in the original manuscript, S525R1 yielded high-quality DEER data, while T211R1 yielded weak (or no) visual oscillations, leading to broad, though different, distributions for the several conditions tested. Our main conclusions are based on the S525R1 data. We included the T211R1 data because, although it does not provide definitive evidence, it is consistent with our proposed model and offers additional insights into biologically relevant conditions. Furthermore, the shifts in the centre of mass (Fig EV8D) of the broad T211R1 distributions show a trend that is consistent with our model; although this does not prove the model, it does not exclude it either. Lastly, these data do indeed confirm an important structural feature of mPPase in solution conditions, namely the intrinsically high dynamic state of loop5-6, where T211 is located, which is consistent with our previous (Kellosalo et al., Science, 2012; Li et al., Nat. Commun, 2016; Vidilaseris et al., Sci. Adv., 2019; Strauss et al., EMBO Rep., 2024) and current x-ray crystallography data.

      The distance information for the luminal site, which the authors claim is more accurate, does not indicate either the possibility or the basis for why it is the ensemble of two components and not simply a structure with a shorter distance than the crystal structure.

      We thank the reviewer for pointing out this possibility and alternative interpretation of our DEER data. In the revised version, we will show that our DEER data are consistent with (and do not exclude) asymmetry and rephrase to be inclusive of other possibilities. Importantly, this additional possibility does not affect the current interpretation of the data in our manuscript.

      Reviewer #3 (Public review):

      Summary:

      Membrane-bound pyrophosphatases (mPPases) are homodimeric proteins that hydrolyze pyrophosphate and pump H+/Na+ across membranes. They are attractive drug targets against protist pathogens. Non-hydrolysable PPi analogue bisphosphonates such as risedronate (RSD) and pamidronate (PMD) serve as primary drugs currently used. Bisphosphonates have a P-C-P bond whose central carbon can accommodate up to two substituents, allowing a large compound variability. Here the authors solved two TmPPase structures in complex with the bisphosphonates etidronate (ETD) and zoledronate (ZLD) and monitored their conformational ensemble using DEER spectroscopy in solution. These results reveal the inhibition mechanism of these compounds, which is crucial for developing future small molecule inhibitors.

      Strengths:

      The authors show that seven different bisphosphonates can inhibit TmPPase with IC50 values in the micromolar range. Branched aliphatic and aromatic modifications showed weaker inhibition.

      High-resolution structures for TmPPase with ETD (3.2 Å) and ZLD (3.3 Å) are determined. These structures reveal the binding mode and shed light on the inhibition mechanism. The nature of modification on the bisphosphonate alters the conformation of the binding pocket.

      The conformational heterogeneity is further investigated using DEER spectroscopy under several conditions.

      Weaknesses:

      The authors observed asymmetry in the TmPPase-ETD structure above the hydrolytic center. The structural asymmetry arises due to differences in the orientation of ETD within each monomer at the active site. As a result, loop5-6 of the two monomers is oriented differently, resulting in the observed asymmetry. The authors attempt to further establish this asymmetry using DEER spectroscopy experiments. However, the (over)interpretation of these data leads to more confusion than any further understanding. DEER data suggest that the asymmetry observed in the TmPPase-ETD structure in this region might be funneled from the broad conformational space under the crystallization conditions.

      See also the response below - We respectfully disagree with the reviewer. The asymmetry was previously established using serial time crystallography (Strauss et al., EMBO Rep, 2024) and biochemical assays (e.g. Malinen et al., Prot. Sci., 2022; Artukka et al., Biochem J, 2018; Luoto et al., PNAS, 2013) and also partially seen in one static structure (Vidilaseris et al., Sci Adv 2019). DEER data only show that the previously proposed asymmetry could also be present within the conformational ensemble in solution conditions. Indeed, our data do not (and cannot) exclude this possibility.

      DEER data for position T211R1 at the enzyme entrance reveal a highly flexible conformation of loop5-6 (and do not provide any direct evidence for asymmetry, Figure EV8).

      Please see the relevant response above. We acknowledge that T211 is indeed situated on a highly dynamic loop, which is important for gating, and our DEER data confirm its high flexibility. Because we did not observe oscillations for this site, which leads to broad distributions, we stated in the original manuscript that we do not establish the presence of any asymmetry in solution on the basis of T211; rather, we rely on the S525 site, for which we acquired high-quality DEER data, as all reviewers also noted.

      Similarly, data for position S525R1 near the exit channel do not directly support the proposed asymmetry for ETD.

      The reviewer appears to suggest that we hold the S525R1 DEER data as direct proof of asymmetry; we contest this, on the grounds that directly proving asymmetry would require time-resolved DEER measurements, far beyond the scope of this work. Rather, we have applied DEER measurements to explore whether asymmetry (observed previously via time-resolved X-ray crystallography) is also present (or indeed a possibility) in solution. We simply state that the DEER data are consistent with asymmetry (i.e., that the mean distance increases in the presence of ETD compared to the apo-state). This is a restrained interpretation of the data.

      Despite the high quality of the data, they reveal a very similar distance distribution. The reported changes in distances are very small (+/- 0.3 nm), which can be accommodated by a change of spin label rotamer distribution alone. Further, these spin labels are located on a flexible loop, thereby making it difficult to directly relate any distance changes to the global conformation.

      We thank the reviewer for recognising the high quality of our DEER data for S525R1. Visual oscillations in the raw traces, as in our case, are reported to yield highly accurate and reliable distributions that can (in fortuitous cases) resolve helical movements of only a few Angstroms. The ability of DEER/PELDOR to offer near-Angstrom resolution was previously demonstrated by the acquisition and solution of high-resolution multi-subunit spin-labelled membrane protein structures (Pliotas et al., PNAS, 2012; Pliotas et al., Nat Struct Mol Biol, 2015; Pliotas, Methods Enzymol, 2017), as well as by its ability to detect small conformational changes (of similar magnitude to those in mPPase) in different integral membrane protein systems (Kapsalis et al., Nature Comms, 2019; Kubatova et al., PNAS, 2023; Schmidt et al., JACS, 2024; Lane et al., Structure, 2024; Hett et al., JACS, 2021; Zhao et al., Nature, 2024), occurring under different conditions and/or stimuli in solution and/or lipid environments. The changes here (e.g. ~7 Angstroms between the two mean distance extremes, Ca vs IDP) are not very small relative to DEER’s proven detection sensitivity, and all other conditions show changes between those extremes.

      These changes are relatively small, but they are expected for membrane ion pumps. Indeed, none of the mPPase structures show helical movements of greater than half a turn, and those only in helices 6 and 12. There appear to be larger-scale loop closing motions of the 5-6 loop that includes T211, due to the presence of E217, which binds to one of the Mg2+ ions that coordinate the leaving group phosphate. (This is, inter alia, the reason that this loop is so flexible: it cannot order before substrate is bound.) Here we have the resolution to detect such subtle differences by DEER, given there are clear shifts in our time domain data and these are reflected in the mean distances in the distributions. Therefore, our study demonstrates the sensitivity and resolution DEER offers in detecting subtle conformational transitions, which are key in membrane protein pathways. To further belabour this point, we do not quantify the DEER data (for instance through parametric fitting) to extract populations of different conformational states, and we appreciate that to do so would be highly prone to error; however, we do (and can, we feel, without overinterpretation) assert that the mean distances shift.

      The interpretations listed below are not supported by the data presented:

      (1) 'In the presence of Ca2+, the distance distribution shifts towards shorter distances, suggesting that the two monomers come closer at the periplasmic side, and consistent with the predicted distances derived from the TmPPase:Ca structure.' Problem: This is a far-stretched interpretation of a tiny change, which is not reliable for the reasons described in the paragraph above.

      While the authors overall agree with the reviewer's assessment that ±0.3 nm is a small (not a minor) change, there are literature examples quantifying (or using for quantification) distribution peaks separated by similar Δr (Kubatova et al., PNAS, 2023; Schmidt et al., JACS, 2024; Hett et al., JACS, 2021; Zhao et al., Nature, 2024). In particular, none of the mPPase structures show helical movements of greater than half a turn (in helices 6 and 12 in particular). There appear to be larger-scale loop closing motions of the 5-6 loop that includes T211, due to the presence of E217, which binds to one of the Mg2+ ions that coordinate the leaving group phosphate. (This is, inter alia, the reason that this loop is so flexible: it cannot order before substrate is bound.)

      Importantly, we have fitted Gaussians to the experimental distance distributions of 525R1 output by the Comparative DEER Analyzer 2.0 and observed a change in the distribution width in the presence of Ca2+, implying that the rotameric freedom of the spin label is restricted. However, the CW-EPR spectra for 525R1 indicate that the rotational correlation time of the spin label is highly consistent between conditions (the spectra are almost identical); this cannot be explained simply by rotameric preference of the spin label (as asserted by reviewer 3), as there is no (further) immobilisation observed from the CW-EPR of the apo-state (Figure EV9) to that in the presence of Ca2+. Furthermore, in the absence of conformational changes, it is reasonable to assume (and demonstrable from the CW-EPR data) that the rotamer cloud should not significantly change between conditions. However, Gaussian fits of the two extreme cases yielding the longest (i.e., in the presence of IDP) and shortest (in the presence of ZLD) mean distances for the 525R1 DEER data indicated significant (i.e., above the noise floor after Tikhonov validation) probability density for the IDP condition at 50 Å (P(r) = 0.18). This occurs at four standard deviations above the mean of the ZLD condition, which by random chance should occur with <0.007% probability. Indeed, one can say that to observe 18% probability density at four standard deviations above the mean by random chance would occur on the order of one in 4 x 10^6.
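
      As a minimal sketch of the tail-probability arithmetic quoted above, and assuming an ideal Gaussian distance distribution (an idealisation; the function name `gaussian_tail` is hypothetical and not part of any analysis software used in the study), the chance of drawing a value at least four standard deviations from the mean can be computed from the complementary error function:

```python
import math

def gaussian_tail(z):
    """Return (one-sided, two-sided) probability of a value lying at
    least z standard deviations from the mean of a normal distribution."""
    one_sided = 0.5 * math.erfc(z / math.sqrt(2.0))
    return one_sided, 2.0 * one_sided

# Four standard deviations, as in the comparison of the two extreme conditions
one_sided, two_sided = gaussian_tail(4.0)
print(f"one-sided tail: {100 * one_sided:.4f} %")
print(f"two-sided tail: {100 * two_sided:.4f} %")
```

      Under this assumption both tails fall below the 0.007% bound stated above; the sketch covers only that bound and makes no claim about the further one-in-4 x 10^6 estimate.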

      As in our previous response, the method can detect changes of this magnitude, which are not small but physiologically relevant and expected for integral membrane proteins such as mPPases. Indeed, even in equally (or more) complex systems such as heptameric mechanosensitive channel proteins, DEER provided sub-Angstrom accuracy when a spin-labelled high-resolution XRC structure was solved (Pliotas et al., PNAS, 2012; Pliotas et al., Nat Struct Mol Biol, 2015). Although this is an ideal and not very common case, in which DEER accuracy was experimentally validated by another high-resolution structural method on a modified membrane protein, it demonstrates the power of the method: especially when strong oscillations are present in the raw DEER data (as here for mPPase 525R1), Angstrom resolution is achievable in such challenging protein classes, even when multiple distances are present.

      (2) 'Based on the DEER data on the IDP-bound TmPPase, we observed significant deviations between the experimental and the in silico distances derived from the TmPPase:IDP X-ray structure for both cytoplasmic- (T211R1) and periplasmic-end (S525R1) sites (Figure 4D and Figure EV8D). This deviation could be explained by the dimer adopting an asymmetric conformation under the physiological conditions used for DEER, with one monomer in a closed state and the other in an open state.'

      Problem: The authors are trying to establish asymmetry using the DEER data. Unfortunately, no significant difference is observed (between simulation and experiment) for position 525 as the authors claim (Figure 4D bottom panel). The observed difference for position 112 must be accounted for by the flexibility and the data provide no direct evidence for any asymmetry.

      Reviewer 3 is wrong in suggesting that we are trying to prove asymmetry through the DEER data. That is a well-known fact in the literature (e.g., Vidilaseris et al, Sci Adv 2019, where we show (1) that the exit channel inhibitor ATC (i.e., close to 525) binds better in solution to the TmPPase:PPi complex than the TmPPase:PPi2 complex, and (2) that ATC binds in an asymmetric fashion to the TmPPase:IDP2 complex with just one ATC dimer on one of the exit channels). We merely use the DEER data to support this well-established fact.

      However, we agree that the DEER data in the presence of IDP do not provide direct proof of asymmetry; in particular, mutant T211R1 yields in silico distributions too short for measurement by DEER. It is possible that the deviations observed (and particularly likely for T211R1) arise from conformational heterogeneity in solution. We will rephrase this paragraph accordingly: “Owing to the broad nature of the T211R1 (cytoplasmic site) distance distributions, we refrain from interpreting shifts in these data. For 525R1 (periplasmic site), for which we obtained data of high quality (as also pointed out by both reviewers 2 and 3), we observed deviations between the experimental and the in silico distances derived from the TmPPase:IDP X-ray structure. While this deviation is less pronounced than for the +ZTD condition, the deviation is consistent with an asymmetric conformation in solution.”

      (3) 'Our new structures, together with DEER distance measurements that monitor the conformational ensemble equilibrium of TmPPase in solution, provide further solid experimental evidence of asymmetry in gating and transitional changes upon substrate/inhibitor binding.'

      Problem: See above. The DEER data do not support any asymmetry.

      We feel that the reviewer's comments here are somewhat unfounded. The DEER data (and we will limit discussion to the 525R1 mutant in this regard) satisfy the relevant criteria of the white paper (Schiemann et al., 2021, JACS) from the EPR community: the signal-to-noise ratio w.r.t. modulation depth is > 20 in all cases; replicates have been performed and will be added to the main text or supplementary material; labelling efficiency is near quantitative (evidenced by the lack of free spin label signal in the CW-EPR spectra); and the data were analysed using the CDA (now Figure EV10; we will promote these data to the main text) to avoid confirmation bias.

      While the DEER data do not prove asymmetry, we do not claim proof of asymmetry in the above sentence. We agree to rephrase the sentence above as: “Our new structures, together with DEER distance measurements that monitor the conformational ensemble of TmPPase in solution, do not exclude asymmetry in gating and transitional changes upon substrate/inhibitor binding and are consistent with our proposed model.” We feel that this reframed conjecture of asymmetry is well founded; indeed, comparing the experimental apo-state 525R1 distance distribution with in silico modelling performed on the hybridised asymmetric structure (i.e., comprising one monomer bound to Ca2+ and another bound to IDP) yields an overlap coefficient (Islam and Roux, JPC B, 2015) of >0.97. This implies the envelope of the modelled distance distribution is quantitatively inside the envelope of the experimental distance distribution. Thus, the DEER data do not exclude asymmetry (previously observed by time-resolved XRC) in solution. While we appreciate that ideally one would measure time-resolved DEER to directly correlate kinetics of conformational changes within the ensemble to the catalytic cycle of mPPase (and this is something we aim to do in the future), it is beyond the scope of this study.
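
      For readers unfamiliar with overlap measures between distance distributions, a minimal sketch follows. It assumes one common definition, the integral of the pointwise minimum of two area-normalised distributions sampled on the same distance grid; the definition used in the cited paper may differ, and the Gaussian curves, grid, and function names below are invented purely for illustration, not taken from the DEER data.

```python
import math
from typing import List, Sequence


def normalise(p: Sequence[float], dr: float) -> List[float]:
    """Scale a sampled distribution so that its area (sum * dr) equals 1."""
    area = sum(p) * dr
    return [v / area for v in p]


def overlap_coefficient(p: Sequence[float], q: Sequence[float], dr: float) -> float:
    """Integral of min(p, q) for two distributions on the same, evenly spaced grid.

    Assumption: this min-based definition; other overlap measures exist.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same grid")
    p_n, q_n = normalise(p, dr), normalise(q, dr)
    return sum(min(a, b) for a, b in zip(p_n, q_n)) * dr


def gaussian(r: float, mean: float, sd: float) -> float:
    """Unnormalised Gaussian, used here only to build toy distributions."""
    return math.exp(-0.5 * ((r - mean) / sd) ** 2)


if __name__ == "__main__":
    dr = 0.1
    grid = [20.0 + dr * i for i in range(601)]          # toy grid, 20 to 80 (arbitrary units)
    toy_a = [gaussian(r, 45.0, 4.0) for r in grid]      # invented curve
    toy_b = [gaussian(r, 45.5, 4.0) for r in grid]      # invented curve, slightly shifted
    print(f"overlap: {overlap_coefficient(toy_a, toy_b, dr):.3f}")
```

      Under this definition the value is 1 for identical distributions and approaches 0 for distributions that do not overlap, so a value above 0.97 would indicate that the two curves are nearly coincident.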

      Indeed, half-of-the-sites reactivity has been demonstrated in at least the following papers (Vidilaseris et al, Sci Adv, 2019; Strauss et al, EMBO Rep, 2024; Malinen et al, Prot Sci, 2022; Artukka et al, Biochem J, 2018; Luoto et al, PNAS, 2013). Half-of-the-sites activity requires asymmetry in the mechanism, and therefore asymmetric motions in the active site (viz 211) and exit channel (viz 525). As mentioned above, we have demonstrated this for other inhibitors (Vidilaseris et al 2019) and as part of a time-resolved experiment (Strauss et al 2024). In fact, given the wealth of evidence showing that the symmetrical crystal structures sample a non- or less-productive conformation of the protein, it would be quixotic to propose that the DEER experiments, performed in solution, do not generate asymmetric conformations. Such a proposal certainly would not obey Occam’s razor of choosing the simplest possible explanation that covers the data.

      (4) Based on these observations, and the DEER data for +IDP, which is consistent with an asymmetric conformation of TmPPase being present in solution, we propose five distinct models of TmPPase (Figure 7).

      Problem: Again, the DEER data do not support any asymmetry and the authors may revisit the proposed models.

      We respectfully disagree with the reviewer. Please see our detailed response above. However, in the revised version, we will clarify that the proposed models are not solely based on the DEER data but are grounded in both current and previously solved structures, with the DEER data providing additional consistency with these models.

      (5) 'In model 2 (Figure 7), one active site is semi-closed, while the other remains open. This is supported by the distance distributions for S525R1 and T211R1 for +Ca/ETD informed by DEER, which agrees with the in silico distance predictions generated by the asymmetric TmPPase:ETD X-ray structure'

      Problem: Neither convincing nor supported by the data

      We respectfully disagree with the reviewer. However, owing to the conformational heterogeneity of T211R1, in the revised version, we will exclude it in the above sentence, to the effect: Please see our detailed response above.

    1. Author Response:

      Thank you for your interest in our paper. We would also like to thank the anonymous reviewers for their critical and constructive comments. Although the reviewers found our work interesting, they raised several important concerns about our study. To address these concerns, we will mainly perform the following new experiments.

      1. Examine whether the antioxidant NAC can block SFN-induced TFEB nuclear translocation in NPC cells.

      2. Examine whether calcineurin inhibitors (FK506+CsA) or a Ca2+ inhibitor (BAPTA-AM) can block SFN-induced TFEB nuclear translocation in NPC cells.

      3. Investigate whether cholesterol is cleared by SFN-mediated activation of TFEB in tissues in vivo.

      4. Investigate whether SFN-evoked lysosomal exocytosis is TFEB-dependent by using TFEB-KO cells.

      5. Examine the effect of NPC1 deficiency on dextran trafficking by studying the localization of CF-dex and Lamp1.

      6. Perform cytotoxicity experiments to examine whether SFN at the concentrations used in this study is cytotoxic in various cell lines.

      In addition, according to the reviewers’ suggestions, we will make clarifications and corrections wherever appropriate in the manuscript. Below please find our point-by-point responses and plans to the reviewers’ comments.

      Reviewer #1 (Public review):

      Summary:

      The authors are trying to determine if SFN treatment results in dephosphorylation of TFEB, subsequent activation of autophagy-related genes, exocytosis of lysosomes, and reduction in lysosomal cholesterol levels in models of NPC disease.

      Strengths:

      (1) Clear evidence that SFN results in translocation of TFEB to the nucleus.

      (2) In vivo data demonstrating that SFN can rescue Purkinje neuron number and weight in NPC1-/- animals.

      Thank you for the support!

      Weaknesses:

      (1) Lack of molecular details regarding how SFN results in dephosphorylation of TFEB leading to activation of the aforementioned pathways. Currently, datasets represent correlations.

      Thank you for this constructive comment. The reviewer is right that the molecular mechanism of TFEB activation by SFN is not discussed in detail in this manuscript. This is because we have previously shown that SFN induces TFEB nuclear translocation via a Ca2+-dependent but MTOR (mechanistic target of rapamycin kinase)-independent mechanism through a moderate increase in reactive oxygen species (ROS), and that calcineurin-mediated TFEB dephosphorylation underlies SFN-induced TFEB activation. These data were published in Autophagy in 2021 (Li, Shao et al. 2021). Therefore, we did not cover this part in the present study. We will add the molecular mechanism of TFEB activation by SFN to the discussion. To further confirm this mechanism in NPC cells, we will also perform experiments to: 1) examine whether the antioxidant NAC can block SFN-induced TFEB nuclear translocation in NPC cells; 2) examine whether calcineurin inhibitors (FK506+CsA) can block SFN-induced TFEB nuclear translocation in NPC cells.

      (2) Based on the manuscript narrative, discussion, and data it is unclear exactly how steady-state cholesterol would change in models of NPC disease following SFN treatment. Yes, there is good evidence that lysosomal flux to (and presumably across) the plasma membrane increases with SFN. However, lysosomal biogenesis genes also seem to be increasing. Given that NPC inhibition, NPC1 knockout, or NPC1 disease mutations are constitutively present and the cell models of NPC disease contain lysosomes (even with SFN) how could a simple increase in lysosomal flux decrease cholesterol levels? It would seem important to quantify the number of lysosomes per cell in each condition to begin to disentangle differences in steady state number of lysosomes, number of new lysosomes, and number of lysosomes being exocytosed.

      Thank you for the suggestion. It is important to define the three states: 1) the original number of lysosomes, 2) the number of new lysosomes, and 3) the number of lysosomes being exocytosed. However, having checked the literature, we have so far found no good method that can clearly differentiate these three states of lysosomes.

      (3) Lack of evidence supporting the authors' premise that "SFN could be a good therapeutic candidate for neuropathology in NPC disease".

      Suggestion taken! We will investigate whether cholesterol is reduced by SFN-mediated activation of TFEB in vivo, to strengthen the point that SFN could be a potential therapeutic compound for NPC treatment. To avoid confusion, we have removed this sentence.

      Reviewer #2 (Public review):

      Summary:

      This study presents a valuable finding that the activation of TFEB by sulforaphane (SFN) could promote lysosomal exocytosis and biogenesis in NPC, suggesting a potential mechanism by SFN for the removal of cholesterol accumulation, which may contribute to the development of new therapeutic approaches for NPC treatment.

      Strengths:

      The cell-based assays are convincing, utilizing appropriate and validated methodologies to support the conclusion that SFN facilitates the removal of lysosomal cholesterol via TFEB activation.

      Weaknesses:

      (1) The in vivo experiments demonstrate the therapeutic potential of SFN for NPC. A clear dose-response analysis would further strengthen the proposed therapeutic mechanism of SFN. Additional data supporting the activation of TFEB by SFN for cholesterol clearance in vivo would strengthen the overall impact of the study

      We understand the reviewer’s point. We examined two doses of SFN: 30 and 50 mg/kg. As shown in Fig. 6, SFN at 50 mg/kg, but not at 30 mg/kg, prevents a degree of Purkinje cell loss in lobule IV/V of the cerebellum, suggesting a dose-correlated preventive effect of SFN. In vivo experiments with higher concentrations of SFN and an optimized dosage form are planned for a future study but will not be included in this study.

      We will investigate whether cholesterol was cleared by activation of TFEB by SFN in vivo.

      (2) In Figure 4, the authors demonstrate increased lysosomal exocytosis and biogenesis by SFN in NPC cells. Including a TFEB-KO/KD in this assay would provide additional validation of whether these effects are TFEB-dependent.

      Thank you for this valuable suggestion. We will investigate whether SFN-evoked lysosomal exocytosis is TFEB-dependent by using TFEB-KO cells.

      (3) For lysosomal pH measurement, the combination of pHrodo-dex and CF-dex enables ratiometric pH measurement. However, the pKa of pHrodo red-dex (according to Invitrogen) is ~6.8, while lysosomal pH is typically around 4.7. This discrepancy may account for the lack of observed lysosomal pH changes between WT and U18666A-treated cells. Notably, previous studies (PMID: 28742019) have reported an increase in lysosomal pH in U18666A-treated cells.
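
      The reviewer's concern about probe pKa can be made concrete with a short calculation. The sketch below assumes a simple single-site protonation model (Henderson-Hasselbalch) and uses only the pKa and pH values quoted in the comment above; real probe behaviour may be more complex, and the function name is hypothetical.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a single-site probe in its protonated form at a given pH.

    Assumption: ideal Henderson-Hasselbalch behaviour with one titratable site.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    pka = 6.8  # value quoted by the reviewer for pHrodo red-dex
    for ph in (4.7, 5.7, 6.8):
        frac = protonated_fraction(ph, pka)
        print(f"pH {ph}: protonated fraction = {frac:.3f}")
```

      Under this model a probe with a pKa of about 6.8 is already more than 99% protonated at pH 4.7, so a moderate rise in lysosomal pH would change its signal only slightly; this is the insensitivity the reviewer describes.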

      We understand the reviewer’s point. However, we used pHrodo™ Green-Dextran (P35368, Invitrogen), not pHrodo red-dex, to measure the lysosomal luminal acidity. According to the product information from Invitrogen, pHrodo Green-dex conjugates are non-fluorescent at neutral pH but fluoresce bright green at acidic pH (range 4-9), such as that in endosomes and lysosomes. Therefore, pHrodo Green-dex can be used to monitor the acidity of lysosomes (Hu, Li et al. 2022). We also used LysoTracker Red DND-99 (Thermo Scientific, L7528) to measure lysosomal pH (Fig. 4G, H), and the results are consistent with those of the pHrodo Green/CF measurement. Overall, in our hands, we have not detected a change in lysosomal pH in U18666A-treated NPC1 cell models.

      (4) The authors are also encouraged to perform colocalization studies between CF-dex and a lysosomal marker, as some researchers may be concerned that NPC1 deficiency could reduce or block the trafficking of dextran along endocytosis.

      Suggestion was taken! We will examine the effect of NPC1 deficiency on dextran trafficking by studying the localization of CF-dex and Lamp1.

      (5) In vivo data supporting the activation of TFEB by SFN for cholesterol clearance would significantly enhance the impact of the study. For example, measuring whole-animal or brain cholesterol levels would provide stronger evidence of SFN's therapeutic potential.

      We really appreciate the reviewer’s suggestions. We will investigate whether cholesterol was cleared by activation of TFEB by SFN in vivo.

      Reviewer #3 (Public review):

      Summary:

      The authors demonstrate that activation of TFEB facilitates cholesterol clearance in cell models of Niemann-Pick type C (NPC). This is done through a variety of approaches including activation of TFEB by sulforaphane (SFN), a naturally occurring small-molecule TFEB agonist. SFN induces TFEB nuclear translocation and promotes lysosomal exocytosis. In an NPC mouse model, SFN dephosphorylates/activates TFEB in the brain and rescues the loss of Purkinje cells.

      Strengths:

      NPC is a severe disease and there is little in the way of treatment. The manuscript points towards some treatment options. However, the title "Small-molecule activation of TFEB Alleviates Niemann-Pick Disease..." is far too strong and should be changed.

      Weaknesses:

      (1) The manuscript is extremely hard to read due to the writing; it needs careful editing for grammar and English.

      We will thoroughly check grammar to improve the manuscript.

      (2) There are a number of important technical issues that need to be addressed.

      We will address the technical issues mentioned in the following.

      (3) The TFEB influence on filipin staining in Figure 1A is somewhat subtle. In the mCherry alone panels there is a transfected cell with no filipin staining and the mCherry-TFEBS211A cells still show some filipin staining.

      We understand the reviewer’s point. We will investigate whether cholesterol is cleared by activation of TFEB by SFN in vivo.

      (4) Figure 1C is impressive for the upregulation of filipin with U18666A treatment. However, SFN is used at 15 microM. This must be hitting multiple pathways. Vauzour et al (PMID: 20166144) use SFN at 10 nM to 1microM. Other manuscripts use it in the low microM range. The authors should repeat at least some key experiments using SFN at a range of concentrations from perhaps 100 nM to 5 microM. The use of 15 microM throughout is an overall concern.

      We understand the reviewer’s point. As noted in our response to Reviewer #1, we have previously shown that SFN (10–15 μM, 2–9 h) induces robust TFEB nuclear translocation in a dose- and time-dependent manner in HeLa GFP-TFEB stable cells, as well as in other human cell lines, without cytotoxicity (Li, Shao et al. 2021). Based on these previous results, we chose SFN at 15 μM to examine its effect on cholesterol clearance in this study. We will add this information to the discussion. We will also perform dose-response TFEB nuclear translocation experiments in NPC model cells, as well as cytotoxicity experiments to examine whether the concentrations of SFN used in the various cell lines are toxic.

      References:

      Hu, M. Q., P. Li, C. Wang, X. H. Feng, Q. Geng, W. Chen, M. Marthi, W. L. Zhang, C. L. Gao, W. Reid, J. Swanson, W. L. Du, R. Hume and H. X. Xu (2022). "Parkinson's disease-risk protein TMEM175 is a proton-activated proton channel in lysosomes." Cell 185(13): 2292-+.

      Li, D., R. Shao, N. Wang, N. Zhou, K. Du, J. Shi, Y. Wang, Z. Zhao, X. Ye, X. Zhang and H. Xu (2021). "Sulforaphane activates a lysosome-dependent transcriptional program to mitigate oxidative stress." Autophagy 17(4): 872-887.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The work from Petazzi et al. aimed at identifying novel factors supporting the differentiation of human hematopoietic progenitors from induced pluripotent stem cells (iPSCs). The authors developed an inducible CRISPR-mediated activation strategy (iCRISPRa) to test the impact of newly identified candidate factors on the generation of hematopoietic progenitors in vitro. They first compared previously published transcriptomic data of iPSC-derived hemato-endothelial populations with cells isolated ex vivo from the aorta-gonad-mesonephros (AGM) region of the human embryo and they identified 9 transcription factors expressed in the aortic hemogenic endothelium that were poorly expressed in the in vitro differentiated cells. They then tested the activation of these candidate factors in an iPSC-based culture system supporting the differentiation of hematopoietic progenitors in vitro. They found that the IGF binding protein 2 (IGFBP2) was the most upregulated gene in arterial endothelium after activation and they demonstrated that IGFBP2 promotes the generation of functional hematopoietic progenitors in vitro.

      Strengths:

      The authors developed an extremely useful doxycycline-inducible system to activate the expression of specific candidate genes in human iPSC. This approach allows us to simultaneously test the impact of 9 different transcription factors on in vitro differentiation of hematopoietic cells, and the system appears to be very versatile and applicable to a broad variety of studies.

      The system was extensively validated for the expression of 1 transcription factor (RUNX1) in both HeLa cells and human iPSC, and a detailed characterization of this test experiment was provided.

      The authors exhaustively demonstrated the role of IGFBP2 in promoting the generation of functional hematopoietic progenitors in vitro from iPSCs. Even though the use of IGFBP2-interacting proteins IGF1 and IGF2 have been previously reported in human iPSC-derived hematopoietic differentiation in vitro (Ditadi and Sturgeon, Methods 2016; Ng et al., Nature Biotechnology 2016), and IGFBP-2 itself has been shown to promote adult HSC expansion ex vivo (Zhang et al., Blood 2008), its role on supporting in vitro hematopoiesis was demonstrated here for the first time.

      Weaknesses:

      Although the authors performed a very thorough characterization of the system in proof-of-principle experiments activating a single transcription factor, the data provided when 9 independent factors were used is not sufficient to fully validate the experimental strategy. Indeed, in the current version of the manuscript, it is not clear whether the results presented in both the scRNAseq analysis and the functional assays are the consequence of the simultaneous activation of all 9 TF or just a subset of them. This is essential to establish whether all the proposed factors play a role during embryonic hematopoiesis, and a more complete analysis of the scRNAseq dataset could help clarify this aspect.

      Similarly, the data presented in the manuscript are not sufficient to clarify at what stage of the endothelial-to-hematopoietic transition (EHT) the TF activation has an impact. Indeed, even though the overall increase of functional hematopoietic progenitors is fully demonstrated, the assays proposed in the manuscript do not clarify whether this is due to a specific effect at the endothelial level or to an increased proliferation rate of the generated hematopoietic progenitors. Similar conclusions can be applied to the functional validation of IGFBP2 in vitro.

      The overall conclusions are sometimes vague and not always supported by the data. For instance, the authors state that the CRISPR activation strategy resulted in transcriptional remodeling and a steer in cell identity, but they do not specify which cell types are involved and at what level of the EHT process this is happening. In the discussion, the authors also claim that they provided evidence to support that RUNX1T1 could regulate IGFBP2 expression. However, this is exclusively based on the enrichment of RUNX1T1 gRNA in cells expressing higher levels of IGFBP2 and it does not demonstrate any direct or indirect association of the two factors.

      We thank the reviewer for the positive comments about the importance of our work and have now addressed the points raised as weaknesses by performing additional analysis and experiments, adding a new schematic of the mechanism, and rewording our claims.

      We have clarified the different effects mediated by the activation and the IGFBP2 addition in a summary section at the end of the results and added Figure 6, showing this in visual form. We have also clearly stated the limitations related to the correlation between RUNX1T1 and IGFBP2 in the discussion and toned down our claims regarding this throughout the entire paper. We have also reworded the text to clarify the specific cell types identified in the sequencing data that we refer to.

      Reviewer #2 (Public Review):

      To enable robust production of hematopoietic progenitors in-vitro, Petazzi et al examined the role of transcription factors in the arterial hemogenic endothelium. They use IGFBP2 as a candidate gene to increase the directed differentiation of iPSCs into hematopoietic progenitors. They have established a novel induced-CRISPR mediated activation strategy to drive the expression of multiple endogenous transcription factors and show enhanced production of hematopoietic progenitors through expansion of the arterial endothelial cells. Further, upregulation of IGFBP2 in the arterial cells facilitates the metabolic switch from glycolysis to oxidative phosphorylation, inducing hematopoietic differentiation. While the overall study and resources generated are good, assertions in the manuscript are not entirely supported by the experimental data and some claims need further experimental validation.

      We thank the reviewer for the positive comments. We have provided new data and analysis to make sure that all our assertions are clearly supported, and we have reworded those for which the reviewers identified limitations.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      The assessment could change from "incomplete" to "solid" if the authors: i) improve data analysis (for both scRNAseq and functional assays) by providing additional information that could strengthen their conclusions, as suggested in the specific comments by both reviewers; ii) either provide new functional evidence supporting their mechanistic conclusion or alternatively tone down the claims that are not fully supported by data and acknowledge the limitations raised by reviewers in the discussion; (iii) the issue of paracrine signaling to expand only hematopoietic progenitors needs to be addressed.

      We have now improved the data analysis and provided additional functional tests to strengthen our conclusions and toned down those that were identified by the reviewers as not supported enough and included a discussion on these limitations. We have also reworded the section about the paracrine signaling throughout the paper.

      Reviewer #1 (Recommendations For The Authors):

      Figure 1 contains exclusively published data. It might be more appropriate to use it as a supplementary figure or as part of a more exhaustive figure (maybe combining Figures 1 and 2 together?).

      Figure 1 contains novel bioinformatic analyses that form the basis of our research, and it has a different content and focus from Figure 2, which is already a large figure. We therefore believe it is better to keep it as a separate figure, which now also contains a new panel.

      It seems there is an issue with Figure S3 labelling:

      • In line 112, Figure S2A-B does not display genomic PCR and sequencing results;

      • In line 123, Figure S3D-E does not show viability and proliferation data;

      • In line 127, Figure S3G does not show mCherry expression in response to DOX;

      We apologise for the confusion with the numbers; we have now correctly labelled the figures.

      It would be more informative to include gates and frequency on flow cytometry plots in Figure S3, to be able to evaluate the extent of the reduction in mCherry expression.

      We have now included the gating and frequency of mCherry-expressing cells in Supplementary Figure 3D.

      It is not clear from the text and figures whether the SB treatment was maintained throughout the hematopoietic differentiation protocol (line 122):

      • If so, it would be important to confirm that HDAC treatment does not affect EHT cultures

      • If not, can the authors provide some evidence that transgene silencing is not occurring during hematopoietic differentiation?

      We have clarified that we decided to treat the cells with SB exclusively in maintenance conditions because HDACs have been shown to be essential for the EHT (lines 138-142). We have now also included additional data showing the high expression of the mCherry tag reporting the iSAM expression on day 8 (Supplementary Figure 4F).

      Can the authors provide a simple diagram summarizing the experimental strategy for each differentiation experiment in the respective supplementary figure? For instance, at what stage of the protocol was DOX added in Figure 3? Or at what stage IGFBP2 was added in Figure 5? It would be a very useful addition to the interpretation of the results.

      We have now included three schematics for all the experiments in the manuscript in Supplementary Figure 4A-C.

      In Figure 3, the authors should provide more detailed information about the data filtering of the scRNAseq experiment, and more specifically:

      • How many cells were included in the analysis for each library after QC and filtering?

      • How "cells in which the gRNAs expression was detected" were selected? Do they include only cells showing expression of gRNAs for all 9 TF?

      This information is now included in the method section (lines 773-781); the detailed code is available on the GitHub link provided in the same section. We have filtered the cells expressing one gRNA for the non-targeting gRNA (iSAM_NT) control and more than one for the iSAM_AGM sample.
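
      The filtering rule described above can be sketched as follows. This is not the authors' code (which is stated to be on GitHub); it is a minimal illustration under stated assumptions: per-cell gRNA UMI counts are held in a plain dictionary, "one gRNA" for the control is read as "at least one detected gRNA", and all function names, guide names, and data are hypothetical.

```python
from typing import Dict

# Hypothetical structure: for each cell, UMI counts per detected gRNA.
GuideCounts = Dict[str, int]


def n_detected(guides: GuideCounts, min_umi: int = 1) -> int:
    """Number of distinct gRNAs with at least min_umi counts in one cell."""
    return sum(1 for umi in guides.values() if umi >= min_umi)


def keep_cell(guides: GuideCounts, sample: str) -> bool:
    """Sample-specific rule as read from the response (an interpretation)."""
    detected = n_detected(guides)
    if sample == "iSAM_NT":
        return detected >= 1  # assumption: "one gRNA" means at least one
    if sample == "iSAM_AGM":
        return detected > 1   # more than one gRNA detected
    raise ValueError(f"unknown sample: {sample}")


def filter_cells(cells: Dict[str, GuideCounts], sample: str) -> Dict[str, GuideCounts]:
    """Return only the cells that pass the rule for the given sample."""
    return {cell_id: g for cell_id, g in cells.items() if keep_cell(g, sample)}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {
        "cell_a": {"guide_1": 3, "guide_2": 1},
        "cell_b": {"guide_1": 2},
        "cell_c": {},
        "cell_d": {"guide_3": 0},
    }
    print(sorted(filter_cells(toy, "iSAM_AGM")))
    print(sorted(filter_cells(toy, "iSAM_NT")))
```

      Keeping the rule in one small function makes the detection threshold (here a hypothetical `min_umi` of 1) explicit and easy to vary when checking how sensitive downstream results are to it.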

      In Figure 3A, it is not clear whether the expression of the 9 factors is consistently detected in all cells or just a subset of them, and the heatmap in Figure 3A does not provide this information. It would be more accurate to provide expression on a per-cell basis, for instance, as a violin plot displaying single dots representing each cell. 

      We have now included this violin plot in Supplementary Figure 4G as requested. However, this visualisation is difficult to interpret because the expression of some of the target genes appears variable in both experimental and control conditions. We had anticipated that this could be the case, which is why we included the three different controls. For this reason, we chose to show the normalised expression, which takes all the different variables into account (Figure 3A).

      In Figure 3B-C, it seems that clusters EHT1 and EHT2 do not express endothelial markers anymore. Are these fully differentiated hematopoietic cells rather than cells undergoing EHT? In general, it would be quite important to provide evidence of expressed marker genes characterizing each cluster (eg. heatmap summarizing top DEG in the supplementary figure?). 

      We have now provided a spreadsheet containing the cluster markers that we used (Supplementary Table 1) and a heatmap in Figure 3E. Furthermore, we have now edited Figure 3C to include pan-endothelial markers (PECAM1 and CDH5). These data show that the EHT1 and EHT2 clusters both express endothelial markers, which are progressively downregulated, as expected during the endothelial-to-hematopoietic transition. We have also included and discussed this in the manuscript (lines 192-195) and added a schematic of the mechanism in Figure 6.

      In Figure 3E, displaying the proportion of clusters within each sample/library would be a more accurate way of comparing the cell types present in each library (removing potential bias introduced by loading different numbers of cells in each sample).

      We have now included the requested data in Supplementary Figure 4I and it confirms again the expansion of arterial cells in the activated cells.    
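
      The normalisation the reviewer asks for, cluster proportions within each library rather than raw cell counts, can be sketched as below. The library names, cluster labels, and counts are invented for illustration and do not reflect the actual dataset; the function name is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def cluster_proportions(cells: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Fraction of each library's cells that fall in each cluster.

    `cells` yields (library, cluster) pairs, one per cell.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for library, cluster in cells:
        counts[library][cluster] += 1
    return {
        library: {cluster: n / sum(c.values()) for cluster, n in c.items()}
        for library, c in counts.items()
    }


if __name__ == "__main__":
    # Toy data: two libraries loaded with different numbers of cells.
    toy = (
        [("lib_small", "arterial")] * 1
        + [("lib_small", "other")] * 3
        + [("lib_large", "arterial")] * 2
        + [("lib_large", "other")] * 6
    )
    for library, props in cluster_proportions(toy).items():
        print(library, props)
```

      In this toy case the larger library contains twice as many arterial cells in absolute terms, yet the arterial proportion is identical in both, which is the loading bias that per-library proportions are meant to remove.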

      In Figure 3G, by plating 20,000 total CD34+, the assay does not account for potential differences in sample composition. It is then hard to discriminate between the increased number of progenitors in the input or an enhanced ability of HE to undergo EHT. This is an important aspect to consider to precisely identify at what level the activation of the 9 factors is acting. A proper quantification of flow cytometry data summarizing the % of progenitors, arterial cells, etc. would be useful to interpret these results.

      Lines 204-205 have been reworded. We are very much aware that the CD34+ cell population consists of a range of cells across the EHT process, and this is precisely why we carried out the single-cell sequencing analyses. We purposely tested the effect of the observed changes in composition by colony assays.

      In Figure 3G, it seems that NT cells w/o DOX have very little CFU potential (if any). Can the authors provide an explanation for this?

      We think that the limited CFU potential is due to the extensive genetic manipulation and selection that the cells underwent for the derivation of all the iSAM lines, but this did not prevent us from observing an effect of gene activation on CFU numbers. This is one of the primary reasons that we then validated our overall findings using the parental iPSC line in control conditions and with the addition of IGFBP2. We show that the parental iPSC line gives rise to hematopoietic progenitors, both immunophenotypically (Figure 4D) and functionally, at expected levels (Figure 4B, left column).

      Figure 4A shows an upregulation of IGFBP2 in arterial cells as a result of TF activation. However, from the data presented here, it is not possible to evaluate whether this is specific to the arterial cluster, or it is a common effect shared by all cell types regardless of their identity. 

      Data has now been included in Supplementary Figure 4H, which shows that all the cells show an increase in IGFBP2, but arterial cells show the highest increase. We have now edited the text to reflect this, in lines 228-230.

      In Figure 5A-B only a minority of arterial cells express RUNX1 in response to IGFBP2 treatment. Is this sufficient to explain the very significant increase in the generation of functional hematopoietic progenitors described in Figure 4? Quantification and statistical analysis of RUNX1 upregulation would strengthen this conclusion.

      We have now provided the statistical analysis showing significant upregulation of RUNX1 upon IGFBP2 addition. The p values are now provided in the figure 5 legend.

      In Figure 5 the authors conclude that IGFBP2 remodels the metabolic profile of endothelial cells. However, it is not clear which cell types and clusters were included in the analysis of Figure 5C-G. Is the switch from Glycolysis to Oxidative Phosphorylation specific to endothelial cells? Or it is a more general effect on the entire culture, including hematopoietic cells? 

      We based this conclusion on the fact that the single-cell RNAseq allows us to verify that the metabolic differences are observed in the endothelial cells. Given that we sorted the adherent cells, the majority of these are endothelial cells, as shown in Figure 5A. The Seahorse pipeline includes a number of washing steps, so the analyses are performed on the adherent compartment, which we know consists primarily of endothelial cells. We cannot exclude some contamination from non-endothelial cells, but we highlight to the reviewer that the metabolic changes were initially identified in endothelial cells in the single-cell sequencing data. Taken together, we believe that this implies that the metabolic changes are specific to this population. We have clarified this in line 317.

      In the discussion, the authors conclude that they "provide evidence to support the hypothesis that RUNX1T1 could regulate IGFBP2 expression". To further support this conclusion, the authors could provide a correlation analysis of the expression of the two genes in the cell type of interest. 

      Following the observation of high IGFBP2 expression across clusters, we have now reworded this sentence in lines 382-385. We tried to perform the correlation analysis, but we believe it is not appropriate because of the detection level of the gRNA. We have now included this as a limitation in the discussion (lines 416-427) and have also toned down the conclusions we drew about RUNX1T1 throughout the manuscript.

      As mentioned by the authors, IGFBP2 binds IGF1 and IGF2 modulating their function. Both IGF1 (http://dx.doi.org/10.1016/j.ymeth.2015.10.001) and IGF2 (doi:10.1038/nbt.3702) have been used in iPSC differentiation into definitive hematopoietic cells. It would be relevant to discuss/reference this in the discussion.

      We have now included the suggested reference in the section where we discuss the role of IGFBP2 in binding IGF1 and IGF2.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure 1 compares the transcriptome of human AGM and in-vitro derived hemogenic endothelial cells (HECs). It is not clear why only the genes downregulated in the latter were chosen. Are there any significantly upregulated genes, knockdown/knockout which could also serve a similar purpose? Single-cell transcriptome database analysis is very preliminary. A detailed panel with differences in cluster properties of HECs between the two systems should be provided. A heatmap of all differentially expressed genes between the two samples must be generated, along with a logical explanation for choosing the given set of genes. 

      We have now included another panel in figure 1 to better clarify the logic behind the strategy used to identify our target genes (Figure 1A).

      (2) Figure 2 - a panel describing the workflow of gRNA design and targeting for the 9 candidate genes, along with lentiviral packaging and transduction would make it easier to follow. 

      We have now included three schematics for all the experiments in the manuscript in supplementary figure 4 A-C. 

      (3) Figure 3- to assess the effect of arterial cell expansion on the emergence of hematopoietic progenitors, CD34+ Dll4+ cells should be sorted for OP9 co-culture assay.

      Using only CD34+ cells does not answer the question raised. Also, the CFU assay performed does not fully support the claim of enhanced hematopoietic differentiation since only CFU-E and CFU-GM colonies are increased in Dox-treated samples, with no effect on other colony types. OP9 co-culture assay with these cells would be required to strengthen this claim. 

      We wanted to clarify that the effect on the methylcellulose coming from the activated cells was not limited to CFU-E, as the reviewer reported; instead, it also affected CFU-GM and CFU-M. 

      We have now performed additional experiments where we sorted the CD34+ compartment into DLL4- and DLL4+ in Supplementary Figure 5D-E, which we discussed in lines 250-258. 

      (4) In Figure 3F, there appears to be a lot of variation in the DLL4% fold change values for DOX treated iSAM_AGM sample, which weakens the claim of increased arterial expansion. Can the authors explain the probable reason? It is suggested that the two other controls (iSAM_+DOX and iSAM_-DOX) should be included in this analysis. It is imperative to also show % populations rather than just fold change to gain confidence.

      We agree that there is a lot of variability. That is because differentiation happens in 3D in embryoid bodies, which contain many different cell types that differentiate in different proportions across independent experiments. We have now included the raw data in Supplementary Figure 4 D, with additional statistical analysis to show the expansion of arterial cells including also the suggested additional controls.

      (5) How does activation of these target genes cause increased arterialization? Is the emergence of non-HE populations suppressed? Or is it specific to the HE? The data on this should be clarified and also discussed.

      We have provided additional data clarifying the connection between increased arterialisation and hemogenic potential. We showed that the activation induces increased arterialisation and that IGFBP2 acts by supporting the acquisition of hemogenic potential. We have discussed this in lines 326-348 and provided a new figure to explain this in detail (Figure 6).

      (6) Considering that IGFBP2 was chosen from the activated target gene(s) cluster, can the authors explain why the reduced CFU-M phenomenon observed in Figure 3G does not appear in the MethoCult assay for IGFBP2 treated cells (Figure 4B)?

      The difference could be explained by the fact that in Figure 3G, the cells underwent activation of multiple genes, while in Figure 4B, they were only exposed to IGFBP2. Our results show that IGFBP2 could at least partially explain the phenotype that we see with the activation, but we believe that during the activation experiments, there might be other signals available that might not be induced by IGFBP2 alone. We have also added a summary section and a figure to clarify the different mechanisms of action of the gene activation and IGFBP2.

      (7) Figure 4- while the experiments conducted support the role of IGFBP2 in increasing hematopoietic output, there is no experimental evidence to prove its function through paracrine signalling in HECs. The authors need to provide some evidence of how IGFBP2 supplementation specifically expands only the hematopoietic progenitors. Experimental strategies involving specifically targeting IGFBP2 in hemogenic/arterial endothelial cells are required to prove its cell type specific function. Additionally, assessing the in vivo functional potential of the hematopoietic cells generated in the presence of IGFBP2, by bone-marrow transplantation of CD34+ CD43+ cells, is essential. 

      The role of IGFBP2 in the context of HSC production and expansion was not the topic of our research, and we have not claimed that IGFBP2 affects the long-term repopulating capacity of HSPCs. Therefore, we believe that the requested experiments are not required to support the specific claims that we do make. We have now provided more experiments and bioinformatic analyses that support the role of IGFBP2 in inducing the progression of EHT from arterial cells to hemogenic endothelium, and to avoid misunderstandings, we have toned down our claims by editing the text regarding its paracrine effects.

      (8) Figure 4C-D: It is recommended to plot % populations along with fold change value. As this is a key finding, it is important to perform flow cytometry for additional hematopoietic markers (CD144, CD235a and CD41a) to demonstrate whether this strategy can also expand erythroid-megakaryocyte progenitors.

      Figure 4C already shows the percentage values; we have now added the percentage for Figure 4D in SF5C. We have also performed additional analysis as requested and added the data obtained to Supplementary Figure 5D.

      (9) In Figure 5, analysis showing the frequency of cells constituting different clusters, between untreated and IGFBP2-treated samples in the single-cell transcriptome analysis is essential. Additional experiments are required to validate the function of IGFBP2 through modulation of metabolic activity. Inhibition of oxidative phosphorylation in the IGFBP2-treated cells should reduce the hematopoietic output. Authors should consider doing these experiments to provide a stronger mechanistic insight into IGFBP2-mediated regulation of hematopoietic emergence.

      We have now included the requested cluster composition in Supplementary Figure 5F. We decided not to include further tests on the metabolic profile of IGFBP2-treated cells, as we have already discussed other papers showing, with selective inhibitors, that EHT coincides with a glycolysis-to-OxPhos switch.

      (10) It is very striking to see that IGFBP2 supplementation changes the transcriptional profile of developing hematopoietic cells by increasing transcription of OXPHOS-related genes with concomitant reduction of glycolytic signatures, particularly at Day 13. However, the mitochondrial ATP rate measurements do not seem convincing. The bioenergetic profiles show that when mitochondrial inhibitors are added, both groups exhibit decreased OCR values and, on the other hand, higher ECAR. This indicates that both groups have the capability to utilize OXPHOS or glycolysis and may only differ in their basal respiration rates.

      Differences in proliferation rate can cause basal respiration to change. There is no information on how the bioenergetic profile was normalized (cell no./protein amount). Given that IGFBP2 has been shown to increase proliferation, it is very likely that the cells treated with IGFBP2 proliferated faster and therefore have higher OCR. The data needs to be normalized appropriately to negate this possibility.

      We previously tested whether IGFBP2 causes an increase in proliferation by analysing the cell cycle of treated cells, as we initially thought this could be a mechanism of action. We have now provided the quantification of the cell cycle in cells treated with IGFBP2, which shows no effect on the cell cycle (Supplementary Figure 4E). Following this analysis, we decided to plate the same number of cells and check their density under the microscope before running the experiment; each experiment was done in triplicate for each condition. We have now added this information to the Methods section (lines 806-813). We did not comment on the basal difference, which we agree might be due to several factors; we compared only the difference in response to the inhibitors, which is not affected by the basal level but exclusively by the Δ values. We have also included the formulas used to calculate the ATP production rate.

      Overall, it appears that IGFBP2 does not seem to primarily cause metabolic changes, but simply accelerates the metabolic dependency on OXPHOS. Hence, the term 'metabolic remodelling' must be avoided unless IGFBP2 depletion/loss of function analysis is shown.

      We thank the reviewer for suggesting how to interpret the data about the dependency on OXPHOS. We have now changed the conclusions and claims about the effect of IGFBP2. We have also included a cell cycle analysis of the hematopoietic cells derived upon IGFBP2 addition to show that they do not differ in proliferation in a way that could cause the increase in colony formation we observed. Regarding the assay, we plated the same number of cells for each group to make sure we were comparing the same number of cells, which we also assessed under the microscope before the test, and we eliminated the suspension cells during the washes that preceded the measurement. The reviewer is correct in indicating that there is a basal difference in the OCR and ECAR values, where the IGFBP2 group is lower at the start and not higher, which would not conceal higher proliferation. Finally, the ATP production rate is calculated from the variation of OCR and ECAR upon the addition of inhibitors, which normalizes for the basal differences.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Summary:

      In this manuscript, the molecular mechanism of interaction of daptomycin (DAP) with bacterial membrane phospholipids has been explored by fluorescence and CD spectroscopy, mass spectrometry, and RP-HPLC. The mechanism of binding was found to be a two-step process. A fast reversible step of binding to the surface and a slow irreversible step of membrane insertion. Fluorescence-based titrations were performed and analysed to infer that daptomycin bound simultaneously two molecules of PG with nanomolar affinity in the presence of calcium. Conformational change but not membrane insertion was observed for DAP in the presence of cardiolipin and calcium.

      Strengths:

      The strength of the study is skillful execution of biophysical experiments, especially stopped-flow kinetics that capture the first surface binding event, and careful delineation of the stoichiometry.

      Weaknesses:

      The weakness of the study is that it does not add substantially to the previously known information and fails to provide additional molecular details. The current study provides incremental information on DAP-PG-calcium association but fails to capture the complex in mass spectrometry. The ITC and NMR studies with G3P are inconclusive. There are no structural models presented. Another aspect missing from the study is the reconciliation between PG in the monomer, micellar, and membrane forms.

      Besides the two-stage process, another important finding in the current work is the stable complex that plays a critical role in the drug uptake both in vitro and in B. subtilis. This complex has been shown to be a stable species in HPLC and its binding stoichiometry and affinity have been quantitatively characterized. The complex may not be stable enough in gas phase to be detected in the MS analysis, which was designed to detect the phospholipid and Dap components, not the complex itself. The structural model of this complex is clearly proposed and presented in Figure 6. 

      The NMR and ITC studies have a very clear conclusion that Dap has a weak interaction with the PG headgroup alone, which is unable to account for the Dap-PG interaction observed in the fluorescence studies. Thus, the whole PG molecule has to be involved in the interaction, leading to the discovery of the stable complex.  

      Reviewer #2 (Recommendations For The Authors):

      (1) I appreciate and agree with the comment that there are stages of daptomycin insertion, and these might involve the formation of different complexes with different binding partners (e.g. pre-insertion vs quaternary vs bactericidal). However, it seems like lipid II is an apparent participant in daptomycin membrane dynamics (Grein et al. Nature Communications 2020). It's not clear why this was excluded from analysis by the authors, or what basis there is for the discussion statement that the quaternary complex can shift into the bactericidal complex by exchanging 1 PG for lipid II. 

      We agree that lipid II and other isoprenyl lipids may be involved in the uptake and insertion of daptomycin into membrane according to the results of the Nat. Comm. paper. However, these isoprenyl lipids are very small components of the membrane in comparison to PG and their contribution to the drug uptake is thus expected to be much less significant. Nonetheless, we included farnesyl pyrophosphate (FPP) as an analog of bactoprenol pyrophosphate (C55PP), which was reported to have the same promoting effect as lipid II in the previous study, in our study but found no promoting effect in the fluorescence assay (Fig. 2B). In addition, no complex was formed when FPP replaced PG in our preparation and analysis of the drug-lipid complex. In consideration of these negative results and the expected small contribution, other isoprenyl lipids or their analogs were not included in the study.

      The statement about forming the proposed bactericidal complex from the identified complex is a speculation that is possible only when lipid II has a higher affinity for Dap than a PG ligand. To avoid confusion, we deleted the sentence in the revision.

      (2) The detailed examination of daptomycin dynamics, particularly on the millisecond scale, in this paper is ideal for characterizing the effect of lipid II on daptomycin insertion. It would be helpful to either include lipid II in some analyses (micelle binding, fluorescence shifts, CD) or at least address why it was excluded from the scope of this work.

      As mentioned in the response to the first comment, we did not exclude isoprenyl lipids in our study but used some of their analogs in the fluorescence assay. Besides FPP mentioned above, we also tested geranyl pyrophosphate and geranyl monophosphate but obtained the same negative results. Lipid II was not directly used because it is one of the three isoprenyl lipids reported to have the same promoting effects in the Nat. Comm. paper and also because its preparation is not easy. Even if lipid II were different from other isoprenyl lipids in promoting membrane binding, its contribution is likely negligible at the reversible stage compared to the phospholipids because of its minuscule content in bacterial membrane. This is the main reason we did not use the isoprenyl lipids in the fast kinetic study (this stage only involves reversible binding, not insertion). 

      (3) Grein et al. 2020 saw that PG did not have a strong effect on daptomycin interaction with membranes. I believe this discrepancy is more likely due to the complex physical parameters of supported bilayers versus micelles/vesicles or some other methodological variable, but if the authors have more insight on this, it would be valuable commentary in the discussion.

      We totally agree that the discrepancy is likely due to the different conditions in the assays. It is hard to tell exactly what causes the difference. Thus, we did not attempt to comment on the cause of this difference in the discussion.

      (4) Isolation of the daptomycin complex from B. subtilis cells clearly had different traces from the in vitro complex; is it possible that lipid II is present in the B. subtilis complex? If not, a time-course extraction could be useful to support the model that different complexes have different activities. Isolates from early-stage incubation with daptomycin may lack lipid II but isolates from longer incubations may have lipid II present as the complex shifts from insertion to bactericidal.

      From the day we isolated the complex from B. subtilis, we have been looking for evidence for the previously proposed lipid complexes containing lipid II or other isoprenyl lipids but have not been successful. We did not see any sign of lipid II or other isoprenyl lipids in the MALDI or ESI mass spectroscopic data. The minute peaks in the HPLC traces are not the expected complexes in separate LC-MS analysis. However, this does not mean that such complexes are not present in the isolated PG-containing complex because: (1) the amount of such complexes may be too small to be detected due to the low content of the isoprenyl lipids; (2) the isoprenyl lipids, particularly lipid II, are not easily ionizable due to their size and unique structure for detection in mass spectrometry. 

      We don’t think the drug treatment time is the reason for the failure in detecting lipid II or other isoprenyl lipids. In our reported experiment, the cells were treated with a very high dose of Dap for 2 hours before extraction. In a separate experiment done recently, we treated B. subtilis at 1/3 of the used dose under the same condition and found all treated cells were dead after 1 hour in a titration assay, consistent with the results from reported time-killing assays in the literature. From this result, the proposed bactericidal lipid-containing complex should have been formed in the treated cells used in our extraction and isolated along with the PG-containing complex. It was not detected likely due to the reasons discussed above. To avoid the interference of the PG-containing complex, a large amount of bacterial cells might have to be treated at a low dose to isolate enough amount of the lipid II-containing complex for identification. However, isolation or identification of the lipid II-containing complex is outside the scope of the current investigation and is therefore not pursued. 

      (5) Part of the daptomycin mechanism of interacting with bacterial membranes involves the flipping of daptomycin from one leaflet to another. There was some mentioned work on the consistency of results between micelles and vesicles, but the dynamics or existence of a flipping complex in the bilayer system wasn't addressed at all in this paper.

      The current investigation makes no attempt to solve all problems in the daptomycin mode of action and is limited to the uptake of the drug, up to the point when Dap is inserted into the membrane. Within this scope, flipping of the complex is not yet involved and is thus irrelevant to the study. How the complex is flipped and used to kill the bacteria is what should be investigated next.  

      (6) The authors mention data with phosphatidylethanolamine in the text, but I could not find the data in the main or supplemental figures. I recommend including it in at least one of the figures.

      It is much appreciated that this error was identified. The POPE data was lost when the graphic (Fig. 2B) was assembled in Adobe to create Figure 2. We have redrawn the graphic and reassembled the figure to solve this problem. Fig. 2B has also been modified to use micromolar for the concentration of the lipids.

      (7) Readability point: I'd suggest some consistency in the concentrations mentioned. Making the concentrations either all molar-based or all percentage-based would make comparison across figures easier.

      As suggested, we have changed the % into micromolar concentrations in Fig. 2B and also in Fig. 3A. 

      (8) The model figure is quite difficult to interpret, particularly the final stage of the tail unfolding. I recommend the authors use a zoomed-in inset for this stage, or at least simplify the diagram by removing the non-participating lipid structures. The figure legend for the model figure should also have a brief description of the events and what the arrows mean, particularly the POPS PG arrow in the final panel of the figure. I am assuming here the authors are implying that daptomycin can transiently interact with one lipid species and move to another, but the arrow here suggests that daptomycin is moving through the lipid headgroup space.

      We really appreciate the suggestions. As suggested, we put an inset to show the pre-insertion complex more clearly. In addition, we have removed the green arrows originally intended to show the re-organization/movement of the phospholipids. Moreover, the legend is changed to ‘Proposed mechanism for the two-phased uptake of Dap into bacterial membrane. In the first phase, Dap reversibly binds to negative phospholipids with a hidden tail in the headgroup region, where it combines with two PG molecules to form a pre-insertion complex. In the second phase, the hidden tail unfolds and irreversibly inserts into the membrane. The inset shows the headgroup of the pre-insertion complex with the broad arrow showing the direction for the unfolding of the hidden tail. The red dots denote Ca2+.’

      (9) The authors listed the Kd for daptomycin and 2 PG as 7.2 × 10⁻¹⁵ M². Is this correct? This is an affinity in the femtomolar range.

      Please note that this Kd is for the simultaneous binding of two PG molecules, not for the binding of a single ligand, which is the usual meaning of Kd. Assuming that each PG contributes equally to this interaction, the binding affinity for each ligand is then the square root of 7.2 × 10⁻¹⁵ M², which equals 8.5 × 10⁻⁸ M. This is equivalent to a nanomolar affinity for PG and is a reasonably high affinity.
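
      The arithmetic behind this conversion can be checked with a short calculation. This is a minimal sketch that relies on the same assumption stated above, namely that both PG ligands contribute equally, so that the per-ligand constant is the square root of the overall constant; the variable names are ours.

```python
import math

# Overall dissociation constant for Dap binding two PG molecules (units: M^2).
kd_two_pg = 7.2e-15

# Assumption: both PG ligands contribute equally, so the per-ligand
# dissociation constant is the square root of the overall value (units: M).
kd_per_pg = math.sqrt(kd_two_pg)

print(f"Per-ligand Kd = {kd_per_pg:.2e} M")  # prints: Per-ligand Kd = 8.49e-08 M
```

      The result, about 85 nM, lies in the nanomolar range rather than the femtomolar range.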

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors reported an increase in daptomycin intensity with the increasing amount of negatively charged DMPG. A similar observation has been reported for GUVs, however, the authors did not refer to this paper in their manuscript: E. Krok, M. Stephan, R. Dimova, L. Piatkowski, Tunable biomimetic bacterial membranes from binary and ternary lipid mixtures and their application in antimicrobial testing, Biochim. Biophys. Acta - Biomembr. 1865 (2023) [1]. This paper is also consistent with the authors' observation that there is negligible fluorescence detected for the membranes composed of PC lipids upon exposure to the Dap treatment.

      As suggested, this paper is cited as ref. 29 in the revision by adding the following sentence at the end of the section ‘Dependence of Dap uptake on phosphatidylglycerol.’: ‘PG-dependent increase of the steady-state fluorescence was also observed in giant unilamellar vesicles (GUVs).29’. The numbering is changed accordingly for the remaining references.  

      (2) Please include the plot of the steady-state Kyn fluorescence vs the content of POPA (Figure 2C shows traces for DMPG, CL, and POPS). Both POPA and POPS lipids are negatively charged, however, POPS seems to interact with Dap, while POPA does not. In my opinion, this observation is really interesting and might deserve a more thorough discussion. The authors might want to describe what could be the mechanism behind this lipid-specific mode of binding.

      As suggested, a plot is now added for POPA in Fig. 2C, which is basically a flat line with no significant increase in the Kyn fluorescence. Indeed, the different effects of the negative phospholipids are very interesting, indicating that the reversible binding of Dap to the lipid surface depends not only on the Ca2+-mediated ionic interaction but also on the structure of the headgroup. In other words, Dap recognizes the phospholipids at the surface binding stage. Considering this headgroup specificity, the last sentence in the second paragraph of ‘Discussion’ is changed from ‘In addition, due to the low lipid specificity, this reversible binding likely involves Ca2+-mediated ionic interaction between Dap and the phosphoryl moiety of the headgroups.’ to ‘In addition, due to the specificity for negative phospholipids (Fig. 2B and 2C), this reversible binding of Dap likely involves both a nonspecific Ca2+-mediated ionic interaction and a specific interaction with the remaining part of the headgroups.’

      (3) The authors write that they propose a novel mechanism for the Ca2+-dependent insertion of Dap to the bacterial membrane, however, they rather ignored the already published findings and hypotheses regarding this process. In fact the role of Ca2+, as well as the proposed conformational changes of Dap, which allow its deeper insertion into the membrane are well known:

      The role of Ca2+ ions in the mechanism of binding is actually three-fold: (i) neutralization of daptomycin charge [2], (ii) creating the connection between lipids and daptomycin and (iii) inducing two daptomycin conformational changes. It should be noted that the interactions between calcium ions and daptomycin are 2-3 orders of magnitude stronger than between daptomycin and PG lipids [3,4]. Thus, upon the addition of CaCl2 to the solution, the divalent cations of calcium bind preferentially to the daptomycin, rather than to the negatively charged PG lipids, which results in the decrease of daptomycin net negative charge but also leads to its first conformational change [4]. Upon binding between calcium ions and two aspartate residues, the area of the hydrophobic surface increases, which allows the daptomycin to interact with the negatively charged membrane. In the next step, Ca2+ acts as a bridge connecting daptomycin with the anionic lipids. This event leads to the second conformational change, which enables deeper insertion of daptomycin into the lipid membrane and enables its fluorescence [4]. The overall mechanism has a sequential character, where the binding of daptomycin-Ca2+ complex to the negatively charged PG (or CA) occurs at the end.

      The authors should focus on emphasizing the novelty of their manuscript, keeping in mind the already published paper.

      We agree with the comments on the three general roles of calcium ion in the Dap interaction with membrane. The current investigation does not ignore the previous findings, which involve many more works than mentioned above, but takes these findings as common knowledge. Actually, the role of calcium ion is not the focus of current work. Instead, the current work focuses on how the drug is taken up and inserted into the membrane in the presence of the ion and how its structure changes in this process. With the known roles of calcium ion in mind, we propose an uptake mechanism (Fig. 6) that shows no conflict with the common knowledge.

      We would like to point out that the ‘deeper insertion into the membrane’ in the comment is different from the membrane insertion referred to in our manuscript. This ‘deeper insertion’ still remains in the reversible stage of binding to the membrane surface because all negative phospholipids can do this (causing a conformational change and fluorescence increase, as quantified in Fig.2C) but now we know that only PG can enable irreversible membrane insertion because of our work. In addition, the comment that calcium binding to daptomycin causes first conformational change is not supported by our finding that no conformational change is found for Dap in the presence of calcium in a lipid-free environment (Fig. 3B). One important aspect of novelty and contribution of our work is to clear up some of these ambiguities in the literature. Another contribution of our work is to demonstrate the formation of a stable complex between Dap and PG with a defined stoichiometry and its crucial role in the drug uptake. 

      (4) One paragraph in the section "Ca2+- dependent interaction between Dap and DMPG" is devoted to a discussion of the formation of precipitate upon extraction of DMPG-containing micelles, exposed to Dap in the calcium-rich environment. In contrast, in the absence of Dap, no precipitate was detected. The authors did not provide any visual proof for their statement. Please include proper photographs in the supplementary information.

      The precipitate formed upon extraction of the DMPG-containing micelles was too small to be visually identifiable but could be collected by centrifugation and detected by fluorescence or HPLC after dissolving in DMSO. For visualization, we show below the precipitate formed using higher amounts of Dap and DMPG. The Dap-DMPG-Ca2+ complex (left tube) was formed by mixing 1 mM Dap, 2 mM DMPG and 1 mM Ca2+ and the control (right tube) was a mixture of 2 mM DMPG and 1 mM Ca2+. This is now added as Fig. S7 in the supplementary information (the index is modified accordingly) and cited in the main text.

      (5) The authors wrote that it is not clear how many calcium ions are bound to Dap-2PG complex (page 11, Discussion section). There are already reports discussing this issue. I recommend citing the paper discussing that exactly two Ca2+ ions bind to a single Dap molecule: R. Taylor, K. Butt, B. Scott, T. Zhang, J.K. Muraih, E. Mintzer, S. Taylor, M. Palmer, Two successive calcium-dependent transitions mediate membrane binding and oligomerization of daptomycin and the related antibiotic A54145, Biochim. Biophys. Acta - Biomembr. 1858, (2016) 1999-2005 [5]

      We were aware of the cited work that shows binding of two Ca2+ but also noted that there are more works showing one Ca2+ in the binding, such as the paper in [Ho, S. W., Jung, D., Calhoun, J. R., Lear, J. D., Okon, M., Scott, W. R. P., Hancock, R. E. W., & Straus, S. K. (2008), Effect of divalent cations on the structure of the antibiotic daptomycin. European Biophysics Journal, 37(4), 421–433.]. That was the reason we said ‘it is not clear how many calcium ions are bound to Dap-2PG complex’. Now, both papers are cited (as Ref. #33, 34) to support this statement.

      (6) The authors wrote two contradictory statements:

      -  PG cannot be found in mammalian cell membranes:

      "Moreover, the complete dependence of the membrane insertion on PG also explains why Dap selectively attacks Gram-positive bacteria without affecting mammalian cells, because PG is present only in bacterial membrane but not in mammalian membrane. " (Page 10, Discussion section, last sentence of the first paragraph)

      "However, Dap absorbed on bacterial surface is continuously inserted into the acyl layer via formation of complex with PG in a time scale of minutes, whereas no irreversible insertion of Dap occurs on mammalian membrane due to the absence of PG while the bound Dap is continuously released to the circulation as the drug is depleted by the bacteria." (Page 13, Discussion section)

      -  PG in trace amounts is present in mammalian membranes:

      "The proposed requirement of the pre-insertion quaternary complex increases the threshold of PG content for the membrane insertion to happen and thus makes it impossible on the surface of mammalian cells even if their plasma membrane contains a trace amount of PG." (Page 13, Discussion section).

      In fact, phosphatidylglycerol comprises 1-2 mol% of the mammalian cell membranes. Please, correct this information, which in this form is misleading to the readers.

      We appreciate the comments about the PG content in mammalian cells. Changes are made as listed below:

      (1) p10, the sentence is changed to ‘Moreover, the complete dependence of the membrane insertion on PG also explains why Dap selectively attacks Gram-positive bacteria without affecting mammalian cells, because PG is a major phospholipid in bacterial membrane but is a minor component in mammalian membrane.’ 

      (2) p13, the sentence is changed to ‘However, Dap absorbed on bacterial surface is continuously inserted into the acyl layer via formation of complex with PG in a time scale of minutes, whereas little irreversible insertion of Dap occurs on mammalian membrane due to the low content of PG while the bound Dap is continuously released to the circulation as the drug is depleted by the bacteria.’

      (3) p13, another sentence is modified to ‘The proposed requirement of the pre-insertion quaternary complex increases the threshold of PG content for the membrane insertion to happen and thus makes it less likely on the surface of mammalian cells that contain PG at a low level in the membrane.’ 

      (7) Please include information that Dap is effective only against Gram-positive bacteria and does not show antimicrobial properties against Gram-negative strains. The authors focused on emphasizing that Dap does not affect mammalian membranes, most likely due to the low PG content, however even membranes of Gram-negative bacteria are not susceptible to the Dap, despite the relatively high content of negatively charged PG in the inner membrane (e.g. inner cell membrane of E. coli has ~20% PG).

      The requested information is already included in ‘Introduction’. In this part, Dap is introduced as being active only against Gram-positive bacteria, implying that it is not active against Gram-negative bacteria. The reason Dap is inactive against E. coli or other Gram-negative bacteria is that the outer membrane prevents the antibiotic from accessing the PG in the inner membrane to cause any harm. When the outer membrane is removed, Dap will also attack the plasma membrane of Gram-negative bacteria.

      Literature cited in the comments:

      (1) E. Krok, M. Stephan, R. Dimova, L. Piatkowski, Tunable biomimetic bacterial membranes from binary and ternary lipid mixtures and their application in antimicrobial testing, Biochim. Biophys. Acta - Biomembr. 1865 (2023). https://doi.org/10.1101/2023.02.12.528174.

      (2) S.W. Ho, D. Jung, J.R. Calhoun, J.D. Lear, M. Okon, W.R.P. Scott, R.E.W. Hancock, S.K. Straus, Effect of divalent cations on the structure of the antibiotic daptomycin, Eur. Biophys. J. 37 (2008) 421-433. https://doi.org/10.1007/S00249-007-0227-2.

      (3) A. Pokorny, P.F. Almeida, The Antibiotic Peptide Daptomycin Functions by Reorganizing the Membrane, J. Membr. Biol. 254 (2021) 97-108. https://doi.org/10.1007/s00232-021-00175-0.

      (4) L. Robbel, M.A. Marahiel, Daptomycin, a bacterial lipopeptide synthesized by a nonribosomal machinery, J. Biol. Chem. 285 (2010) 27501-27508. https://doi.org/10.1074/JBC.R110.128181.

      (5) R. Taylor, K. Butt, B. Scott, T. Zhang, J.K. Muraih, E. Mintzer, S. Taylor, M. Palmer, Two successive calcium-dependent transitions mediate membrane binding and oligomerization of daptomycin and the related antibiotic A54145, Biochim. Biophys. Acta - Biomembr. 1858 (2016) 1999-2005. https://doi.org/10.1016/J.BBAMEM.2016.05.020.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work used a comprehensive dataset to compare the effects of species diversity and genetic diversity within each trophic level and across three trophic levels. The results showed that species diversity had negative effects on ecosystem functions, while genetic diversity had positive effects. These effects were observed only within each trophic level and not across the three trophic levels studied. Although the effects of biodiversity, especially genetic diversity across multi-trophic levels, have been shown to be important, there are still very few empirical studies on this topic due to the complex relationships and difficulty in obtaining data. This study collected an excellent dataset to address this question, enhancing our understanding of genetic diversity effects in aquatic ecosystems.

      Strengths:

      The study collected an extensive dataset that includes species diversity of primary producers (riparian trees), primary consumers (macroinvertebrate shredders), and secondary consumers (fish). It also includes the genetic diversity of the dominant species at each trophic level, biomass production, decomposition rates, and environmental data.

      The conclusions of this paper are mostly well supported by the data and the writing is logical and easy to follow.

      Weaknesses:

      (1) While the dataset is impressive, the authors conducted analyses more akin to a "meta-analysis," leaving out important basic information about the raw data in the manuscript. Given the complexity of the relationships between different trophic levels and ecosystem functions, it would be beneficial for the authors to show the results of each SEM (structural equation model).

      We understand the point raised by the reviewer. We now provide individual SEMs (Figure 3), although we limit causal relationships to those for which the p-value was below 0.2 for the sake of graphical clarity. We also provide the percentage of explained variance for each ecosystem function. We detail the graphs in the Results section (see l. 317-328) and discuss them (see l. 387-398). Note that we do not detail each function separately as this would (in our opinion) result in a long descriptive paragraph from which it might be difficult to extract the key information. Rather, we summarize the percentage of explained variance for each function and discuss the strength of environmental vs biodiversity effects for some examples. In the Discussion, we explain why environmental effects (on functions and biodiversity) are relatively weak. We mainly attribute this to the sampling scheme, which follows an East-West gradient (weak altitudinal range) rather than an upstream-downstream gradient as is traditionally done in rivers. The reasoning behind this sampling scheme is explained in our companion paper (Fargeot et al. Oikos 2023), to which we now refer more explicitly in the MS. Briefly, using an upstream-downstream gradient would certainly have increased the effects of the environment, but it would have made the inference of biodiversity effects extremely complex because of strong collinearity among environmental and biodiversity parameters.

      (2) The main results presented in the manuscript are derived from a "metadata" analysis of effect sizes. However, the methods used to obtain these effect sizes are not sufficiently clarified. By analyzing the effect sizes of species diversity and genetic diversity on these ecosystem functions, the results showed that species diversity had negative effects, while genetic diversity had positive effects on ecosystem functions. The negative effects of species diversity contradict many studies conducted in biodiversity experiments. The authors argue that their study is more relevant because it is based on a natural system, which is closer to reality, but they also acknowledge that natural systems make it harder to detect underlying mechanisms. Providing more results based on the raw data and offering more explanations of the possible mechanisms in the introduction and discussion might help readers understand why and in what context species diversity could have negative effects.

      We now provide more details. Unfortunately, we are not sure that this helped in reaching a stronger explanation of the underlying mechanisms. To be frank, we did not succeed in improving mechanistic inferences based on the outputs of the SEM models. We visually explored some additional relationships (e.g. relationships between the biomass of the focal species and that of other species in the assemblage) that we now discuss a bit more, but again, this did not really help in better understanding processes. We realize this is a limitation of our study and that this can be frustrating for readers. Nonetheless, as said in the Discussion, field-based studies must be taken for what they are: observational studies forming the basis for future mechanistic studies. Although we failed to explain mechanisms, we still think that we provide important field-based evidence for the importance of biodiversity (as a whole) for ecosystem functions.

      (3) Environmental variation was included in the analyses to test if the environment would modulate the effects of biodiversity on ecosystem functions. However, the main results and conclusions did not sufficiently address this aspect.

      This is now addressed; see our response to your first comment. We now explain (Results section) and discuss environmental effects. As explained in the MS, environmental effects are similar in strength to those of biodiversity and are not that high, which is partly explained by the sampling scheme (see Fargeot et al. 2023). This is a choice we made at the onset of the study, as we wanted to focus on biodiversity effects and avoid the strong collinearity that is generally found in rivers (and that impedes robust statistical inference).

      Reviewer #2 (Public review):

      Summary:

      Fargeot et al. investigated the relative importance of genetic and species diversity on ecosystem function and examined whether this relationship varies within or between trophic-level responses. To do so, they conducted a well-designed field survey measuring species diversity at 3 trophic levels (primary producers [trees], primary consumers [macroinvertebrate shredders], and secondary consumers [fishes]), genetic diversity in a dominant species within each of these 3 trophic levels and 7 ecosystem functions across 52 riverine sites in southern France. They show that the effect of genetic and species diversity on ecosystem functions are similar in magnitude, but when examining within-trophic level responses, operate in different directions: genetic diversity having a positive effect and species diversity a negative one. This data adds to growing evidence from manipulated experiments that both species and genetic diversity can impact ecosystem function and builds upon this by showing these effects can be observed in nature.

      Strengths:

      The study design has resulted in a robust dataset to ask questions about the relative importance of genetic and species diversity of ecosystem function across and within trophic levels.

      Overall, their data supports their conclusions - at least within the system that they are studying - but as mentioned below, it is unclear from this study how general these conclusions would be.

      Weaknesses:

      (4) While a robust dataset, the authors only show the data output from the SEM (i.e., effect size for each individual diversity type per trophic level (6) on each ecosystem function (7)), instead of showing much of the individual data. Although the summary SEM results are interesting and informative, I find that a weakness of this approach is that it is unclear how environmental factors (which were included but not discussed in the results) nor levels of diversity were correlated across sites. As species and genetic diversity are often correlated but also can have reciprocal feedbacks on each other (e.g., Vellend 2005), there may be constraints that underpin why the authors observed positive effects of one type of diversity (genetic) when negative effects of the other (species). It may have also been informative to run SEM with links between levels of diversity. By focusing only on the summary of SEM data, the authors may be reducing the strength of their field dataset and ability to draw inferences from multiple questions and understand specific study-system responses.

      We have addressed this remark and we ask the reviewers and the readers to refer to our response to comment 1 from reviewer 1. Regarding co-variation among biodiversity estimates (SGDCs according to Vellend’s framework), we have addressed these issues in a companion paper that we now cite and expand on further in the MS (Fargeot et al. Oikos, 2023). Given the size of the dataset and its complexity (and associated analyses), we decided to focus on patterns of species and genetic biodiversity in a first paper (the Oikos paper) and then on the link between biodiversity and functions (this paper). As can be read in the Oikos paper, there is no co-variation in terms of biodiversity estimates; species diversity is not correlated with genetic diversity, and within each facet, there is no co-variation among species. In addition, environmental predictors are highly estimate-specific (i.e. environmental predictors sustaining species and genetic estimates are idiosyncratic). As a result (see the new Figure 3), environmental effects are relatively weak (of the same intensity as those of biodiversity) and collinearity among parameters is relatively weak. The second point is important, as it permits better inference of parameters from the models and allows us to discuss direct relationships (as observed in Figure 3, indirect environmental effects are relatively rare). We provide in the Discussion a bit more explanation about the absence of co-variation among biodiversity estimates (see l. 433-440).

      (5) My understanding of SEM is it gives outputs of the strength/significance of each pathway/relationship and if so, it isn't clear why this wasn't used and instead, confidence intervals of Z scores to determine which individual BEFs were significant. In addition, an inclusion of the 7 SEM pathway outputs would have been useful to include in an appendix.

      We now provide p-values (Table S2) and the seven models (Figure 3).

      (6) I don't fully agree with the authors calling this a meta-analysis as this is a single study of multiple sites within a single region and a specific time point, and not a collection of multiple studies or ecosystems conducted by multiple authors. Moreso, the authors are using meta-analysis summary metrics to evaluate their data. The authors tend to focus on these patterns as general trends, but as the data is all from this riverine system this study could have benefited from focusing on what was going on in this system to underpin these patterns. I'd argue more data is needed to know whether across sites and ecosystems, species diversity and genetic diversity have opposite effects on ecosystem function within trophic levels.

      We agree. “Meta-regression” would perhaps be more adequate than “meta-analyses”. We changed the formulation.

      Reviewer #3 (Public review):

      The manuscript by Fargeot and colleagues assesses the relative effects of species and genetic diversity on ecosystem functioning. This study is very well written and examines the interesting question of whether within-species or among-species diversity correlates with ecosystem functioning, and whether these effects are consistent across trophic levels. The main findings are that genetic diversity appears to have a stronger positive effect on function than species diversity (which appears negative). These results are interesting and have value.

      However, I do have some concerns that could influence the interpretation.

      (7) Scale: the different measures of diversity and function for the different trophic levels are measured over very different spatial scales, for example, trees along 200 m transects and 15 cm traps. It is not clear whether trees 200 m away are having an effect on small-scale function.

      Tree identification and invertebrate (and fish) sampling were done at the same scale. Trees are spread along the river so that their leaves fall directly into the river. Traps were installed all along the same transect in various micro-habitats. Diversity has been measured at exactly the same scale for all organisms. We have modified the MS to make this clear.

      (8) Size of diversity gradients: More information is needed on the actual diversity gradients. One of the issues with surveys of natural systems is that they are of species that have already gone through selection filters from a regional pool, and theoretically, if the environments are similar, you should get similar sets of species, without monocultures. So, if the species diversity gradients range from say, 6 to 8 species, but genetic diversity gradients span an order of magnitude more, you can explain much more variance with genetic diversity. Related to this, species diversity effects on function are often asymptotic at high diversity and so if you are only sampling at the high diversity range, we should expect a strong effect.

      Fish species number varies from 1 to 11, invertebrate family number varies from 15 to 42 and the tree species number varies from 7 to 20 (see Fargeot et al. 2023 for details). We have added this information in the M&M. The gradients are hence relatively large and do not cover a restricted set of values. There is a variance in species number among sites, even if sites are collected along a relatively weak altitudinal gradient. This is obviously complex to compare to SNP (genomic) diversity. Genetic and species effects are similar in effect sizes (percentage of explained variance), so it does not seem we have biased one of the two gradients of biodiversity.

      (9) Ecosystem functions: The functions are largely biomass estimates (except decomposition), and I fail to see how the biomass of a single species can be construed as an ecosystem function. Aren't you just estimating a selection effect in this case?

      The biomass estimated for a certain area represents an estimate of productivity, whatever the number of species being considered. Obviously, productivity of a species can be due to environmental constraints; the biomass is expected to be lower at the niche margin (selection effect). But if these environmental effects are taken into account (which is the case in the SEMs), then the residual variation can be explained by biodiversity effects. We provide an explanation (l. 217-219).

      (10) Note that the article claims to be one of the only studies to look at function across trophic levels, but there are several others out there, for example:

      Thanks, we now cite some of these studies (Li et al 2020, Moi et al. 2021, Seibold et al. 2018).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Introduction:

      The introduction of the manuscript is generally well-structured, and the scientific questions are clearly presented. However, in each paragraph where specific aspects are introduced, the authors do not focus sufficiently on the given points. The current introduction discusses the weaknesses of previous studies extensively but lacks detailed explanations of mechanisms and a clear anticipation of this study's contributions.

      For example:

      L72-77: The authors mention that "genetic diversity may functionally compensate for a species loss," but this point is not highly relevant to the main analyses of this study, which focus on comparing the relative effects of species diversity and genetic diversity.

      Yes true, we understand the point made by the reviewers. We deleted this part of the sentence.

      L87-95: As previously noted, "whether environmental variation decreases or enhances the relative influence of genetic and species diversity on ecosystem functions" was not addressed in this study. Additionally, the last sentence seems unnecessary here, as it does not relate to "environmental variation." The phrase "generate insightful knowledge for future mechanistic models" is vague. It would be helpful to specify what kind of knowledge and what types of future mechanistic models are being referred to.

      We modified these two sentences. We now posit the prediction that what has been observed under controlled conditions (that genetic and species diversity have effects of similar magnitude) might not be the norm under fluctuating environments (because it has been shown that environmental variation modulates the strength of interspecific BEFs and creates huge variance).

      L96-116: The use of "for instance" three times in this paragraph makes the structure seem scattered, as only examples are provided. Improving the transition words can help the text focus better on the main point.

      We have modified some parts of this section to better reflect predictions.

      L115-116: Again, it would be beneficial to specify what kind of insightful information can be provided.

      We have modified this sentence by making more explicit some of the information that may be gained.

      L117-134: Stating clear expectations can help the introduction focus on the mechanisms and assist readers in following the results.

      We now provide some predictions. We were reluctant to make predictions in the first version of the MS, as we had the feeling that predictions can go in very different directions depending on how we set the scene. We therefore stick to the predictions that we think are the most logical (the simplest ones). This illustrates the lack of theoretical papers on these issues.

      Methods:

      L287-293: The method for estimating the standard effect size is unclear. I assume it was derived from the SEM models? This needs further clarification.

      Yes, it is derived from the standardized estimate from each pSEM. This is now explained in the MS.

      Results:

      As mentioned in the public review, it is very important to show the results of analyzing raw data.

      Done, see Figure 3 and Results section.

      Table 1: The font and format of the PCA table are different from other tables and appear vague, resembling a picture rather than a table.

      Changed.

      Table 2 (and supplementary table): "D.f." is not explained in the table legend. Is 1 the numerator df and 30 the denominator df? Is the denominator the residual? Additionally, the table legend mentions "magnitude and direction." ANOVA only tests if the biodiversity effects are significantly different between species or genetic diversity, but not the magnitude. For example, -0.5 and 0.5 are very different, but their effect magnitudes are the same.

      This is a mistake; sorry, the format of the Table was from a previous version of the MS in which we used linear models rather than linear mixed models (both lead to the same results). The ANOVA used to test the significance of fixed terms in linear mixed models is based on Wald chi-square tests, so both tables should have read “Chi-value” rather than “F-value”, and the only degree of freedom in this test is the numerator one. This has been changed. We have changed the caption of the Table (“ANOVA table for the linear mixed model testing whether the relationships between biodiversity and ecosystem functions measured in a riverine trophic chain differ between the biodiversity facets (species or genetic diversity) and the types of BEF (within- or between-trophic levels)”).
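
      As an illustration of why a Wald chi-square test reports a single degree of freedom, the following is a minimal sketch for one fixed-effect coefficient. It is not the authors' analysis code: the function name `wald_chi2_1df` and the numeric estimate and standard error are hypothetical, chosen only to show the arithmetic.

```python
import math


def wald_chi2_1df(estimate: float, std_error: float) -> tuple[float, float]:
    """Wald chi-square statistic and p-value for one coefficient (1 df)."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    # The Wald statistic is the squared ratio of the estimate to its standard error.
    chi2 = (estimate / std_error) ** 2
    # With 1 df, the chi-square upper tail equals the two-sided normal tail:
    # P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical coefficient and standard error, for illustration only.
chi2, p = wald_chi2_1df(estimate=0.30, std_error=0.12)
print(f"Chi-value = {chi2:.2f}, df = 1, p = {p:.4f}")
```

      Unlike an F-test, this statistic has no denominator (residual) degrees of freedom, which is consistent with reporting only one df in the table.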

      Minor:

      There should always be a space between a number and a unit. In the manuscript, spaces are inconsistently used between numbers and units.

      Corrected

      Reviewer #2 (Recommendations for the authors):

      (1) In the introduction, the authors could focus more and build out what they predicted/hypothesized as well as what has been found in the manipulated experiments that examined the role of species and genetic diversity. That would enhance the background information for a more general audience, and highlight expected results and why.

      We modified the Introduction according to comments made by reviewer 1 and clarified the predictions as best as we can.

      (2) Similarly, the discussion is fairly big picture, but this dataset focused exclusively on this 3-trophic interaction in a riverine system. It could be beneficial to dig into the ecology to find out why the opposite effects of species and genetic diversity are seen within trophic levels in this system.

      We have added some explanations based on the specific pSEM (see our responses to the public reviews for details). But as said in the responses to the public reviews, even with more detailed models, it is hard to tease apart mechanisms. One important point is that genetic and species diversity do not correlate with each other (they do not co-vary over space), which means the effect of one facet is independent of the other. However, apart from that, we cannot really tell more without more mechanistic approaches. We understand this is frustrating, but this is the nature of field-based data. This does not mean they are useless. On the contrary, they confirm and expand patterns found under controlled conditions (which for ecologists is quite important, as nature is our playground), but they are limited in inferences of mechanisms.

      (3) It would also be informative if the authors specified what positive and negative Z scores mean. It seems counterintuitive in Figure 3. For example, in the upper left, it's denoted as a larger intraspecific effect - which I'd assume is higher genetic (within species) diversity - but is this not where species diversity effects are higher? In theory this figure could be similar to Figure 1 from Des Roches et al. 2018 - where showing the 1:1 line of where species and genetic diversity effects are similar and then how some are more impacted by SD or GD as that links to the overall question, right?

      For example: Figure 3 makes it seem that GD effects are stronger (more positive) for within trophic responses (which is reflected in the text), but in that quadrant, it states that the interspecific effect is larger?

      Yes, you are right: Figure 3 (now Figure 4) is not ideal. We added an explicit explanation for interpreting Zr in the main text. In addition, we modified the text in the quadrant, as it was not correct. Note that the figure cannot be directly compared to that of Des Roches et al. In Des Roches et al., there is a single effect size (ES) per situation (which is roughly expressed as “ES = effect of species - effect of genotypes”). Here, there are two ES per situation, one for the species effect, the other for the genetic effect, which makes the biplot more complex (as species and genetic effects can be similar in magnitude but opposite in direction, e.g., 0.5 and -0.5). We could have done as Des Roches et al. (“ES = effect of species - effect of genotypes”), but as we do not have absolute ES (as in Des Roches et al.), the resulting signs of the ES would be nonsensical. It was not easy for us to find a clever solution (or, said differently, we were not clever enough to find an easy solution). Nonetheless, we tried another visualization by including “sub-quadrants” within the four main quadrants. We hope this will be clearer.
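
      The point about magnitude versus direction can be sketched numerically. The snippet below is a hedged illustration, not the authors' code: it assumes that Zr denotes Fisher's z-transformed standardized coefficient, and the function names and the input values 0.5 and -0.5 are hypothetical.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's z-transform of a correlation-like coefficient in (-1, 1)."""
    return math.atanh(r)


def compare_effects(species_r: float, genetic_r: float) -> dict:
    """Contrast a species-diversity effect with a genetic-diversity effect."""
    zs, zg = fisher_z(species_r), fisher_z(genetic_r)
    return {
        "species_zr": zs,
        "genetic_zr": zg,
        # Do the two effects point in the same direction?
        "same_direction": (zs >= 0) == (zg >= 0),
        # Difference in absolute strength, ignoring the sign.
        "magnitude_difference": abs(zs) - abs(zg),
        # Single signed contrast ("species minus genetic"), which mixes
        # direction and strength when the effects themselves are signed.
        "signed_difference": zs - zg,
    }


# Two effects of equal strength but opposite direction (illustrative values).
result = compare_effects(0.5, -0.5)
for key, value in result.items():
    print(key, value)
```

      With signed effect sizes, a single "species minus genetic" contrast cannot distinguish a difference in strength from a difference in direction, which is why keeping two effect sizes per situation leads to a biplot rather than a single axis.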

      (4) It's unclear why authors included both a simplified linear mixed model with diversity type and biodiversity facet as fixed factors, and then a second linear model that included trophic level (with those other 2 factors and interactions), but only showed results of trophic level from that more complex model. It is unclear why they include two models when the more complex one would have evaluated all aspects of their research question and shown the same patterns.

      You are right, the more complex model evaluates both aspects. Nonetheless, as the hypotheses were strictly separated, we thought it simpler to associate one model with one hypothesis. We agree that this duplicates information, but we would like to keep the two models to make the text more gradual.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      - The manuscript needs comprehensive proofreading for language and formatting. In many instances, spaces are missing or not required.

      Thank you for your comments. The manuscript has been thoroughly proofread for errors in language and formatting.

      - Could the authors explore correlation network analyses to get additional insights into the structure of different clusters? 

      We have added a co-occurrence analysis (at species taxonomic level) based on SparCC to the manuscript (Figure 2).

      This is described on Page 9, lines 141-148.
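
      For readers unfamiliar with compositional co-occurrence analysis, the sketch below shows a deliberately simplified stand-in: a centered log-ratio (CLR) transform followed by Pearson correlation between taxa. This is not SparCC and not the authors' pipeline; SparCC uses an iterative log-ratio variance approximation. The counts, the pseudocount and all function names are hypothetical.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an assumption) avoids taking log(0) for absent taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cooccurrence(samples):
    """Taxon-by-taxon correlation matrix from per-sample count lists."""
    transformed = [clr(s) for s in samples]
    n_taxa = len(samples[0])
    columns = [[row[j] for row in transformed] for j in range(n_taxa)]
    return [[pearson(columns[i], columns[j]) for j in range(n_taxa)]
            for i in range(n_taxa)]


# Hypothetical read counts: 4 samples (rows) x 3 taxa (columns).
samples = [[120, 30, 5], [80, 60, 10], [200, 20, 2], [50, 90, 40]]
for row in cooccurrence(samples):
    print([round(v, 2) for v in row])
```

      The log-ratio step matters because relative-abundance data are compositional, so correlations computed directly on proportions can be spurious.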

      - The GitHub link is not correct. 

      The GitHub repository has now been made public.

      - It is not possible to access the dataset on ENA. 

      We have changed the ENA study PRJEB57401 status to open.

      - Add the graphs obtained with decontam analysis as a supplementary figure. 

      We have added the outputs of decontam (.csv files with feature lists of ASVs that were filtered based on the prevalence and frequency tests) to the GitHub repository.

      - There is nothing about the RPL group in the results section, while the authors discuss this issue in the introduction. What about the controls with proven fertility? 

      Thank you. We have amended the manuscript to compare characteristics between the RPL, unexplained subfertility and control groups.

      Line 1279-130 page 8:  

      “The study group represented 85% of samples with high sperm DNA fragmentation, 85% of samples with elevated ROS and 79% of samples with oligospermia. Rates of abnormal seminal parameters including low sperm concentration, reduced progressive motility and ROS concentrations were found to be highest in the MFI group (Supplementary Figure 1). Baseline characteristics between the RPL, unexplained subfertility and control groups were similar.”

      Line 150-154 Page 9: 

      “Bacterial richness, diversity and load were similar between all patient groups examined in the study (Supplementary Figure 4).”

      - While correctly stated in the title, the term microbiota should be used throughout the manuscript instead of "microbiome" 

      Thank you. This misnomer has been amended throughout the manuscript.

      Minor corrections:

      Line 25: provoke is not a good term here. 

      Thank you. The term ‘provoke’ has been removed

      Line 26: why does semen culture have a limited scope? 

      Thank you. Line 40-41 Page 3 has been amended:

      “It is therefore plausible that asymptomatic seminal infections may be associated with impaired reproductive function in some men. Since semen culture has a limited scope for studying the seminal microbiota due to its inability to identify all microbiota present, next-generation sequencing (NGS) approaches have been reported recently by a growing number of investigators (13, 14, 15, 16, 17, 18, 19)”.

      Line 68: write μl correctly

      Thank you. This has been corrected

      Line 131: several organisms at the genus level. 

      Thank you. This has been corrected

      Line 136: what are the relative abundances of these genera? Is this relevant? 

      The mean relative abundances for the key taxa mentioned in each cluster are all above 20%. This information has been added to the manuscript text on page 9, line 153.

      Line 173: Molina et al. 

      Thank you. This has been corrected

      Line 173: the contaminations are referred to the low biomass nature of testicular samples. If present, bacteria of accessory gland secretions are an integral part of the seminal microbiota itself. Please review these sentences. 

      Thank you. This has been reworked to highlight the importance of urethral contamination, which you later allude to as a limitation of our study, namely the lack of paired urine and semen samples.

      Page 11 line 194-196

      “Molina et al report that 50%-70% of detected bacterial reads may be environmental contaminants in a sample from extracted testicular spermatozoa (35); with the addition of passage along the urethra it is likely that contamination of ejaculated semen would be much higher.”

      Table 1: remove results interpretation from table caption. 

      Thank you, this has been acted upon.

      Table 1: why in some cases, like in DNA fragmentation index, the total is not equal to n=223? 

      This is due to missing data/ analysis not possible for some men due to the requirement of a minimum number of sperm in the ejaculate to perform DNA fragmentation testing.

      Table 1: "frag" is not defined. 

      Thank you, this has been amended

      Tables 2, 3 & 4: bacterial genera in italics. 

      Thank you, this has been amended

      Figure 1A: add the fertility status information above the cluster colors. 

      Thank you, this has been amended in Figure 1.

      Figure 1C: the color code is confusing. Use different colors for each cluster. 

      Figure 1 legend: bacterial genera in italics. 

      Figures 1 & 2: the authors should use similar chart formatting in the two tables. 

      Thank you, this has been amended

      Reviewer 2:

      (1) The patient groups have different diagnoses and should be handled as different groups, and not fused into one 'patient' group in analyses. Why are the data in tables presented as controls and cases? I would consider men from couples with recurrent pregnancy loss, unexplained infertility, and male factor infertility to have different seminal parameters (not to fuse them into one group). This means, that the statistical analyses should be performed considering each group separately, and not to fuse 3 different infertility diagnoses into one patient group. 

      We have conducted the detailed analyses requested by the reviewer, comparing seminal DNA, ROS and microbiota characteristics between the individual patient groups (Supplemental Figures 1 and 4). No specific taxa (at either genus or species level) were found to differ in relative abundance between the diagnostic groups. However, we expect associations between parameters such as reactive oxygen species or DNA fragmentation and the relative abundance of bacterial species to be general, and not restricted to or specific to each diagnostic group. Therefore, we also conducted further analyses aggregating data from all patient groups to investigate relationships common to these different forms of male reproductive dysfunction.

      (2) Were any covariables included in the statistical analyses, e.g. age, BMI, smoking, time of sexual abstinence, etc? 

      Covariates were not included in the statistical analyses. This has been added in the manuscript to the limitations.

      Page 14 line 267-268

      “Additionally, we did not have other covariables, such as smoking status, to include in further analyses”.

      (3) Furthermore, it is known that 16S rRNA gene analysis does not provide sensitive enough detection of bacteria on the species level. How much do the authors trust their results on the species level? 

      The limitations of taxonomic assignment using 16S rRNA gene metataxonomics are well documented. However, the capacity to assign sequence amplicons at species level depends on the sequence variability of the 16S rRNA gene for each of the taxa reported and the specific gene region chosen. In this study, amplification of the V1-V2 region was performed using a mixed 28f primer set (see methods for details) that enables resolution and assignment of several bacterial species highly relevant to the reproductive tract including Lactobacillus spp., such as L. crispatus and L. iners, (e.g. https://doi.org/10.3389/fcell.2021.641921, https://doi.org/10.1128/msystems.01039-23, https://doi.org/10.1186/s12915-023-01702-2). In this study, we report the presence of L. iners, but not L. crispatus in semen samples, and we have also identified a specific association/co-occurrence between Gardnerella vaginalis and Lactobacillus iners, similar to that observed in vaginal bacterial communities.

      (4) Were the analyses of bacterial genera and species abundances with seminal quality parameters controlled for diagnosis and other confounders? 

      As stated in point 2, no adjustment was made for co-variates. No differences in microbiome composition were observed among the three diagnostic groups, so no adjustments were made to our analysis.

      (5) The authors stress that their study is the biggest on the microbiome in semen. However, when considering that the study consists of 4 groups (with n=46-63), it does not stand out from previous studies. 

      Our study is overall the largest investigating interactions between the seminal microbiome and male reproductive dysfunction. Other studies have included greater numbers of men with infertility.

      (6) Weaknesses: There is a lack of paired seminal/urinal samples. 

      Thank you. This limitation has been added.

      Page 14 line 266-267

      “A further limitation of this study, and others, is the lack of reciprocal genital tract microbiota testing of the female partners, or paired seminal and urinary samples from male participants”.

      Recommendation for authors to consider:

      Including previous classical reviews in the introduction: DOI: 10.1097/MOU.0000000000000742; DOI: 10.1038/s41585-019-0250-y 

      Thank you. This has been added.

      Mentioning in the M&M section that there is a supplementary text with a more detailed M&M part. 

      Thank you. This has been added. Further methodological detail can be found in supplementary text.

      Revising the use of 'microbiota' and 'microbiome', they are not synonyms. When talking of 16S rRNA gene analysis, we consider 'microbiome' analysis. 

      Thank you. This misnomer has been amended throughout the manuscript.

      Revising the text, there are several erratas (e.g. verb missing, etc). 

      Thank you for your comments. The manuscript has been thoroughly proofread for errors in language and formatting.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary: 

      In the manuscript entitled "Magnesium modulates phospholipid metabolism to promote bacterial phenotypic resistance to antibiotics", Li et al demonstrated the role of magnesium in promoting phenotypic resistance in V. alginolyticus. Using standard microbiological and metabolomic techniques, the authors have shown the significance of fatty acid biosynthesis pathway behind the resistance mechanism. This study is significant as it sheds light on the role of an exogenous factor in altering membrane composition, polarization, and fluidity which ultimately leads to antimicrobial resistance. 

      Strengths: 

      (1) The experiments were carried out methodically and logically. 

      (2) An adequate number of replicates were used for the experiments. 

      Weaknesses: 

      (1) The introduction section needs to be more informative and to the point.  

      Thank you so much for your suggestion. We have revised the introduction to make it more informative and to the point, as follows:

      “Non-inheritable antibiotic or phenotypic resistance represents a serious challenge for treating bacterial infections. Phenotypic resistance does not involve genetic mutations and is transient, allowing bacteria to resume normal growth. Biofilm and bacterial persisters are two phenotypic resistance types that have been extensively studied (Brandis et al., 2023; Corona & Martinez, 2013). Biofilms have complex structures, containing elements that impede antibiotic diffusion, sequestering and inhibiting their activity (Ciofu et al., 2022). Biofilm-forming bacteria and persisters also have distinct metabolic states that significantly reduce their antibiotic susceptibility (Yan & Bassler, 2019). These two types of phenotypic resistance share a common feature: retarded or even arrested growth in the presence of antibiotics (Corona & Martinez, 2013). However, specific factors that promote phenotypic resistance and allow bacteria to proliferate in the presence of antibiotics remain poorly defined.

      Metal ions have a diverse impact on the chemical, physical, and physiological processes of antibiotic resistance  (Booth et al, 2011; Lu et al, 2020; Poole, 2017). This includes genetic elements that confer resistance to metals and antibiotics (Poole, 2017) and metal cations that directly hinder (or enhance) the activity of specific antibiotic drugs (Zhang et al., 2014). The metabolic environment can also impact the sensitivity of bacteria to antibiotics (Jiang et al., 2023; Lee & Collins, 2012; Peng et al., 2015; Zhang et al., 2020; Zhao et al., 2021). Light metal ions, such as magnesium, sodium, and potassium, can behave as cofactors for different enzymes (Du et al., 2016) and influence drug efficacy. Heavy metal ions, including Cu2+ and Zn2+, confer resistance to antibiotics (Yazdankhah et al., 2014; Zhang et al., 2018). Recent reports suggest that sodium negatively regulates redox states to promote the antibiotic resistance of Vibrio alginolyticus (Yang et al., 2018), while actively growing Bacillus subtilis cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al, 2019). In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (2) The weakest point of this paper is in the logistics through the results section. The way authors represented the figures and interpreted them in the results section (or the figure legends) does not match. The figures are difficult to interpret and are not at all self-explanatory. 

      Thank you so much for your suggestion. We have checked that the results text matches the figures and revised both accordingly.

      (3) There are too many mislabeling of the figure panels in the main text which makes it difficult to find out which figures the authors are explaining. There should be more explanation on why and how they did the experiments and how the results were interpreted. 

      Thank you so much for your suggestion. We have checked the figures and main text to ensure that we make every figure clearly stated.  

      Reviewer #2 (Public Review): 

      Summary: 

      In this study, the authors aimed to identify if and how magnesium affects the ability of two particular bacteria species to resist the action of antibiotics. In my view, the authors succeeded in their goals and presented a compelling study that will have important implications for the antibiotic resistance research community. Since metals like magnesium are present in all lab media compositions and are present in the host, the data presented in this study certainly will inspire additional research by the community. These could include research into whether other types of metals also induce multi-drug resistance, whether this phenomenon can be observed in other bacterial species, especially pathogenic species that cause clinical disease, and whether the underlying molecular determinants (i.e. enzymes) of metal-induced phenotypic resistance could be new antimicrobial drug targets themselves. 

      Strengths: 

      This study's strengths include that the authors used a variety of methodologies, all of which point to a clear effect of exogenous Mg2+ on drug resistance in the targeted species. I also commend the authors for carrying out a comprehensive study, spanning evaluation of whole cell phenotypes, metabolic pathways, genetic manipulation, to enzyme activity level evaluation. The fact that the authors uncovered a molecular mechanism underlying Mg2+-induced phenotypic resistance is particularly important as the key proteins should be studied further.

      Weaknesses: 

      I believe there are weaknesses in the manuscript, however. The authors take for granted that the reader is familiar with all the assays utilized, and do not properly explain some experiments, and thus I highly suggest that the authors add a brief statement in each situation describing the rationale for each selected methodology (more details are in the private review to the authors). The Results section is also quite long and bogs down at times, and I suggest that the authors reduce its length by 10 to 20%. In contrast, the Introduction is sparse and lacks key aspects, for example, there should be mention of the study's main purpose and approaches, plus an introduction to the authors' choice of species and their known drug resistance properties, as well as the drug of choice (balofloxacin). Another notable weakness is that the authors evaluated Mg2+-induced phenotypic resistance only against two closely related species, and thus the generalizability of this mechanism of drug resistance is not known. The paper would be strengthened if the authors could demonstrate this type of phenotypic resistance in at least one more Gram-negative species and at least one Gram-positive species (antimicrobial susceptibility evaluations would suffice), each of which should be pathogenic to humans. Demonstrating magnesium-induced phenotypic drug resistance in the WHO Priority Bacterial Pathogens would be particularly important. 

      In general, the conclusions drawn by the authors are justified by the data, except for the interpretation of some experiments. Importantly, this paper has discovered new antimicrobial resistance mechanisms and has also pointed to potential new targets for antimicrobials. 

      Thank you so much for your suggestion! We followed your suggestions to revise the manuscript as follows:

      (1) We added a brief statement in the situation to explain the result and methodology according to your suggestion in the private review.

      (2) To make the flow of the story more logical, we moved the entire second result to the supplementary text and a supplementary figure. 

      (3) We revised the introduction by adding information to make it more informative and to the point, as follows:

      “Non-inheritable antibiotic or phenotypic resistance represents a serious challenge for treating bacterial infections. Phenotypic resistance does not involve genetic mutations and is transient, allowing bacteria to resume normal growth. Biofilm and bacterial persisters are two phenotypic resistance types that have been extensively studied (Brandis et al., 2023; Corona & Martinez, 2013). Biofilms have complex structures, containing elements that impede antibiotic diffusion, sequestering and inhibiting their activity (Ciofu et al., 2022). Biofilm-forming bacteria and persisters also have distinct metabolic states that significantly reduce their antibiotic susceptibility (Yan & Bassler, 2019). These two types of phenotypic resistance share a common feature: retarded or even arrested growth in the presence of antibiotics (Corona & Martinez, 2013). However, specific factors that promote phenotypic resistance and allow bacteria to proliferate in the presence of antibiotics remain poorly defined.

      Metal ions have a diverse impact on the chemical, physical, and physiological processes of antibiotic resistance  (Booth et al, 2011; Lu et al, 2020; Poole, 2017). This includes genetic elements that confer resistance to metals and antibiotics (Poole, 2017) and metal cations that directly hinder (or enhance) the activity of specific antibiotic drugs (Zhang et al., 2014). The metabolic environment can also impact the sensitivity of bacteria to antibiotics (Jiang et al., 2023; Lee & Collins, 2012; Peng et al., 2015; Zhang et al., 2020; Zhao et al., 2021). Light metal ions, such as magnesium, sodium, and potassium, can behave as cofactors for different enzymes (Du et al., 2016) and influence drug efficacy. Heavy metal ions, including Cu2+ and Zn2+, confer resistance to antibiotics (Yazdankhah et al., 2014; Zhang et al., 2018). Recent reports suggest that sodium negatively regulates redox states to promote the antibiotic resistance of Vibrio alginolyticus (Yang et al., 2018), while actively growing Bacillus subtilis cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al, 2019). In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (4) We examined the effect of magnesium in WHO-listed priority strains, which confirmed the results, as follows:

      “Importantly, exogenous MgCl2 also increased MICs of clinical isolates, carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa and carbapenem-resistant Acinetobacter baumannii to balofloxacin (Fig 1G).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) There are many grammatical mistakes to point out. The manuscript needs proofreading and editing.

      We appreciate this comment! The manuscript has been revised by a native speaker.

      (2) The introduction could be more informative. A little more description of magnesium - such as what it does to antibiotics and how it's known to affect the microbiome - might be helpful for the general readers. The question remains why out of all the metal ions that might affect antibiotic resistance (many of them are less explored), authors particularly decided to work on the effect of magnesium. The introduction should cover the rationale of their hypothesis. Also, the authors might want to briefly talk about the model organisms (V. algonolyticus and V. parahemolyticus) describing how threatening they are and how they are becoming resistant to antibiotics. 

      We appreciate this comment! We revised the introduction by providing additional information, as follows:

      “In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (3) Figure 1C is mislabeled as 1B (line 100). Line 101: The sentence is not clear and very confusing. What is meant by 15.6mM - 62.4 mM? Are they talking about the concentration of BLFX (though in the figure the concentration was shown in µg)? Please rewrite the sentence in a simplified way. Also, the zone of inhibition was decreased with increasing MgCl2, not increased. 

      We appreciate this comment! These points have been revised: Fig 1B has been corrected to Fig. 1C, and the sentence at Line 101 (now Line 122) has been revised as follows:

      “At balofloxacin doses of 1.56, 3.125, 6.25, and 12.5 µg, the zone of inhibition decreased with increasing MgCl2 (Fig 1D)”

      (4) In the western blot images, it would be nice to indicate the MW of the protein bands shown. The loading control used for the experiments should be clearly mentioned in the figure legends. 

      We appreciate this comment! The MWs are indicated in the western-blot image throughout the manuscript. 

      The loading control is clearly stated in the figure legend, as follows:

      “Whole cell lysates resolved by SDS-PAGE were stained with Coomassie brilliant blue as loading control.”

      (5) Figures 2 B and C: the figure legend does not explain what the authors wanted to show. It's not clear how they plotted the inhibitory curve, or the binding efficacy. These panels need an explanation of how the analysis was done.

      We appreciate this comment! Figure 2 has now been moved to Suppl. Fig 2, and its description has been moved to the Suppl. Text. We revised the description of the result in the Suppl. Text as follows:

      “Prior studies suggest that the chelation of antibiotics by magnesium ions inhibits antibiotic uptake (Deitchman et al., 2018; Lunestad and Goksøyr, 1990). To investigate whether magnesium binds to balofloxacin, balofloxacin was pre-incubated with magnesium, and zone of inhibition (ZOI) analysis was conducted. Six different concentrations of balofloxacin were separately incubated with six different concentrations of MgCl2, and then spotted on filter paper so that a defined amount of balofloxacin could be used for ZOI. While lower concentrations of MgCl2, (0.78, 3.125, or 12.5 mM) did not alter the ZOI, higher concentrations, including 50 and 200 mM MgCl2, decreased the ZOI (Suppl. Fig 2A), suggesting that even high doses of magnesium had only a partial effect on balofloxacin through direct binding. For example, at 200 mM MgCl2 and 5 or 10 μg/mL balofloxacin, the balofloxacin ZOI was 53.2 and 70.3% of the ZOI at 0 mM MgCl2, suggesting that  50% of the antibiotics were still functional. Intracellular BLFX also decreased with increasing MgCl2 (Suppl. Fig 2B), while exogenous Mg2+ increased intracellular Mg2+ levels in a dose-dependent manner. For example, exogenous 50 and 200 mM MgCl2 increased intracellular Mg2+ levels to 1.21 and 1.31 mM, respectively (Suppl. Fig 2C). The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E), however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”
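
      The percentages quoted above express the zone of inhibition (ZOI) at a given MgCl2 concentration relative to the ZOI at 0 mM MgCl2. A minimal sketch of that arithmetic follows; it assumes the ZOI is recorded as a diameter in millimetres, and the diameters are hypothetical values chosen only so that the ratios reproduce the reported 53.2% and 70.3%.

```python
def relative_zoi(zoi_treated_mm: float, zoi_control_mm: float) -> float:
    """ZOI with MgCl2 expressed as a percentage of the ZOI without MgCl2."""
    if zoi_control_mm <= 0:
        raise ValueError("control ZOI must be positive")
    return 100.0 * zoi_treated_mm / zoi_control_mm


# Hypothetical diameters (mm); only the resulting ratios come from the text.
control_mm = 20.0
treated_mm = {"5 ug/mL balofloxacin": 10.64, "10 ug/mL balofloxacin": 14.06}

for label, diameter in treated_mm.items():
    pct = relative_zoi(diameter, control_mm)
    print(f"{label}: {pct:.1f}% of the 0 mM MgCl2 ZOI")
```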

      (6) For the metabolomics results, it will help immensely if the authors provide a volcano plot of the identified metabolites and plot the heat map according to the -log2 metabolite intensities. In Figure 3A, it's not clear what information is conveyed through Euclidean distance calculations of the heat map. In Figure 3 B, the authors mentioned that the OPLS-DA test was conducted, although the figure shows a PCA plot, so it's not clear how these two are connected. Figure 3 E: the figure legend says scattered plot, but the panel represents color-coded numerical values, not a scattered plot. Also, it's not clear how they got those values. 

      We appreciate this comment! We agree that the differential metabolites could be shown as a volcano plot. However, we did not adopt a volcano plot in this study because this is a magnesium concentration-dependent metabolome comprising six parallel groups, and volcano plots would give a complicated view of the comparisons among the groups. We also tried plotting the heat map according to the -log2 metabolite intensities. Although this analysis clustered the 200 mM and 50 mM groups better, the data for the low magnesium concentrations were not consistent, which may be due to the minor metabolic changes at low magnesium concentrations. Thank you for your understanding.

      For the Euclidean distance calculations, we explain them in the figure legend as follows:

      “Euclidean distance calculations were used to generate a heatmap that shows clustering of the biological and technical replicates of each treatment.” 
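
      As an illustration of the distance measure named in this legend, the following is a minimal sketch of pairwise Euclidean distances between replicate profiles. The sample names and intensity values are hypothetical and are not taken from the study; the heatmap software the authors used and its clustering settings are not stated here, so this shows only the underlying distance, not their pipeline.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length metabolite profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of metabolites")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(profiles):
    """Return {(sample_i, sample_j): distance} for every ordered pair of samples."""
    names = list(profiles)
    return {(p, q): euclidean(profiles[p], profiles[q]) for p in names for q in names}


# Hypothetical intensities for three metabolites in three samples.
profiles = {
    "0mM_rep1": [1.0, 2.0, 3.0],
    "0mM_rep2": [1.0, 2.0, 4.0],
    "200mM_rep1": [4.0, 6.0, 3.0],
}
dist = distance_matrix(profiles)
```

      Replicates of the same treatment are expected to lie closer to each other than to other treatments, which is what places them side by side when a heatmap is clustered on this distance.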

      Figure 2B, which was Figure 3B in the previous version, has been replaced with the OPLS-DA analysis in the revised version.

      The legend of Figure 2E, which was Figure 3E in the previous version, has been revised as follows:

      “E. Areas of the peaks of palmitic acid and stearic acid generated by GC-MS analysis.” 

      (7) In Figure 4, the figure legends (as well as in the text) are not properly referred to. Please make sure to refer to the correct panel.

      We appreciate this comment! The figure legends have been corrected to match the panel and text. 

      Figure 4F: how was the synergy analysis done? In the methods section, the authors described the antibiotic bactericidal assay protocol, but there was no clear indication of how they generated the isobologram. 

      We appreciate this comment! We provide additional information in the Figure 3F legend, which was Figure 4F in the previous version, as follows:

      “Synergy analysis for BLFX with palmitic acid for V. alginolyticus. Synergy was assessed by comparing the dose needed for 50% inhibition of the synergistic agents (white) and non-synergistic (i.e., additive) agents (purple).”
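
      For readers unfamiliar with isobolograms, the comparison described in this legend can be sketched with the standard Loewe-additivity interaction index. This is an assumption for illustration only: the response does not state which calculation or software was used, and the doses, IC50 values, and the `tolerance` band below are hypothetical.

```python
def interaction_index(dose_a, dose_b, ic50_a, ic50_b):
    """Loewe-additivity index for a dose pair that gives 50% inhibition.

    dose_a, dose_b: doses of each agent in the combination.
    ic50_a, ic50_b: doses of each agent alone that give 50% inhibition.
    A value of 1 lies on the additive line of the isobologram.
    """
    if ic50_a <= 0 or ic50_b <= 0:
        raise ValueError("single-agent IC50 values must be positive")
    return dose_a / ic50_a + dose_b / ic50_b


def classify(index, tolerance=0.1):
    """Label a combination; `tolerance` is an arbitrary band around additivity."""
    if index < 1 - tolerance:
        return "synergy"
    if index > 1 + tolerance:
        return "antagonism"
    return "additive"


# Hypothetical example: each agent alone needs 4 and 8 units for 50% inhibition,
# while the combination reaches 50% inhibition at 1 and 2 units.
index = interaction_index(1.0, 2.0, ic50_a=4.0, ic50_b=8.0)
label = classify(index)
```

      On an isobologram, a combination point below the straight line joining the two single-agent IC50 values corresponds to an index below 1, and a point above the line corresponds to an index above 1.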

      (8) Figure 5 A: the scatter plot is plotted according to the area along the Y axis: which "area" is represented here? There is absolutely no explanation, neither in the results nor in the figure legends. Using box plots might be a better option than using a scattered plot.

      We appreciate this comment! “Area” has been defined in the revised manuscript as follows:

      “The area indicates the area of the peak of the metabolite in total ion chromatography of GC-MS.” 

      (9) In Figure 6 A, the heat map is plotted according to the column Z scores. What is meant by "column Z score"? The corresponding figure legend says, "heat map showing differential abundance of lipid". Z scores do not represent an abundance of a variable, so the conclusion might not be appropriate here. 

      We appreciate this comment! In Figure 5A, which was Figure 6A in the previous version, the column Z score reflects the abundance of the metabolites analyzed; it is generated automatically during the heat map analysis to give an indication of the metabolites tested. The legend has been revised as follows:

      “Heatmap showing changes in differential lipid levels at the indicated concentration of MgCl2.”  
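
      To make the term concrete, the following is a minimal sketch of how a column Z score is commonly computed for a heatmap: each column is centered on its mean and divided by its sample standard deviation. The orientation (rows as samples, columns as lipids) and the intensity values are assumptions for illustration; the authors' software may scale differently.

```python
import statistics


def column_z_scores(matrix):
    """Standardize each column of a row-major matrix to mean 0 and SD 1.

    Uses the sample standard deviation; every column needs at least two
    values and must not be constant.
    """
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [
        [(value - mean) / sd for value, mean, sd in zip(row, means, sds)]
        for row in matrix
    ]


# Hypothetical intensities: rows are samples, columns are two lipids whose
# absolute abundances differ tenfold.
intensities = [
    [10.0, 100.0],
    [20.0, 200.0],
    [30.0, 300.0],
]
z = column_z_scores(intensities)
```

      In this example both columns yield the same Z scores even though their absolute abundances differ tenfold, so a Z score conveys change relative to the column mean rather than abundance itself.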

      (10) Line 313-314: it should be Figure EV6C.  

      We appreciate this comment! The citation has been corrected.

      (11) The authors have shown that Mg+2 does not alter the LPS transport system, however, there was some significant increase in LPS expression at 200mM MgCl2. It would be interesting if the authors could also check if Mg+2 has any effect on the outer membrane protein (OMP) integrity (by checking OMP components BamA and LptD).  

      We appreciate this comment! We have carefully examined membrane permeability in Figure 7. We therefore did not perform additional experiments to examine changes in BamA and LptD. Thank you very much for your understanding.

      (12) I wonder if the authors could check the effect of extracellular Mg+2 during the co-treatment of palmitic acid, linoleic acid, and balofloxacin. Will there still be the antagonistic effect or the presence of Mg+2 could change the phenotype? 

      We appreciate this comment! An additional experiment was performed and is described as follows:

      “Furthermore, magnesium had a minimal effect on the antagonistic effect of palmitic acid, linolenic acid, and balofloxacin (Fig 4G), suggesting that this mineral functions through lipid metabolism.” 

      Reviewer #2 (Recommendations For The Authors)

      (1) As mentioned in the Public Review, I strongly believe that the impact of this study will be more significant if magnesium-induced phenotypic drug resistance could be demonstrated in at least one other Gram-negative and one other Gram-positive species, both of which should be human pathogens. The full suite of experiments would not be necessary for this suggestion; evaluation of the effect of Mg concentration in growth media on the drug resistance of other species, testing the different antibiotic types used in this study, would be sufficient. 

      We appreciate this comment! Additional experiments have been performed to test this idea. Mg2+ has a similar effect on carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa, and carbapenem-resistant Acinetobacter baumannii as on the Vibrio species, as shown in Figure 1G. These results are described as follows:

      “Importantly, exogenous MgCl2 also increased the MICs of clinical isolates, carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa and carbapenem-resistant Acinetobacter baumannii, to balofloxacin (Fig 1G).”

      (2) I recommend that the Introduction section be expanded. I recommend one or two sentences introducing the two Vibrio species selected for study. I.e. why did the authors choose these two species? What is known about their phenotypic drug resistance in the literature? Why did the authors select balofloxacin for their studies, is it a common antimicrobial used vs Vibrios? As well, the end of the Introduction section ends abruptly with no transition to the present study itself. The end of the introduction should include one or two sentences introducing the main purpose of the study, its approach, and the techniques undertaken. For example, "In this study, we evaluated whether magnesium induces phenotypic resistance in Vibrio species and the molecular/genetic basis for such resistance. We used genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility evaluations." 

      We appreciate this comment! We have revised the introduction by providing additional information as follows:

      “In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species. 

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations. ”

      (3) The authors introduce the acronym AWST but never use it again in the paper, instead they use SWT. The authors should introduce SWT only for consistency. 

      We appreciate this comment! We have changed all instances of “SWT” to “ASWT”.

      (4) Line 76 is not clear: what is meant by "some of which could influence drug efficacy" - the enzymes that utilize light metal ions are co-factors? Or the metals directly?  

      We appreciate this comment! The point we wanted to convey is that light metal ions can serve as cofactors that catalyze biochemical reactions. Such reactions could alter drug efficacy; e.g., Fe-S clusters are metallocofactors for proteins that regulate redox chemistry, including antibiotic-induced redox changes. However, this information is not appropriate for this manuscript, so we have deleted this sentence.

      (5) Line 90: add a reference corroborating that this chemical composition is a mimic of marine water. The NaCl concentration used in particular looks quite low. 

      We appreciate this comment! It was a typographical error. The NaCl concentration was 210 mM, as shown in Suppl. Table 1. We also provide details of the chemical composition of the marine water as follows:

      “Marine environments and agriculture, where antibiotics are commonly used, are rich in magnesium. To investigate whether this mineral impacts antibiotic activity, the minimal inhibitory concentration (MIC) of V. alginolyticus ATCC33787 and V. parahaemolyticus VP01, hereafter referred to as ATCC33787 and VP01, isolated from marine aquaculture, to balofloxacin (BLFX) in Luria-Bertani medium (LB medium) plus 3% NaCl as LBS medium and “artificial seawater” (ASWT) medium that included the major ion species in marine water (Wilson, 1975) (LB medium plus 210 mM NaCl, 35 mM Mg2SO4, 7 mM KCl, and 7 mM CaCl2) were assessed”

      (6) Line 98 and Figure 1B. M9 is indicated in the text but does not appear in the figure, the figure only shows SWT. This should be checked. Line 99: based on Figure 1C, the authors are adding MgCl2 to SWT, SWT should be mentioned in this line. Line 100: I believe this is referring to Figure 1C, which should be checked. 

      We appreciate this comment! 

      Line 98, which is now Line 118: We have corrected M9 to ASWT as follows:

      “However, the MIC for BLFX was higher in ASWT medium supplemented with Mg2SO4 or MgCl2 than in LB medium (Fig 1B).”

      Line 99, which is now Line 133: the sentence has been corrected as follows:

      “The MIC for BLFX increased at higher concentrations of MgCl2 in ASWT”

      Line 100, which is now Line 135: we have corrected Fig 1B to Fig. 1C.

      (7) Line 101: text and Figure 1D are not consistent, as Figure 1D does not show this level of precision in added MgCl2 as indicated in the text (15.6 - 62.4 mM).  

      We appreciate this comment! The sentence has been corrected as follows: “At balofloxacin doses of 1.56, 3.125, 6.25, and 12.5 µg, the zone of inhibition decreased with increasing MgCl2 (Fig 1D)”.

      (8) MgCl2 clearly induces increasing levels of BLFX resistance, and to high levels, but not for every antibiotic. For example, the level of increased resistance to β-lactams is low (ceftriaxone) and plateaus (ceftazidime). As well, resistance to gentamicin plateaus at a lower level than the other aminoglycosides. These observations do not take away from the conclusion that Mg induces multi-drug resistance, but since the behaviour of the MICs for these drugs is different than the other drugs, they should be mentioned. Also, Figure 1F - tetracyclines (plural) is used for vertical axis label - does this refer to the tetracycline itself or the class itself, and if the class, which one was tested? 

      We appreciate this comment! We have revised the description as follows: “Notably, magnesium had a smaller effect on ceftriaxone and gentamicin than on other antibiotics.”

      The “tetracyclines” label has been changed to “Oxytetracycline” in the revised manuscript.

      - The magnesium chelation experiments presented in Figure 2 are not clear. The authors should briefly mention how this was done around line 128, and what data underlies the values in Figure 2C. Figure 2B is also not clear to me at all. Similarly, how the authors measured intracellular balofloxacin and Mg2+ is not clear and should be mentioned briefly around lines 130-132. 

      We appreciate this comment! These have been rewritten as follows: “To investigate whether magnesium binds to balofloxacin, balofloxacin was pre-incubated with magnesium, and zone of inhibition (ZOI) analysis was conducted. Six different concentrations of balofloxacin were separately incubated with six different concentrations of MgCl2, and then spotted on filter paper so that a defined amount of balofloxacin could be used for ZOI. While lower concentrations of MgCl2 (0.78, 3.125, or 12.5 mM) did not alter the ZOI, higher concentrations, including 50 and 200 mM MgCl2, decreased the ZOI (Suppl. Fig 2A), suggesting that even high doses of magnesium had only a partial effect on balofloxacin through direct binding. For example, at 200 mM MgCl2 and 5 or 10 μg/mL balofloxacin, the balofloxacin ZOI was 53.2% and 70.3%, respectively, of the ZOI at 0 mM MgCl2, suggesting that  50% of the antibiotics were still functional. Intracellular BLFX also decreased with increasing MgCl2 (Suppl. Fig 2B), while exogenous Mg2+ increased intracellular Mg2+ levels in a dose-dependent manner. For example, exogenous 50 and 200 mM MgCl2 increased intracellular Mg2+ levels to 1.21 and 1.31 mM, respectively (Suppl. Fig 2C). The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E); however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      - Line 135: LPS cannot be "expressed", as the authors word it here. This should be corrected. Also, the inspection of Figure 2G actually shows the levels of LPS increase with increased Mg2+. The authors should re-evaluate these results and change their description around this area of the Results. 

      We appreciate this comment! We have moved the whole of Figure 2 to the Supplementary Text and Supplementary Figure 2. We have rewritten this part as follows: “The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E); however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      - Section: MgCl2 affects bacterial metabolism. Authors switched to M9 medium - why? This contrasts with other sections using SWT and should be explained. Also, I cannot evaluate whether the statistical analysis of the data here was performed correctly and was appropriate for this type of experiment. I advise the authors to move the details in lines 166-169 to the Materials and Methods and replace this section instead with a more accessible description of the statistical analysis that a non-expert would be able to appreciate. Furthermore, analysis of Figure 3A indicates that the levels of asparagine, 4-hydroxybutyric acid, uracil, cystathionine, fumaric acid, and aminoethanol have significantly changed at high MgCl2, but these are not mentioned in the text. I suggest the authors mention these if they are relevant to the 12 enriched pathways, especially the biosynthesis of fatty acids. 

      We appreciate this comment! 

      We have added the phrase “To better understand how magnesium affects bacterial metabolism” to explain why the M9 medium was used.

      The details in lines 166-169 have been moved to the Materials and Methods.

      We have carefully examined the abundance of the metabolites and the enriched pathways. Among the listed metabolites, only fumarate is within the enriched pathways. We mention this point in our revised manuscript as follows:

      “The increase in fatty acid biosynthesis could be partially explained by an imbalanced pyruvate cycle/TCA cycle, in which fumarate levels increased at higher Mg2+ while succinate levels increased at lower Mg2+ (Suppl. Fig 5B). These findings indicated that glycolysis fluxes into fatty acid biosynthesis rather than the pyruvate cycle/TCA cycle. The relevance of fatty acids and BLFX was demonstrated by the observation that exogenous palmitic acid increased bacterial resistance to balofloxacin (Fig 2F). These results suggest that fatty acid metabolism may be critical to magnesium-based phenotypic resistance.”

      - Line 211 appears to refer to Figure 4F and should be checked. Similarly in line 216 - appears this should be Figure 4H, and line 218 should be Figure 4H. Line 226: add a reference to Fig 4I (after arcA was decreased). Line 227: what are genes N646_1004 and N646_1885? Based on Fig 4J these are crp - authors should add to line 227. Line 228 appears to refer to Figure 4J, not Figure 4I. Line 229 - should be Figure 4K, not Figure 4I. Line 231 - should be 4L, not 4K. Line 239 - should be 4M.

      We appreciate this comment! The text and figures now match.

      - Line 312: the descriptions of "11 lipids, 32 lipids, and 53", and then "26 lipids, 52 lipids, and 107 lipids" are not clear at all and should be corrected. 

      We appreciate this comment! The sentence has been revised as follows:

      “The abundance of 11, 32, and 53 lipids was increased in 3.125, 50, and 200 mM MgCl2-treated bacteria, respectively, while the abundance of 26, 52, and 107 lipids was decreased in 3.125, 50, and 200 mM MgCl2-treated bacteria, respectively (Suppl. Fig 7C)”

      - Line 340. What is the assay the authors are using to measure the levels of the PGS and PSS enzymes? This is not mentioned or clear in this part of the Results.  

      We appreciate this comment! We provide the information in the manuscript as follows:

      “Levels of PGS and PSS were quantified by ELISA kits according to the manufacturer’s instructions (Shanghai Fusheng Industrial Co., Ltd., China)”

      - Line 372: What is the assay for measuring membrane depolarization? This is not mentioned and I suggest it should be. Line 374: Figure 7B does not show time dependence, only dose dependence, this should be corrected, it is assumed the authors are referring to Fig 7C for the time dependence data. 

      We appreciate this comment! We provide the information in the Results as follows:

      “The voltage-sensitive dye DiBAC4(3) showed that 12.5–200 mM MgCl2 promoted membrane depolarization in a dose-dependent manner (Fig 6A)”

      We also explain how DiBAC4(3) can be used to measure membrane depolarization in the Materials and Methods section as follows:

      “DiBAC4(3) is a voltage-sensitive probe that enters depolarized cells and binds intracellular proteins or membranes, exhibiting enhanced fluorescence and a red spectral shift.”

      To make the specific figure panels clear, we have revised the sentence as follows:

      “Meanwhile, MgCl2 had a dose-dependent (Fig 6B) and time-dependent (Fig 6C) effect on proton motive force (PMF).”

      - Line 384: mention how FM5-95 measures membrane permeability. The authors should also clarify how this reagent is used to measure membrane fluidity, and it is not clear if the data for this is presented in Figure 7 - please clarify. Regarding SYTO9 dye experiment: the authors should briefly explain the experimental design - how SYTO9 dye operates and why FACS was chosen. What is labeled with FITC?  

      We appreciate this comment! We clarify the reason we use FM5-95 in the Materials and Methods section as follows:

      “Measurement of fluidity by fluorescence microscopy

      Measurement of membrane fluidity was performed as previously described (Wen et al., 2022). Briefly, ATCC33787 cells were cultured in medium with the indicated concentrations of MgCl2, collected, and adjusted to an OD of 0.6. A 100 μL aliquot of bacterial cells from each sample was diluted to 1 mL, and 10 μL of FM5-95 (10 mg/mL; Thermo Fisher Scientific, USA) was added. FM5-95 is a lipophilic styryl dye that inserts into the outer leaflet of the bacterial membrane and becomes fluorescent. This dye preferentially binds to microdomains with high membrane fluidity (Wen et al., 2022). After incubation for 20 min at 30 ℃ with shaking in the dark, the sample was centrifuged for 10 min at 12,000 rpm. The pellets were resuspended in 20 μL of 3% NaCl. A 2 μL aliquot of each sample was dropped onto an agarose slide and photographed under an inverted fluorescence microscope.”

      These data are presented as micrographs in Fig. 6D, which shows decreased FM5-95 staining with increasing concentrations of MgCl2. We have made this description clear with the following revision:

      “FM5-95 staining decreased with increasing concentrations of Mg2+, and no staining was observed in the presence of 200 mM Mg2+ (Fig 6D).”

      We explain why we use SYTO9 as follows:

      “SYTO9, a green fluorescent dye that binds to nucleic acid, enters and stains bacteria cells when there is an increase in membrane permeability (Lehtinen et al., 2004; McGoverin et al., 2020). Staining decreased with increasing MgCl2, indicating that bacterial membrane permeability declined in an Mg2+ dose-dependent manner (Fig 6E).”

      We did not use FACS in this study; we only analyzed the fluorescence distribution with the instrument. To make this clear, we have revised the sentence as follows:

      “After incubation for 15 min at 30 ℃ with shaking in the dark, the mixtures were filtered and measured by flow cytometry (BD FACSCalibur, USA).”

      - Lines 391-397. The statement that palmitic acid shifts the peaks in Figure 7F is not supported by the data. There is essentially no change in the major peak position within each MgCl2 concentration set with increasing palmitic acid. For the linolenic acid data, it is clear that linolenic acid increases permeability only at 50 mM MgCl2; this should be mentioned in the text. 

      We appreciate this comment! We have revised the sentence as follows:

      “Exogenous palmitic acid also shifted the fluorescence signal peaks to the left in an MgCl2-dependent manner while palmitic acid only slightly shifted the peaks (Fig 6F). In contrast, exogenous linolenic acid shifted the peak to the right in a dose-dependent manner at 50 mM MgCl2 (Fig 6G).” 

      - Line 404-405 - as mentioned earlier, the assay for the uptake of BLFX should be mentioned (if it is done so earlier in the text, then it does not need to be here).

      We appreciate this comment! It has been mentioned in the introduction.  

      - Discussion: CpxA/R-OmprF pathway is mentioned here for the first time. Is this one of the pathways modified by MgCl2 as determined during the course of the study? If so, this should be reworded to mention that. If not, the relevance of this particular pathway as it relates to light metals and phenotypic resistance should be discussed.

      We appreciate this comment! Since it is not relevant to the discussion of Mg2+ and fatty acid biosynthesis, we have deleted this sentence in the revised manuscript.

      -The following grammatical errors should be corrected:

      -line 55 change to: "genetic mutations; instead, this type of resistance is transient, and bacteria resume normal growth"

      -line 57: change to "resistance types are biofilm" 

      -line 61: change to "states that significantly" 

      -line 63: change to "resistance share the common feature in they retard or even cease in the presence" 

      -line 65: change to "resistance that allow bacteria to proliferate" 

      -line 81: change "But whether" to "Whether" 

      -line 178: change to "may be critical to the Mg-based phenotypic resistance"

      -line 86: change to "Marine environments and agriculture are rich in magnesium, where..." 

      -line 93: change in to vs

      -line 154: insert space after metabolism 

      -line 158: change 'identified" to "focused on the levels of" 

      -line 160: change "The levels of forty-one metabolites" 

      -line 198: change shared to share 

      -line 310: increased is duplicated, delete one 

      -line 451: add "the" before ratio 

      -line 453: gram should be capitalized 

      -line 462: "the regulation" should be reworded to "More importantly, the effect of exogenous MgCl targets the..." 

      -line 469: add dash between Mg2+ and limited

      -line 478: change "the crucial" to "a crucial" 

      -there are numerous locations in the manuscript where the word "magnetism" is used when clearly the word is supposed to be magnesium - this should be corrected

      We appreciate this comment! These have been corrected or revised. 

      Editors comments:

      Page 2 line 27; Page 25 line number 426; page 27 line number 481: In the abstract and discussion, only Vibrio alginolyticus was mentioned, even though two Vibrio species were used in the study. It would be helpful to understand the rationale behind the focus on this particular species.

      We appreciate this comment! We have revised the introduction to provide additional information as follows:

      “Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. At a salinity of 3.5% seawater, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater, can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.”

      On Page 2, line 34: The abstract contains some undefined abbreviations, such as 'PE' and 'PG', which should be explained. 

      We appreciate this comment! We define PE and PG in the revised abstract as follows:

      “phosphatidylethanolamine (PE) biosynthesis is reduced and phosphatidylglycerol (PG)”

      On Page 2, line 31-32: For the statement "Exogenous supplementation of fatty acids confirm the role of fatty acids in antibiotic resistance…" it would be beneficial to specify whether the fatty acids were saturated or unsaturated. 

      We appreciate this comment! We have revised the sentence as follows:

      “Exogenous supplementation of unsaturated and saturated fatty acids increased and decreased bacterial susceptibility to antibiotics, respectively, confirming the role of fatty acids in antibiotic resistance.”

      The potential effects of the specific ions (SO4 and Cl2) present in the Mg2SO4 and MgCl2 compounds used in the study were not discussed. It would be useful to understand if these ions had any influence on the observed outcomes.

      We appreciate this comment! We have revised the sentence as follows:

      “However, the MIC for BLFX was higher in ASWT medium supplemented with Mg2SO4 or MgCl2 than in LB medium (Fig 1B). Mg2SO4 and MgCl2 did not differ in their effect on the MIC, suggesting that Mg2+, rather than the other ions, contributes to the MIC change.”

      On Page 8, line 141: The heading of Figure 2, "Mg2+ elevates intracellular Mg2+," seems redundant and could be revised for clarity or modified. 

      We appreciate this comment! Figure 2 has been moved to the supplementary figures as Suppl. Fig 2. The title has been revised as follows:

      “Figure 2. Mg2+ decreases balofloxacin uptake.”

      On Page 4, line 91: some terms/abbreviations, such as 'LB' and 'M9,' require expansion or definition to ensure the reader's understanding.

      We appreciate this comment! We include the expansions of LB and M9 in the revised manuscript as follows:

      “Luria-Bertani medium (LB medium)” and “M9 minimal medium (M9 medium)”

      Page 4, line 92: The real seawater composition used in the experiments should be supported by a reference.

      We appreciate this comment! We provide the reference in the revised manuscript as follows:

      ““artificial seawater” (ASWT) medium that included the major ion species in marine water (Wilson, 1975) (LB medium plus 210 mM NaCl, 35 mM Mg2SO4, 7 mM KCl, and 7 mM CaCl2)”

      Page 4, line 93: the full names of the bacterial strains (e.g., ATCC33787 and VP01) should be provided instead of just the strain numbers.

      We appreciate this comment! We have revised the sentence as follows:

      “To investigate whether this mineral impacts antibiotic activity, the minimal inhibitory concentration (MIC) of V. alginolyticus ATCC33787 and V. parahaemolyticus VP01, hereafter referred to as ATCC33787 and VP01,”

      Finally, there appears to be a potential contradiction between the statements on page 12, lines 211-212 and 214-216, regarding the effects of Mg2+ on the synthesis of unsaturated fatty acids. Further explanation may be needed to reconcile these seemingly contradictory points.

      We appreciate this comment! Lines 221-226, which were previously lines 211-212, concern gene expression for fatty acid biosynthesis, whereas lines 228 and 233, which were previously lines 214-216, concern gene expression for fatty acid degradation. We agree that the previous description was somewhat confusing. We have revised the sentence to emphasize that we focus on fatty acid degradation so that readers can tell them apart.

      In the text, we revised it as follows:

      “In addition, we also quantified gene expression during fatty acid degradation to determine whether Mg2+ affects this process”

      In the figure legend, we also indicate that

      “H. qRT-PCR for the expression of genes encoding fatty acid degradation in the absence or presence of the indicated concentrations of MgCl2”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this paper, Homan et al. used mouse models of Metabolic Dysfunction-Associated Steatotic Liver Disease and cell-type-specific target deletions to rule out a role for Complement 3a Receptor 1 in the pathogenesis of the disease. They provided limited evidence and only descriptive results indicating that, although C3aR is relevant in different contexts of inflammation, these tenets did not hold true here.

      Weaknesses:

      (1) The results are based on readouts showing that C3aR is not involved in the pathogenesis of liver metabolic disease.

      (2) The description of the mouse models they used to validate their findings is not clear. Lysm-cre mice, which are claimed to delete C3aR in (?) macrophages, are not specific for these cells, and the genetic strategy to delete C3aR in Kupffer cells is not clear.

      (3) Taking this into account, it is very challenging to determine the validity of these data, also considering that they are merely descriptive and correlative.

      We generated 2 different cohorts of mice using LysM-Cre (Jackson Strain #004781) to drive deletion in all macrophages and Clec4f-Cre (Jackson Strain #033296) to specifically ablate C3ar1 in Kupffer cells. These experimental models have been clearly defined in the revised manuscript on pages 5 and 7 and in the methods section (page 10). The reviewer’s point is well taken that the LysM-Cre transgene can also be active in granulocytes and some dendritic cells. Even so, despite deletion of C3ar1 in macrophages and other granulocytes, we do not see a major effect on hepatic steatosis and fibrosis in this GAN diet induced model of MASLD/MASH. This was a somewhat surprising finding. We do not agree that our findings are correlative. We specifically ablated C3aR1 in macrophages or Kupffer cells and found no significant differences in the major readouts of steatosis and fibrosis for MASLD/MASH between control and knockout mice. It is possible that in other models of liver injury that we did not test (e.g., short-term treatment with a hepatotoxin such as carbon tetrachloride), there may be differences in liver injury in mice lacking C3ar1 in macrophages, but the GAN diet model has been shown to better parallel the gene expression changes in human MAFLD/MASH. This has been added to the discussion (page 9).

      Reviewer #2 (Public review):

      Summary:

      Homan et al. examined the effect of macrophage- or Kupffer cell-specific C3aR1 KO on MASLD/MASH-related metabolic or liver phenotypes.

      Strengths:

      Established macrophage- or Kupffer cell-specific C3aR1 KO mice.

      Weaknesses:

      Lack of in-depth study; flaws in comparisons between KC-specific C3aR1KO and WT in the context of MASLD/MASH, because MASLD/MASH WT mice likely have a low abundance of C3aR1 on KCs.

      Homan et al. reported a set of observation data from macrophage or Kupffer cell-specific C3aR1KO mice. Several questions and concerns as follows could challenge the conclusions of this study:

      (1) As C3aR1 is robustly repressed in MASLD or MASH liver, GAN feeding likely reduced C3aR1 abundance in the liver of WT mice. Thus, it is not surprising that there were no significant differences in liver phenotypes between WT vs. C3aR1KO mice after prolonged GAN diet feeding. The study would be more significant if C3aR1 abundance were restored in KCs in the context of MASLD/MASH.

      GAN diet feeding resulted in higher liver C3ar1 compared to regular diet (Figure 1H). This thus became an impetus for studying the effects of C3ar1 deletion in macrophages or Kupffer cells, which are responsible for the majority of liver C3ar1 expression, in MASLD/MASH (Figures 2B and 3H). This point has been added to the text on page 5.

      (2) Would C3aR1KO mice develop liver abnormalities after a short period of GAN diet feeding?

      We did not assess whether short-term GAN diet feeding resulted in significant differences in liver abnormalities in the C3ar1 macrophage or Kupffer cell knockout mice. Perhaps the reviewer’s point is that with shorter periods of GAN diet feeding there may be a phenotype in the KO mice. We agree that this is entirely possible, though with shorter feeding timeframes what is typically seen is hepatic steatosis without fibrosis. Nevertheless, in our opinion the most important element for a disease-preventing or disease-modifying model lies with longer-term GAN diet feeding. With long-term GAN diet feeding, which has previously been shown to model human MASLD/MASH, we did not observe significant differences in liver abnormalities in the KO mice. This has been added to the discussion (page 8).

      (3) What would be the liver macrophage phenotypes in WT vs C3aR1KO mice after GAN feeding?

      Similar to the above point, given the lack of a major MASLD/MASH phenotype in hepatic steatosis and fibrosis, we did not further characterize the liver macrophage profiles of the macrophage or Kupffer cell C3ar1 KO mice with GAN feeding.

      (4) In Fig 1D, >25wks GAN feeding had minimal effects on female body weight gain. Do these GAN-fed female mice also develop MASLD/MASH liver abnormalities?

      We thank the reviewer for this question. In general, female GAN-fed mice develop milder MASLD/MASH abnormalities. We have included additional data in the revised manuscript in Figure S4. These results show no to minimal development of a MASLD/MASH gene signature.

      (5) Would C3aR1KO result in differences in liver phenotypes, including macrophage population/activation, liver inflammation, lipogenesis, in lean mice?

      We have provided additional data further characterizing liver inflammation, lipogenesis and macrophages in macrophage C3ar1 KO mice under lean/regular diet conditions in Figure 2K. These results show a potential trend but no substantial development of a MASLD/MASH gene signature.

      (6) The authors should provide more information regarding the generation of KC-specific C3aR1KO. Which Cre mice were used to breed with C3aR1 flox mice?

      Clec4f-Cre transgenic mice were used to generate Kupffer cell specific KO of C3ar1. This has been clarified and explicitly stated in the revised manuscript on page 7 and in the methods section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      These data should be repeated using a more established model of Kupffer cell target deletion via Clec4-F mice.

      Our experiments with Kupffer cell C3ar1 deletion were indeed performed with Clec4f-Cre transgenic mice. This has been clarified in the revised manuscript on page 7 and in the methods section.

      Reviewer #2 (Recommendations for the authors):

      (1) Typo: "iver" in the abstract

      (2) Line 97, "GAN diet I" should be "GAN diet"?

      These points have been corrected in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Summary: 

      Recent years have seen spectacular and controversial claims that loss of function of the RNA splicing factor Ptbp1 can efficiently reprogram astrocytes into functional neurons that can rescue motor defects seen in 6-hydroxydopamine (6-OHDA)-induced mouse models of Parkinson's disease (PD). This latest study is one of a series that fails to reproduce these observations, but remarkably also reports that neuronal-specific loss of function of Ptbp1 both induces expression of dopaminergic neuronal markers in striatal neurons and rescues motor defects seen in 6-OHDA-treated mice. The claims, if replicated, are remarkable and identify a straightforward and potentially translationally relevant mechanism for treating motor defects seen in PD models. However, while the reported behavioral effects are strong and were collected without sample exclusion, other claims made here are less convincing. In particular, no evidence that Ptbp1 loss of function actually occurs in striatal neurons is provided, and the immunostaining data used to claim that dopaminergic markers are induced in striatal neurons is not convincing. Furthermore, no characterization of the molecular identity of Ptbp1-deficient striatal neurons is provided using single-cell RNA-Seq or spatial transcriptomics, making it difficult to conclude that these cells are indeed adopting a dopaminergic phenotype. 

      Overall, while the claims of behavioral rescue of 6-OHDA-treated mice appear compelling, it is essential that these be independently replicated as soon as possible before further studies on this topic are carried out. Insights into the molecular mechanisms by which neuronal-specific loss of function of Ptbp1 induces behavioral rescue are lacking, however. Moreover, the claims of induction of neuronal identity in striatal neurons by Ptbp1 require considerable additional work to be convincing.

      We thank the reviewer for the detailed analysis of our study. Please find our answers to the points raised by the reviewer below in blue.

      Strengths of the study: 

      (1) The effect size of the behavioral rescue in the stepping and cylinder tests is strong and significant, essentially restoring 6-OHDA-lesioned mice to control levels.

      (2) Since the neurotoxic effects of 6-OHDA treatment are highly variable, the fact that all behavioral data was collected blinded and that no samples were excluded from analysis increases confidence in the accuracy of the results reported here. 

      We appreciate the reviewer’s feedback and acknowledgement of the strengths of our study. We undertook several optimization steps in the surgery, post-operative care, and handling of the animals for behavior experiments to ensure high reproducibility of our experiments.

      Weaknesses of the study:  

      (1) Neurons express relatively little Ptbp1. Indeed, cellular expression levels as measured by scRNA-Seq are substantially below those of astrocytes and other non-neuronal cell types, and Ptbp1 immunoreactivity has not been observed in either striatal or midbrain neurons (e.g. Hoang, et al. Nature 2023). This raises the question of whether any recovery of Th expression is indeed mediated by the loss of function of Ptbp1 rather than by off-target effects. AAV-mediated rescue of Ptbp1 expression could help clarify this.

      In the original manuscript, we delivered control vectors that only express the ABE to 6-OHDA-lesioned mice (labeled as AAV-ctrl) and did not detect TH positive cells in the midbrain or striatum of control mice or rescue of spontaneous motor skills. We can therefore exclude that the delivery procedure, AAV-PHP.eB capsid, or ABE expression caused adverse effects leading to induction of TH expression and functional rescue of spontaneous motor behaviors in PD mice. To further exclude that these effects were caused by off-target editing, we experimentally determined off-target binding sites of our sgRNA (sgRNA-ex3) using GUIDE-seq and subsequently analyzed these sites in treated animals by NGS (Figure 3 – supplement 3). While two off-target sites were identified, it is unlikely that base editing at these sites caused the observed phenotypes. One off-target site was identified in the myopalladin (Mypn) gene, which encodes a muscle-specific protein that plays a role in regulating the structure and growth of skeletal and cardiac muscle (Filomena et al., 2021, 2020). The other site is not located in a coding region, but in an intron of the ankyrin-1 (Ank1) gene, which encodes an adaptor protein linking membrane proteins to the underlying cytoskeleton (Cunha and Mohler, 2009). Even though this gene is also expressed in neurons, base editing within this intronic region did not lead to changes in transcript levels (Figure 3 – supplement 3). Thus, the induction of TH expression upon adenine base editing with sgRNA-ex3 is likely a direct consequence of PTBP1 downregulation.

      Further supporting this conclusion, in the revised manuscript we additionally show PTBP1 downregulation at the RNA and protein level in the SNc and striatum after base editor treatment (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      (2) It is not clear why dopaminergic neurons, which are not normally found in the striatum, are observed following Ptbp1 knockout. This is very similar to the now-debunked claims made in Zhou, et al. Cell 2020, but here performed using the hSyn rather than GFAP mini promoter to control AAV expression. While this is the most dramatic and potentially translationally relevant claim of the study, this claim is extremely surprising and lacks any clear mechanistic explanation for why it might happen in the first place.  

      We agree with the reviewer that our study does not provide mechanistic insights into how Ptbp1 downregulation in neurons leads to the induction of dopaminergic markers in the striatum. As we believe that this is not within the scope of a revision, we discuss potential follow-up experiments in the discussion section of the revised manuscript.

      This observation is even more surprising in light of reports that antisense oligonucleotide-mediated knockdown of Ptbp1, which should have affected both neuronal and glial Ptbp1 expression, failed to induce expression of dopaminergic neuronal markers in the striatum (Chen, et al. eLife 2022). Selective loss of function of Ptbp1 in striatal and midbrain astrocytes likewise results in only modest changes in gene expression.

      Using 6-OHDA lesioned Aldh1l1-CreERT2;Rpl22lsl-HA mice, the Chen et al. study (eLife 2022) assessed potential astrocyte to neuron conversion by quantifying the presence of HA-labeled neurons after ASO-mediated knockdown of Ptbp1. Even though they did not detect HA-positive neurons in the SNc, suggesting absence of astrocyte to neuron conversion, the images in Figure 4D reveal TH positive cells in the lesioned hemisphere, similar to our observations in Figure 2B-D. While it cannot be excluded that these TH positive cells are remnants from an incomplete 6-OHDA lesion, they could also be endogenous neurons with induced expression of dopaminergic markers after ASO-mediated knockdown of Ptbp1. Furthermore, Chen et al. performed the apomorphine test to assess changes in motor skills, which did not reveal an improvement in our study either.

      It is critically important that this claim be independently replicated, and that additional data be provided to conclusively show that striatal neurons are indeed expressing dopaminergic markers.

      Our behavior and immunofluorescence experiments involving mice injected into the striatum were performed with two independently generated cohorts of 6-OHDA mice. In detail, the 6-OHDA mice were generated by two independent surgeons from different labs (>6 months between experiments of these cohorts), leading to comparable behavioral outcomes before and after treatment. Subsequent behavior and immunofluorescence experiments with each cohort were performed and analyzed by two independent and blinded researchers, showing comparable results.

      (3) More generally, since multiple spectacular and irreproducible claims of single-step glial-to-neuron reprogramming have appeared in high-profile journals in recent years, a consensus has emerged that it is essential to comprehensively characterize the identity of "transformed" cells using either single-cell RNA-Seq or spatial transcriptomics (e.g. Qian, et al. FEBS J 2021; Wang and Zhang, Dev Neurobiol 2022). These concerns apply equally to claims of neuronal subtype conversion such as those advanced here, and it is essential to provide these same datasets.

      In the revised version, we have analyzed the expression of additional neuronal markers in TH positive cells of the striatum using 4i imaging. Briefly, our results showed that the vast majority of TH-expressing cells also expressed the markers DAT and NEUN, further corroborating the neuronal and dopaminergic identity of these cells. Additional analysis revealed that this TH/DAT/NEUN expressing cell population expressed markers of GABAergic neurons, either of medium spiny neurons (~50%) or of various types of interneurons (~50%). While our 4i analysis has allowed us to broadly classify these TH-expressing populations, we agree that detailed transcriptional analysis at the single cell level is required to understand the molecular mechanisms underlying the generation of TH positive cells. These analyses are, however, not within the scope of a revision and would require a thorough dedicated study. We have added these results and discussion points to the revised manuscript.

      (4) Low-power images are generally lacking for immunohistochemical data shown in Figures 3 and 4, which makes interpretation difficult. DAPI images in Figure 3C do not appear nuclear. Immunostaining for Th, DAT, and Dcx in Figure 4 shows a high background and is difficult to interpret. 

      We thank the reviewer for closely evaluating these images and suggestions for improvement. In the revised manuscript, we provide low power images and higher magnification insets as requested to allow for easier interpretation.

      (5) Insights into the mechanism by which neuronal-specific loss of Ptbp1 function induces either functional recovery, or dopaminergic markers in striatal neurons, is lacking.

      In the revised manuscript, we provide a more detailed discussion of mechanisms that could potentially be involved in the functional recovery or expression of dopaminergic markers. However, deciphering the exact molecular mechanisms underlying these observations requires thorough transcriptional analysis at the single cell level, which is out of scope of this revision.

      Reviewer #2 (Public Review):

      Summary: 

      The manuscript by Bock and colleagues describes the generation of an AAV-delivered adenine base editing strategy to knockdown PTBP1 and the behavioral and neurorestorative effects of specifically knocking down striatal or nigral PTBP1 in astrocytes or neurons in a mouse model of Parkinson's disease. The authors found that knocking down PTBP1 in neurons, but not astrocytes, and in striatum, but not nigra, results in the phenotypic reorganization of neurons to TH+ cells sufficient to rescue motor phenotypes, though insufficient to normalize responses to dopaminomimetic drugs.

      Strengths: 

      The manuscript is generally well-written and adds to the growing literature challenging previous findings by Qian et al., 2020 and Zhou et al., 2020 indicating that astrocytic downregulation of PTBP1 can induce conversion to dopaminergic neurons in the midbrain and improve parkinsonian symptoms. The base editing approach is interesting and potentially more therapeutically relevant than previous approaches.

      Weaknesses: 

      The manuscript has several weaknesses in approach and interpretation. In terms of approach, the animal model utilized, the 6-OHDA model, though useful to examine dopaminergic cell loss, exhibits accelerated neurodegeneration and none of the typical pathological hallmarks (synucleinopathy, Lewy bodies, etc.) compared to the typical etiology of Parkinson's disease, limiting its translational interpretation. 

      We thank the reviewer for the detailed assessment of our study and pinpointing its current weaknesses. Please find our answers to all comments below in blue.

      We agree with the reviewer that the 6-OHDA model lacks the typical pathological hallmarks of PD. Nevertheless, we chose this model for two reasons:

      i) The 6-OHDA model was used by both Qian et al. (2020) and Zhou et al. (2020). To allow comparison of our results to these studies, it was crucial to use the same model. Notably, the 6-OHDA model was also used by Chen et al. (2022) and Hoang et al. (2023) for comparison to the two studies from 2020.

      ii) The 6-OHDA model is straightforward to generate and displays robust motor impairments for evaluation of potential therapeutic effects of neuroregeneration treatment approaches. We therefore believe that the model is well-suited to analyze the cellular and behavioral effects (specifically motor skills) of PTBP1 downregulation. 

      In future studies, it would be critical to include models that also display typical pathological hallmarks of the disease to further evaluate the therapeutic effect of this base editing approach. These experiments are, however, not within the scope of this study, which was aimed to focus on the cellular and behavioral effects of PTBP1 downregulation. 

      In addition, there is no confirmation of a neuronal or astrocytic knockdown of PTBP1 in vivo; all base editing validation experiments were completed in cell lines. 

      In the revised manuscript, we assess in vivo base editing efficiencies at the Ptbp1 target site in the SNc (AAV-hsyn, 15.6%) and striatum (AAV-hsyn, 21.1%). Furthermore, we assessed in vivo Ptbp1 downregulation at the RNA and protein level to complement our in vitro data (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      Finally, it is unclear why the base editing approach was used to induce loss-of-function rather than a cell-type specific knockout, if the goal is to assess the effects of PTBP1 loss in specific neurons. 

      We expressed base editors under cell-type-specific promoters to induce a reliable loss-of-function mutation at the Ptbp1 exon-intron junction in neurons or astrocytes. Performing these mutations with Cas9 nucleases instead would have had potential limitations and risks, including i) indel mutations do not always lead to a frameshift and loss-of-function despite high indel formation at the targeted site, ii) nucleases induce DNA double strand breaks, which can have serious side effects (e.g. chromosomal rearrangements or translocations), and iii) ‘mosaicism’, as edited cells contain different indel mutations, which may result in different effects and thus complicate analysis of the downstream effects. We discuss these points in the revised manuscript.

      In terms of interpretation, the conclusion by the authors that PTBP1 knockdown has little likelihood to be therapeutically relevant seems overstated, particularly since they did observe a beneficial effect on motor behavior. We know that in PD, patients often display negligible symptoms until 50-70% of dopaminergic input to the striatum is lost, due to compensatory activity of remaining dopaminergic cells. Presumably, a small recovery of dopaminergic neurons would have an outsized effect on motor ability and may improve the efficacy of dopaminergic drugs, particularly levodopa, at lower doses, averting many problematic side effects. Since striatal dopamine was assessed by whole-tissue analysis, which is not necessarily reflective of synaptic dopamine availability, it is difficult to assess whether the ~10% increase in TH+ cells in the striatum was sufficient to improve dopamine function. However, the improvement in motor activity suggests that it was.

      As pointed out by the reviewer, it is difficult to estimate the therapeutic effect and importance of a ~10% increase in TH+ cells for PD patients. Guided by the reviewer’s suggestion, we have included a more in-depth discussion of our results and their potential therapeutic value, as well as outstanding questions for future studies, in the revised manuscript.

      Reviewer #3 (Public Review):

      This study explores the use of an adenine base editing strategy to knock down PTBP1 in astrocytes and neurons of a Parkinson's disease mouse model, as a potential AAV-BE therapy. The results indicate that editing Ptbp1 in neurons, but not astrocytes, leads to the formation of tyrosine hydroxylase (TH)+ cells, rescuing some motor symptoms.

      Several aspects of the manuscript stand out positively. Firstly, the clarity of the presentation. The authors communicate their ideas and findings in a clear and understandable manner, making it easier for readers to follow. 

      The Materials and methods section is well-elaborated, providing sufficient detail for reproducibility. 

      The logical flow of the manuscript makes sense, with each section building upon the previous one coherently.

      The ABE strategy employed by the authors appears sound, and the manuscript presents a coherent and well-supported argument.

      Positively, some of the data in this study effectively counteracts previous work in line with more recent publications, demonstrating the authors' ability to contribute to the ongoing conversation in the field.

      We thank the reviewer for appreciating the effort we have put into this study. Please find below a point-by-point reply to the weaknesses raised by the reviewer. 

      However, while the in vitro data yields promising results, it may have been overly optimistic to assume that the efficiencies observed in dividing cells will directly translate to in vivo conditions. This consideration is important given the added complexities of vector optimization, different cell types targeted in vitro versus in vivo, as well as unknown intrinsic limitations of the base editing technology. 

      We agree with the reviewer that in vitro base editing efficiencies might not directly translate to in vivo editing outcomes. We therefore assessed in vivo base editing efficiencies at the Ptbp1 locus and PTBP1 downregulation in the striatum and midbrain. Our data revealed that in vivo base editing activity was lower than in our in vitro setting (in vitro: Figure 1; figure 1 – figure supplement 2; in vivo: figure 2 – figure supplement 5; figure 3 – supplement 2). However, we believe that these rates are slightly underestimated since we sequenced DNA isolated from the whole tissue (striatum or SNc) and not from purified astrocytes or neurons. Moreover, we could demonstrate that editing led to a reduction of Ptbp1 transcript and PTBP1 protein level (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      In addition, certain aspects of the manuscript would benefit from a more in-depth and comprehensive discussion rather than being only briefly touched upon. Such a discussion would enhance the relevance of the obtained results and provide the foundation for improvement when using similar approaches.

      Following the reviewer’s suggestion, we included a more in-depth discussion of our results in the revised manuscript.

      Recommendations for the authors:

      Reviewing Editor (Recommendations for the Authors):

      A summary of key recommendations that might improve the eLife assessment in a subsequent submission are provided below, as a guide to help the authors focus on changes that might enhance the strength of evidence (e.g., from "incomplete" to "solid").

      (1) Provide further explanation of the mechanistic relationship between the downregulation of Ptbp1 and TH+ dopaminergic neuron reprogramming. Additional discussion of this topic should also be included.

      (2) Demonstrate proof of editing in the intended targeted cells in vitro and/or in vivo.

      (3) Show evidence of successful Base Editor delivery in vivo.

      (4) Perform a deeper characterization of TH+ cells in vivo and provide a more thorough discussion of the identity of the targeted cells. This may include an exploration of whether TH+ cells detected are TH+ interneurons and/or establish their identity based on transcriptomics or a similar approach.

      (5) Provide better-quality representative images supporting the quantitative data.

      (6) Please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05 in the main manuscript.

      In the revised manuscript, we provided 1) suggestions of the mechanistic relationship between Ptbp1 knockdown, dopamine synthesis, and the functional rescue of spontaneous behaviors, 2) proof of in vivo base editing and successful base editor delivery, 3) deeper characterization of TH-expressing cells in vivo using 4i imaging, 4) better quality images, and 5) full statistical reporting.  

      Individual Reviewer recommendations for the authors are included below.

      Reviewer #1 (Recommendations For The Authors):

      Confirm loss of Ptbp1 function in infected striatal neurons. Single-cell RNA-Seq or spatial transcriptomic analysis must be performed to characterize the identity of the edited striatal neurons. The quality of the immunostaining in Figures 3 and 4 needs to be improved, and low-power images provided. Were eLife a conventional journal, I would have insisted on all these being included prior to publication. Please also arrange for independent replication of the behavioral rescue and induction of dopaminergic marker gene expression in the striatum.

      In the revised manuscript, we confirmed Ptbp1 downregulation at the tissue level in the SNc and striatum by RT-qPCR and western blot and included low-power images for easier interpretation. Additionally, we assessed expression of additional neuronal markers on striatal sections using 4i imaging and found that TH/DAT/NEUN positive populations either expressed markers of medium spiny neurons or interneurons. We have included these results in the revised manuscript.

      Our behavioral and imaging experiments involving mice injected into the striatum were in fact performed with two independently generated cohorts of 6-OHDA mice. In detail, the 6-OHDA mice were generated by two independent surgeons from different labs (>6 months between experiments of these two cohorts), leading to comparable behavioral outcomes before and after treatment. The experiments with each cohort were performed and analyzed by two independent and blinded researchers, yielding comparable results.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the introduction, lines 43-45: This statement is inaccurate. Current treatment strategies do not focus on slowing or halting disease progression. There is currently no accepted therapy that does this. Dopaminergic therapies and deep brain stimulation can compensate for circuitry dysfunction as a result of dopamine cell loss but do not slow the disease. The referenced paper used is older and does not refer to new treatments for PD and is a summary article for a special issue of the Disease Models and Mechanisms journal. Please ensure that all references used are appropriate for the statement they are attached to.

      We thank the reviewer for pointing this out. We have rephrased this statement accordingly and provided an appropriate reference describing current treatment strategies.

      (2) The number of TH+ cells in the intact nigra seems low compared to published data. Suggest a stereological approach may be better than the Abercrombie method.

      Following the reviewer’s suggestion, we re-quantified the number of TH positive cells using a stereological approach (Nv:Vref method). We have included these results in the revised manuscript. 

      (3) Have the authors considered that the striatal TH+ cells could be TH+ striatal interneurons? 

      In the revised manuscript, we performed additional 4i imaging experiments to further analyze the identity of the TH positive cells in the striatum. Briefly, we found that TH/DAT/NEUN positive populations either expressed markers of GABAergic medium spiny neurons or interneurons. We have added these results to the revised manuscript (Figure 4). 

      (4) The Western blot shown in Figure 1 C for C8-D1A has some abnormalities and makes it difficult to judge the bands. Also, for 1B, the legends are difficult to see.

      In the revised manuscript, we have repeated the respective western blot to make interpretation of the bands easier, and adapted the legends in Figure 1B for better visibility.

      (5) Figure 2: Please show representative images for the GFAP-targeted editing.

      Representative images of the GFAP-targeted groups can be found in Figure 2 – figure supplement 3.

      (6) Figure 2, Supplement 3: Please include quantification.

      The quantifications for these images can be found in Figure 2D and 2F. 

      (7) Figure 1, Supplement 2: The gene name in A is misspelled.

      Thank you for pointing this out. In the revised manuscript, we added the correct gene name.

      (8) Line 267-276: As previously indicated, the statement here is overstated based on the data provided. In addition, the citation provided to justify this claim (Kannari et al., 2000) is an odd choice as the dosage of L-DOPA utilized was not therapeutically relevant (50 mg/kg). A better indication of efficacy would be the return to basal, unaffected levels rather than the fold increase in dopamine levels. A better comparison would be Lindgren et al., 2010, who showed that L-DOPA-treated animals with a physiologically relevant dose (6 mg/kg) that did not induce dyskinesia showed a return to basal, non-lesioned dopamine levels in the striatum after L-DOPA by microdialysis. To really support this claim, the authors would need to use an approach that could measure synaptic dopamine availability, rather than whole-tissue dopamine levels, such as microdialysis, fiber photometry, or an equivalent.

      Following the reviewer’s suggestions, we replaced this reference with Lindgren et al. (2010) and provide a more detailed interpretation of our results and remaining questions for future studies.  

      Reviewer #3 (Recommendations For The Authors):

      Major and minor issues are discussed below by section.

      INTRODUCTION and AIM - Lines 36-73

      - The authors effectively contextualize the aim of their study by providing comprehensive background information on previous research regarding cell 'reprogramming' into dopaminergic neurons in the SNc. However, the introduction lacks contextualization of TH+ cells and PD. For readers who may not be well-versed in the Parkinson's field, understanding the importance of TH (Tyrosine Hydroxylase) may be challenging, since the term "TH+ cells" is mentioned only once by the end of the introduction (line 71), to then become a key element in the entire study.

      - Providing a brief explanation of the role of Tyrosine Hydroxylase in the synthesis of L-DOPA would facilitate the reader's comprehension of why the presence of TH+ cells following Base Editing treatment is relevant.

      - Further elaboration on the relationship between the downregulation of the general RNA binding protein, PTBP1, and the specific dopaminergic-related readout, TH, would improve coherence and strengthen the linkage between the introductory section and the results.

      We thank the reviewer for the constructive suggestions. In the introduction of the revised manuscript, we describe the meaning and importance of TH in the context of dopamine synthesis and PD. Likewise, we briefly outlined the importance of the PTBP1/nPTBP regulatory loops during neuronal differentiation and maturation. 

      RESULTS 

      Result Section 1 - Line 75-109

      - Thorough screening of sgRNAs targeting splice junctions across the Ptbp1 gene in HEPA cells shows the achievement of high levels of editing (80-90%) with sgRNA-ex3 and sgRNA-ex7.

      - The data also indicates that editing translates into significant reductions in ptbp1 expression, along with an increase in the expression of genes repressed by PTBP1.

      - Despite obtaining lower percentages of editing events in N2a neuroblastoma cells and the C8-D1A astroglial cell line, the differential expression levels of ptbp1 and the readout genes remain significant. However, the gRNA screening assay is performed in immortalized, dividing cells. 

      - Providing proof that Adenosine Base Editing of Ptbp1 is successful in non-dividing cells (such as SNc and/or striatal primary neurons) would strengthen the case for the potential therapy in the intended cell type.

      Following the reviewer’s comment, we show in vivo base editing rates in the SNc and striatum of treated PD mice in the revised manuscript (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      - Moreover, assessing the expression levels of tyrosine hydroxylase by qPCR after Ptbp1 base editing in vitro could help contextualize the use of TH+ detection as an in vivo readout and may help explain why the total number of TH+ cells is low after ABE treatment in vivo - as shown in following sections.

      In the revised manuscript, we now provide quantifications of in vivo base editing efficiencies in the SNc (~15%) and striatum (~20%). As expected from these lower in vivo base editing rates, downregulation of Ptbp1 at the transcript and protein level was less pronounced compared to our in vitro experiments. It seems likely that higher base editing efficiency and more pronounced downregulation of Ptbp1 could lead to a larger population of TH expressing cells. We have added these results and interpretations to the revised manuscript.

      - Furthermore, although ABEs are less prone to generating bystander and other nucleotide changes compared to CBEs, it is still possible. Figures 1 (line 811) and 1-supplement 2 (line 842) only show a brief window of the Sanger sequencing trace. Updating these figures to display a wider view of the sequencing trace would enhance transparency. If unwanted edits are detected, while they may not significantly alter the relevance, impact, or structure of the paper, they may become an important aspect of the discussion. 

      Indeed, ABEs can induce bystander edits and we also detected such edits at the Ptbp1 target site. However, since our base editing strategy was designed to yield a loss of Ptbp1 function, bystander editing at the splice site was not a primary focus in our analysis. Nevertheless, we included CRISPResso output images showing the specific editing outcomes in a wider analysis window in the revised manuscript (Figure 3 – figure supplement 2). 

      Result Section 2 - Lines 110-159

      A split intein system is used in vivo with sgRNA-ex3, after updating the promoter to make it cell-specific: hSyn to restrict expression to neurons and GFAP to restrict expression to astrocytes. 

      However, no other assay is performed to assess whether a) the promoter change and/or b) splitting Cas9 may affect the editing efficiency compared to their initial in vitro approach.

      In the revised manuscript, we assessed the performance of the in vivo AAV vectors encoding the split intein ABE with sgRNA-ex3 in vitro in N2a and C8-D1A cells. Our results show that all vectors are functional and result in base editing at the target locus.

      -  Addressing whether this is the case may explain the low number of TH+ cells observed in vivo. 

      - The authors could also consider staining for Cas9 to address whether the low number of TH+ cells could be attributed to a poor Cas9 delivery.

      To confirm successful in vivo base editor delivery, we quantified in vivo base editing efficiencies in the SNc and striatum of PD mice. Our analysis revealed in vivo base editing at both tissue sites, confirming that base editors were successfully delivered. Editing efficiencies were, however, substantially lower (Figure 2 – figure supplement 5; figure 3 – supplement 2) than in our in vitro cell line setting (Figure 1; figure 1 – figure supplement 2). Even though tissue editing rates likely underestimate the cell type-specific editing rates in astrocytes or neurons, higher base editing rates would have likely resulted in a higher number of TH positive cells. We have added these results and their implications to the revised manuscript.

      -  Moreover, despite the presence of TH, in Figure 2 E,F authors examine the striatal innervation from newly generated TH+ cells in the SNc by Fluorescence Intensity (FI) to conclude that the edited cells do not form projections towards the striatum. Considering the low levels of TH+ positive cells obtained, the accumulation of gross FI might not be the most accurate way to assess the presence or absence of cell projections.

      - Using another marker that stains the projections rather than the cell soma, and that is a marker of dopaminergic neurons, might be a better way to address this.

      To address the reviewer’s comment, we analyzed the presence of potential dopaminergic fibers in the mfb, where projections are more concentrated (around the injection coordinates of 6-OHDA), using the dopaminergic marker DAT. In line with our previous observations in the striatum, we did not detect an increase in DAT fluorescence intensity upon treatment on the lesioned hemisphere (Figure 2 – figure supplement 4).  

      Result Section 3 - Line 160-182

      Minor issue

      - The same dual split intein system is used in the striatum. However, in Figure 3 - Figure Supplement 1 - line 958 and in Figure 3 - Figure Supplement 4 - line 1000 authors show the injection of 2x the viral genomes indicated along the manuscript. In previous experiments in the SNc, 2x10^8 vg/animal was used, whereas this figure shows 4x10^8 vg/animal injected in the striatum.

      - The authors should clarify if the vg injected in the striatum was different from what they previously indicated.

      Compared to injection in the SNc, the volume of vector injected in the striatum was doubled since the region is significantly larger. We clarified that the injected vector genomes were different between striatum and SNc in the revised manuscript.

      Result Section 4- Line 183-220

      In this section, the authors thoroughly examine the neuronal nature of TH+ cells through NeuN co-staining and iterative immunofluorescence imaging (4i). BrdU experiments are conducted to determine the origin of these cells, leading to the conclusion that TH+ cells derive from nondividing cells and express the neuronal marker DAT, characteristic of dopamine-producing neurons (DANs). Cell shape of the TH+ cells in the striatum and SNc is also evaluated by measuring their Feret's diameter and their cell surface. Authors conclude there's heterogeneity in the TH+ cell population due to the presence of TH+/NeuN- cells as well as differences in cell shape.

      However, their explanation of this heterogeneity is solely attributed to differences in the microenvironment and lacks further elaboration. Similarly, their observation that almost half the number of TH+ striatal cells after treatment express CTIP2 (Line 213 and Figure 4B), a marker for GABAergic medium spiny neurons, which they state as "interesting" (line 213) is not developed further. Delving deeper into these topics could strengthen the discussion.

      In the revised manuscript, we provided a more in-depth discussion of the 4i imaging results and potential therapeutic implications. Additionally, we suggest follow-up experiments to analyze the identity, function, and molecular mechanisms underlying the expression of TH upon PTBP1 downregulation in future studies. 

      Result Section 5- Line 221-243

      Two drug-free and two drug-induced behavioral tests are conducted in control and treated animals to evaluate the restoration of motor functions following treatment. Consistent with their previous findings, only the treatment targeted to neurons resulted in the restoration of motor functions in drug-free behavioral tests. The rationale behind each test and its evaluation is clearly explained.

      DISCUSSION 

      - In the discussion section, the authors effectively re-examine their results contextualizing their data with previous studies in the field. However, it would be helpful at this point in the manuscript to reconsider the use of the term 'cell reprogramming,' as this study does not involve actual cell reprogramming. The concept "reprograming" entails the process of transforming adult cells into a stem cell-like state, to then differentiate them into a different cell type. As proven in section 4 by a BrdU proliferation assay, the targeted cells are differentiated neurons. Considering BrdU is administered 5 days after ABE treatment, if true cell reprogramming was taking place, there should be evidence of BrdU incorporation. Cell reprogramming or reprograming is mentioned 4 times in the manuscript (line 34, line 54, line 265, line 277). Therefore, using another terminology would be more accurate.

      Following the reviewer’s suggestion, we removed the term “cell reprograming” from the manuscript and rather describe it as induction of TH expression in endogenous neurons.

      - As noted in the comments of section 4, a more thorough discussion about the various possibilities for heterogeneity would enhance the manuscript's contribution to the PD field.

      In the revised manuscript, we provided a more in-depth discussion of the 4i imaging results and potential therapeutic implications. 

      - Despite observing low numbers of TH+ cells, no significant rescue of drug-induced behaviors, and low levels of released dopamine, the authors merely state that these results make the therapy non-viable, but there is no further exploration or discussion. Whether the limitations lie in the ABE strategy itself, such as its efficiency in targeting and editing of differentiated neurons; or if the issues lie on the injection and delivery, is never discussed. A deeper argumentation on the possible underlying reasons for these challenges would greatly enhance the manuscript and contribute to the advancement of ABE therapies in the brain.

      We believe that the efficacy of our base editing approach could be significantly enhanced by optimizing the delivery. Currently, we are using a dual AAV approach to deliver intein-split ABEs. Since this approach relies on the delivery of higher AAV doses to achieve cotransduction of a cell by two different AAVs, the efficiency could be significantly enhanced by using smaller Cas9 orthologues that can be delivered as a single AAV. Furthermore, in this study we performed a single injection into the dorsal striatum to deliver ABE-expressing AAVs. Performing multiple injections into the rostral, medial, and caudal regions of the striatum might allow us to transduce more cells and induce TH expression in a larger population of striatal neurons. We have included these points in the revised manuscript.

      - While drug-induced behaviors are not recovered, the data demonstrates a rescue of spontaneous behaviors. Further discussion on the potential differences in circuitry underlying these variations in behavioral rescue would also enrich the manuscript's discussion.

      In the revised manuscript, we provide suggestions for potential mechanisms involved in the rescue of spontaneous behavior vs. absence of rescue of drug-induced behaviors. 

      FIGURES AND FIGURE SUPPLEMENTS

      General minor issue - low magnification images in the following figures, make it difficult to visualize positive cells in tissue sections: Figure 2; Figure 2- supplement 1; Figure 2 - supplement 3, Figure 3- supplement 1. Adding a higher magnification imaging of positive cells in tissue sections of SNc and striatum might help with the visualization. 

      As suggested by the reviewer, we included higher magnification images in the corresponding figures to improve interpretation of our results.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      In the presented manuscript, the authors investigate how neural networks can learn to replay presented sequences of activity. Their focus lies on the stochastic replay according to learned transition probabilities. They show that based on error-based excitatory and balance-based inhibitory plasticity networks can self-organize towards this goal. Finally, they demonstrate that these learning rules can recover experimental observations from song-bird song learning experiments.

      Overall, the study appears well-executed and coherent, and the presentation is very clear and helpful. However, it remains somewhat vague regarding the novelty. The authors could elaborate on the experimental and theoretical impact of the study, and also discuss how their results relate to those of Kappel et al. and others (e.g., Kappel et al. (doi.org/10.1371/journal.pcbi.1003511)).

      We agree with the reviewer that our previous manuscript lacked a comparison with similar previously published work. While Kappel et al. demonstrated that STDP in winner-take-all circuits can approximate online learning of hidden Markov models (HMMs), a key distinction from our model is that their neural representations acquire deterministic sequential activations, rather than exhibiting stochastic transitions governing Markovian dynamics. Specifically, in their model, the neural representation of state B would be different in the sequences ABC and CBA, resulting in distinct deterministic representations like ABC and C'B'A', where ‘A’ and ‘A'’ are represented by different neural states (e.g., activations of different cell assemblies). In contrast, our network learns to generate stochastically transitioning cell assemblies that replay Markovian trajectories of spontaneous activity obeying the learned transition probabilities between neural representations of states. For example, starting from reactivation of assembly ‘A’, there may be an 80% probability to transition to assembly ‘B’ and 20% to ‘C’. Although Kappel et al.'s model successfully solves HMMs, their neural representations do not themselves stochastically transition between states according to the learned model. Similarly to Kappel et al.'s model, the models proposed in Barber (2002) and Barber and Agakov (2002) learn Markovian statistics, but they learned only static spatiotemporal input patterns, and how assemblies of neurons come to show stochastic transitions in spontaneous activity has remained unclear. In contrast with these models, our model captures the probabilistic neural state trajectories, allowing spontaneous replay of experienced sequences with stochastic dynamics matching the learned environmental statistics.

      We have included new sentences to explain these points in ll. 509-533 in the revised manuscript.
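
      The distinction drawn above, between deterministic sequences and assemblies that transition stochastically according to learned probabilities, can be illustrated with a minimal sketch. It is not the network model from the manuscript: it simply samples a first-order Markov chain over three hypothetical assemblies (A, B, C) and recovers the transition matrix from the sampled trajectory, which is the kind of comparison between replayed and true transition statistics the response refers to. Only the 80%/20% split out of ‘A’ comes from the text; the remaining rows of the matrix, the function names, and the sequence length are assumptions chosen for illustration.

```python
import numpy as np

def sample_sequence(P, start, n_steps, rng):
    """Sample a first-order Markov trajectory; P[i, j] = Pr(next = j | current = i)."""
    states = [start]
    for _ in range(n_steps):
        states.append(rng.choice(len(P), p=P[states[-1]]))
    return np.array(states)

def empirical_transitions(states, n_states):
    """Estimate the transition matrix by counting consecutive state pairs."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that were never visited are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Row 0 (assembly A): 80% to B, 20% to C, as in the example in the text.
# Rows 1 and 2 are hypothetical values, not taken from the manuscript.
P = np.array([[0.0, 0.8, 0.2],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

rng = np.random.default_rng(0)
states = sample_sequence(P, start=0, n_steps=20000, rng=rng)
P_hat = empirical_transitions(states, n_states=3)
print(np.round(P_hat, 2))
```

      In a model of the deterministic kind, the same starting assembly would always be followed by the same successor within a given sequence, so the estimated rows would collapse onto a single entry rather than reproduce a distribution such as 0.8/0.2.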

      Overall, the work could benefit if there was either (A) a formal analysis or derivation of the plasticity rules involved and a formal justification of the usefulness of the resulting (learned) neural dynamics; 

      We have included a derivation of our plasticity rules in ll. 630-670 in the revised manuscript. Consistent with our claim that excitatory plasticity updates the excitatory synapse to predict output firing rates, we have shown that the corresponding cost function measures the discrepancy between the recurrent prediction and the output firing rate. Similarly, for inhibitory plasticity, we defined the cost function that evaluates the difference between the excitatory and inhibitory potential within each neuron. We showed that the resulting inhibitory plasticity rule updates the inhibitory synapses to maintain the excitation-inhibition balance.
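
      As a reading aid, one generic way to write cost functions of the kind described above is sketched below. This is an assumption-laden illustration, not the derivation in ll. 630-670 of the manuscript: the squared-error form, the symbols ($f_i$ for the output firing rate, $\phi$ for the prediction nonlinearity, $W$ and $G$ for excitatory and inhibitory weights, $x_j$ for presynaptic activity, $\eta$ for learning rates) and the exact gradient expressions are all hypothetical, and the authors' actual rules may differ (for example, by using a likelihood-based objective).

```latex
% Hypothetical notation; not taken from the manuscript.
% Excitatory cost: discrepancy between recurrent prediction and output rate.
\begin{align}
  v_i^{E} &= \sum_j W_{ij}\, x_j, &
  E_{\mathrm{exc}} &= \tfrac{1}{2} \sum_i \bigl( f_i - \phi(v_i^{E}) \bigr)^2, \\
  \Delta W_{ij} &= -\eta_W \frac{\partial E_{\mathrm{exc}}}{\partial W_{ij}}
                 = \eta_W \bigl( f_i - \phi(v_i^{E}) \bigr)\, \phi'(v_i^{E})\, x_j .
\end{align}
% Inhibitory cost: mismatch between excitatory and inhibitory potentials.
\begin{align}
  v_i^{I} &= \sum_j G_{ij}\, x_j, &
  E_{\mathrm{inh}} &= \tfrac{1}{2} \sum_i \bigl( v_i^{E} - v_i^{I} \bigr)^2, \\
  \Delta G_{ij} &= -\eta_G \frac{\partial E_{\mathrm{inh}}}{\partial G_{ij}}
                 = \eta_G \bigl( v_i^{E} - v_i^{I} \bigr)\, x_j .
\end{align}
```

      Under these assumptions, the excitatory update vanishes when the recurrent prediction matches the output firing rate, and the inhibitory update vanishes when the inhibitory potential matches the excitatory one, which is the excitation-inhibition balance described in the response.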

      and/or (B) a clear connection of the employed plasticity rules to biological plasticity and clear testable experimental predictions. Thus, overall, this is a good work with some room for improvement. 

      Our proposed plasticity mechanism could be implemented through somatodendritic interactions. Analogous to previous computational works (Urbanczik and Senn, 2014; Asabuki and Fukai, 2020; Asabuki et al., 2022), our model suggests that somatic responses may encode the stimulus-evoked neural activity states, while dendrites encode predictions based on recurrent dynamics that aim to minimize the discrepancy between somatic and dendritic activity. To directly test this hypothesis, future experimental studies could simultaneously record from both somatic and dendritic compartments to investigate how they encode evoked responses and predictive signals during learning (Francioni et al., 2022).

      We have included new sentences to explain these points in ll. 476-484 in the revised manuscript.

      Reviewer #2 (Public Review): 

      Summary: 

      This work proposes a synaptic plasticity rule that explains the generation of learned stochastic dynamics during spontaneous activity. The proposed plasticity rule assumes that excitatory synapses seek to minimize the difference between the internal predicted activity and stimulus-evoked activity, and inhibitory synapses try to maintain the E-I balance by matching the excitatory activity. By implementing this plasticity rule in a spiking recurrent neural network, the authors show that the state-transition statistics of spontaneous excitatory activity agree with that of the learned stimulus patterns, which are reflected in the learned excitatory synaptic weights. The authors further demonstrate that inhibitory connections contribute to well-defined state transitions matching the transition patterns evoked by the stimulus. Finally, they show that this mechanism can be expanded to more complex state-transition structures including songbird neural data. 

      Strengths: 

      This study makes an important contribution to computational neuroscience, by proposing a possible synaptic plasticity mechanism underlying spontaneous generations of learned stochastic state-switching dynamics that are experimentally observed in the visual cortex and hippocampus. This work is also very clearly presented and well-written, and the authors conducted comprehensive simulations testing multiple hypotheses. Overall, I believe this is a well-conducted study providing interesting and novel aspects of the capacity of recurrent spiking neural networks with local synaptic plasticity. 

      Weaknesses: 

      This study is very well-thought-out and theoretically valuable to the neuroscience community, and I think the main weaknesses are in regard to how much biological realism is taken into account. For example, the proposed model assumes that only synapses targeting excitatory neurons are plastic, and uses an equal number of excitatory and inhibitory neurons. 

      We agree with the reviewer. The network shown in the previous manuscript consists of an equal number of excitatory and inhibitory neurons, which seems to lack biological plausibility. Therefore, we first tested whether a biologically plausible scenario would affect learning performance by setting the ratio of excitatory to inhibitory neurons to 80% and 20% (Supplementary Figure 7a; left). Even in such a scenario, the network still showed structured spontaneous activity (Supplementary Figure 7a; center), with transition statistics of replayed events matching the true transition probabilities (Supplementary Figure 7a; right). We then asked whether the model with our plasticity rule applied to all synapses would reproduce the corresponding stochastic transitions. We found that the network can learn transition statistics but only under certain conditions. The network showed only weak replay and failed to reproduce the appropriate transition (Supplementary Fig. 7b) if the inhibitory neurons were no longer driven by the synaptic currents reflecting the stimulus, due to a tight balance of excitatory and inhibitory currents on the inhibitory neurons. We then tested whether the network with all synapses plastic can learn transition statistics if the external inputs project to the inhibitory neurons as well. We found that, when each stimulus pattern activates a non-overlapping subset of neurons, the network does not exhibit the correct stochastic transition of assembly reactivation (Supplementary Fig. 7c). Interestingly, when each neuron's activity is triggered by multiple stimuli and has mixed selectivity, the reactivation reproduced the appropriate stochastic transitions (Supplementary Fig. 7d).

      We have included these new results as new Supplementary Figure 7 and they are explained in ll.215-230 in the revised manuscript.

      The model also assumes Markovian state dynamics while biological systems can depend more on history. This limitation, however, is acknowledged in the Discussion. 

      We have included the following sentence to provide a possible solution to this limitation: “Therefore, to learn higher-order stochastic transitions, recurrent neural networks like ours may need to integrate higher-order inputs with longer time scales.” in ll.557-559 in the revised manuscript. 

      Finally, to simulate spontaneous activity, the authors use a constant input of 0.3 throughout the study. Different amplitudes of constant input may correspond to different internal states, so it will be more convincing if the authors test the model with varying amplitudes of constant inputs. 

      We thank the reviewer for pointing this out. In the revised manuscript, we have tested constant input with three different strengths. If the strength is moderate, the network shows accurate encoding of transition statistics in the spontaneous activity, as we have seen in Fig. 2. We have additionally shown that weaker background input causes spontaneous activity with a lower replay rate, which in turn leads to high variance of the encoded transitions, while stronger inputs make assembly replay transitions more uniform. We have included these new results as new Supplementary Figure 6 and they are explained in ll. 211-214 in the revised manuscript.
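
      The link stated above between a lower replay rate and a higher variance of the measured transitions has a simple sampling component that can be sketched independently of the network. The snippet below is an illustration under stated assumptions, not a reproduction of Supplementary Figure 6: it treats each replay event leaving assembly ‘A’ as an independent draw with a hypothetical true probability of 0.8 of going to ‘B’, and compares the spread of the estimated probability when few versus many replay events are observed. The function name and all numbers are placeholders.

```python
import numpy as np

def estimate_spread(p_true, n_events, n_repeats, rng):
    """Standard deviation of the estimated transition probability
    across repeated experiments with n_events observed transitions each."""
    counts = rng.binomial(n_events, p_true, size=n_repeats)
    return (counts / n_events).std()

rng = np.random.default_rng(1)
# Few replay events (low replay rate) versus many (higher replay rate).
spread_few = estimate_spread(p_true=0.8, n_events=20, n_repeats=5000, rng=rng)
spread_many = estimate_spread(p_true=0.8, n_events=2000, n_repeats=5000, rng=rng)
print(spread_few, spread_many)
```

      For independent draws the spread scales as sqrt(p(1-p)/n), so a hundredfold drop in the number of replay events increases it about tenfold. This covers only the counting noise; any change in the network dynamics themselves under weak input is outside what this sketch shows.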

      Reviewer #3 (Public Review): 

      Summary: 

      Asabuki and Clopath study stochastic sequence learning in recurrent networks of Poisson spiking neurons that obey Dale's law. Inspired by previous modeling studies, they introduce two distinct learning rules, to adapt excitatory-to-excitatory and inhibitory-to-excitatory synaptic connections. Through a series of computer experiments, the authors demonstrate that their networks can learn to generate stochastic sequential patterns, where states correspond to non-overlapping sets of neurons (cell assemblies) and the state-transition conditional probabilities are first-order Markov, i.e., the transition to a given next state only depends on the current state. Finally, the authors use their model to reproduce certain experimental songbird data involving highly-predictable and highly-uncertain transitions between song syllables. 

      Strengths: 

      This is an easy-to-follow, well-written paper, whose results are likely easy to reproduce. The experiments are clear and well-explained. The study of songbird experimental data is a good feature of this paper; finches are classical model animals for understanding sequence learning in the brain. I also liked the study of rapid task-switching, it's a good-to-know type of result that is not very common in sequence learning papers. 

      Weaknesses: 

      While the general subject of this paper is very interesting, I missed a clear main result. The paper focuses on a simple family of sequence learning problems that are well-understood, namely first-order Markov sequences and fully visible (no-hidden-neuron) networks, studied extensively in prior work, including with spiking neurons. Thus, because the main results can be roughly summarized as examples of success, it is not entirely clear what the main point of the authors is.

      We apologize to the reviewer that our main claim was not clear. While various computational studies have suggested possible plasticity mechanisms for embedding evoked activity patterns or their probability structures into spontaneous activity (Litwin-Kumar et al., Nat. Commun. 2014; Asabuki and Fukai, bioRxiv 2023), how transition statistics of the environment are learned in spontaneous activity remains elusive. Furthermore, while several network models have been proposed to learn Markovian dynamics via synaptic plasticity (Brea et al. (2013); Pfister et al. (2004); Kappel et al. (2014)), they have been limited in the sense that the learned network does not show stochastic transitions in a neural state space. For instance, while Kappel et al. demonstrated that STDP in winner-take-all circuits can approximate online learning of hidden Markov models (HMMs), a key distinction from our model is that their neural representations acquire deterministic sequential activations, rather than exhibiting stochastic transitions governing Markovian dynamics. Specifically, in their model, the neural representation of state B would be different in the sequences ABC and CBA, resulting in distinct deterministic representations like ABC and C'B'A', where ‘A’ and ‘A'’ are represented by different neural states (e.g., activations of different cell assemblies). In contrast, our network learns to generate stochastically transitioning cell assemblies that replay Markovian trajectories of spontaneous activity obeying the learned transition probabilities between neural representations of states. For example, starting from reactivation of assembly ‘A’, there may be an 80% probability to transition to assembly ‘B’ and 20% to ‘C’. Although Kappel et al.'s model successfully solves HMMs, their neural representations do not themselves stochastically transition between states according to the learned model.
      Similarly to Kappel et al.'s model, the models proposed in Barber (2002) and Barber and Agakov (2002) learn Markovian statistics, but they learned only static spatiotemporal input patterns, and how assemblies of neurons come to show stochastic transitions in spontaneous activity has remained unclear. In contrast with these models, our model captures the probabilistic neural state trajectories, allowing spontaneous replay of experienced sequences with stochastic dynamics matching the learned environmental statistics.

      We have explained this point in ll.509-533 in the revised manuscript.

      Going into more detail, the first major weakness I see in this paper is the heuristic choice of learning rules. The paper studies Poisson spiking neurons (I return to this point below), for which learning rules can be derived from a statistical objective, typically maximum likelihood. For fully-visible networks, these rules take a simple form, similar in many ways to the E-to-E rule introduced by the authors. This more principled route provides quite a lot of additional understanding on what is to be expected from the learning process. 

      We thank the reviewer for pointing this out. To better demonstrate the function of our plasticity rules, we have included the derivation of the rules of synaptic plasticity in ll. 630-670 in the revised manuscript. Consistent with our claim that excitatory plasticity updates the excitatory synapse to predict output firing rates, we have shown that the corresponding cost function measures the discrepancy between the recurrent prediction and the output firing rate. Similarly, for inhibitory plasticity, we defined the cost function that evaluates the difference between the excitatory and inhibitory potential within each neuron. We showed that the resulting inhibitory plasticity rule updates the inhibitory synapses to maintain the excitation-inhibition balance.

      For instance, should maximum likelihood learning succeed, it is not surprising that the statistics of the training sequence distribution are reproduced. Moreover, given that the networks are fully visible, I think that the maximum likelihood objective is a convex function of the weights, which then gives hope that the learning rule does succeed. And so on. This sort of learning rule has been studied in a series of papers by David Barber and colleagues [refs. 1, 2 below], who applied them to essentially the same problem of reproducing sequence statistics in recurrent fully-visible nets. It seems to me that one key difference is that the authors consider separate E and I populations, and find the need to introduce a balancing I-to-E learning rule. 

      The reviewer’s understanding that inhibitory plasticity to maintain EI balance is one of the critical differences from previous works is correct. However, we believe that the most striking point of our study is that we have shown numerically that predictive plasticity rules enable recurrent networks to learn and replay the assembly activations whose transition statistics match those of the evoked activity. Please see our reply above.

      Because the rules here are heuristic, a number of questions come to mind. Why these rules and not others - especially, as the authors do not discuss in detail how they could be implemented through biophysical mechanisms? When does learning succeed or fail? What is the main point being conveyed, and what is the contribution on top of the work of e.g. Barber, Brea, et al. (2013), or Pfister et al. (2004)? 

      Our proposed plasticity mechanism could be implemented through somatodendritic interactions. Analogous to previous computational works (Senn, Asabuki), our model suggests that somatic responses may encode the stimulus-evoked neural activity states, while dendrites encode predictions based on recurrent dynamics that aim to minimize the discrepancy between somatic and dendritic activity. To directly test this hypothesis, future experimental studies could simultaneously record from both somatic and dendritic compartments to investigate how they encode evoked responses and predictive signals during learning.

      To address the point of the reviewer, we conducted additional simulations to test where the model fails. We found that the model with our plasticity rule applied to all synapses only showed faint replays and failed to replay the appropriate transition (Supplementary Fig. 7b). This result is reasonable because the inhibitory neurons were no longer driven by the synaptic currents reflecting the stimulus, due to a tight balance of excitatory and inhibitory currents on the inhibitory neurons. Our model predicts that mixed selectivity in the inhibitory population is crucial to learn appropriate transition statistics (Supplementary Fig. 7d). Future work should clarify the role of synaptic plasticity on inhibitory neurons, especially plasticity at I-to-I synapses. We have explained this result in the new Supplementary Figure 7 in the revised manuscript.

      The use of a Poisson spiking neuron model is the second major weakness of the study. A chief challenge in much of the cited work is to generate stochastic transitions from recurrent networks of deterministic neurons. The task the authors set out to do is much easier with stochastic neurons; it is reasonable that the network succeeds in reproducing Markovian sequences, given an appropriate learning rule. I believe that the main point comes from mapping abstract Markov states to assemblies of neurons. If I am right, I missed more analyses on this point, for instance on the impact that varying cell assembly size would have on the findings reported by the authors.

      The reviewer’s understanding is correct. Our main point comes from mapping Markov statistics to replays of cell assemblies. In the revised manuscript, we performed additional simulations to ask whether varying the size of the cell assemblies would affect learning. We ran simulations with two different configurations in the task shown in Figure 2. The first configuration used three assemblies with a size ratio of 1:1.5:2. After training, these assemblies exhibited transition statistics that closely matched those of the evoked activity (Supplementary Fig.4a,b). In contrast, the second configuration, which used a size ratio of 1:2:3, showed worse performance compared to the 1:1.5:2 case (Supplementary Fig.4c,d). These results suggest that the model can learn appropriate transition statistics as long as the size ratio of the assemblies is not drastically varied.
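
      As an illustration of the comparison described above, the transition statistics of a replay can be estimated by counting transitions between successively active assemblies. The following is a minimal sketch, not the analysis code of the manuscript; the function name, the integer assembly labels, and the choice to ignore dwell time within an assembly are assumptions made for this example.

```python
def transition_matrix(states, n_states):
    """Estimate transition probabilities from a sequence of assembly labels.

    `states` is a list of integer labels (0 .. n_states - 1), one per time bin.
    Consecutive repeats of a label are treated as dwell time in the same
    assembly and are not counted as transitions (an assumption of this sketch).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        if a != b:
            counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed transitions are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical replay sequence over three assemblies.
replay = [0, 0, 1, 1, 2, 0, 1, 2, 2, 0, 2]
estimated = transition_matrix(replay, 3)
```

      Each row of `estimated` can then be compared with the corresponding row of the transition matrix used to generate the stimuli, for example by the mean absolute difference.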

      Finally, it was not entirely clear to me what the main fundamental point in the HVC data section was. Can the findings be roughly explained as follows: if we map syllables to cell assemblies, for high-uncertainty syllable-to-syllable transitions, it becomes harder to predict future neural activity? In other words, is the main point that the HVC encodes syllables by cell assemblies? 

      The reviewer's understanding is correct. We wanted to show that if the HVC learns transition statistics as a replay of cell assemblies, a high-uncertainty syllable-to-syllable transition would make predicting future reactivations more difficult, since trial-averaged activities (i.e., poststimulus activities; PSAs) marginalized all possible transitions in the transition diagram.

      (1) Learning in Spiking Neural Assemblies, David Barber, 2002. URL: https://proceedings.neurips.cc/paper/2002/file/619205da514e83f869515c782a328d3c-Paper.pdf  

      (2) Correlated sequence learning in a network of spiking neurons using maximum likelihood, David Barber, Felix Agakov, 2002. URL: http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/barber-agakovTR0149.pdf  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      In more detail: 

      A) Theoretical analysis 

      The plasticity rules in the study are introduced with a vague reference to previous theoretical studies of others. Doing this, one does not provide any formal insight as to why these plasticity rules should enable one to learn to solve the intended task, and whether they are optimal in some respect. This becomes noticeable, especially in the discussion of the importance of inhibitory balance, which does not go into any detail, but rather only states that it is required, both in the results and discussion sections. Another unclarity appears when error-based learning is discussed and compared to Hebbian plasticity, which, as you state, "alone is insufficient to learn transition probabilities". It is not evident how this claim is warranted, nor why error-based plasticity in comparison should be able to perform this (other than referring to the simulation results). Please either clarify formally (or at least intuitively) how plasticity rules result in the mentioned behavior, or alternatively acknowledge explicitly the (current) lack of intuition. 

      The lack of formal discussion is a relevant shortcoming compared to previous research that showed very similar results with formally more rigorous and principled approaches. In particular, Kappel et al derived explicitly how neural networks can learn to sample from HMMs using STDP and winner-take-all dynamics. Even though this study has limitations, the relation with respect to that work should be made very clear; potentially the claims of novelty of some results (sampling) should be adjusted accordingly. See also Yanping Huang, Rajesh PN Rao (NIPS 2014), and possibly other publications. While it might be difficult to formally justify the learning rules post-hoc, it would be very helpful to the field if you very clearly related your work to that of others, where learning rules have been formally justified, and elaborate on the intuition of how the employed rules operate and interact (especially for inhibition). 

      Lastly, while the importance of sampling learned transition probabilities is discussed, the discussion again remains on a vague level, characterized by the lack of references in the relevant paragraphs. Ideally, there should be a proof of concept or a formal understanding of how the learned behaviour makes it possible to solve a problem that is not solved by deterministic networks. Please also incorporate the relation to the literature on neural sampling/planning/RL etc. and substantiate the claims with citations. 

      We have included sentences in ll. 691-696 in the revised manuscript to explain that for Poisson spiking neurons, the derived learning rule is equivalent to the one that minimizes the Kullback-Leibler divergence between the distributions of output firing and the dendritic prediction, in our case, the recurrent prediction (Asabuki and Fukai; 2020). Thus, the rule suggests that the recurrent prediction learns the statistical model of the evoked activity, which in turn allows the network to reproduce the learned transition statistics.

      We have also added a paragraph to discuss the differences between previously published similar models (e.g., Kappel et al.). Please see our response above.
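
      For orientation, the link between Poisson firing and a Kullback-Leibler objective mentioned above can be illustrated with a standard identity. This is a generic textbook relation given under stated assumptions, not the derivation in the revised manuscript; the symbols $\lambda$ (output firing rate in a time bin) and $\mu$ (rate predicted by the recurrent input) are introduced only for this sketch.

```latex
% KL divergence between two Poisson distributions with means \lambda and \mu
D_{\mathrm{KL}}\big(\mathrm{Pois}(\lambda)\,\|\,\mathrm{Pois}(\mu)\big)
  = \lambda \log\frac{\lambda}{\mu} - \lambda + \mu ,
\qquad
\frac{\partial D_{\mathrm{KL}}}{\partial \mu} = \frac{\mu - \lambda}{\mu} .
```

      Gradient descent on $\mu$ therefore moves the prediction toward the output rate, with an update proportional to the error $\lambda - \mu$, which is the general form of an error-based rule.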

      B) Connection to biology 

      The plasticity rules in the study are introduced with a vague reference to previous theoretical studies of others. Please discuss in more detail if these rules (especially the error-based learning rule) could be implemented biologically and how this could be achieved. Are there connections to biologically observed plasticity? E.g., error-based plasticity has been discussed in the original publication by Urbanczik and Senn, or more recently by Mikulasch et al (TINS 2023). The biological plausibility of inhibitory balance has been discussed many times before, e.g. by Vogels and others, and a citation would acknowledge that earlier work. This also leaves the question of how neurons in the songbird experiment could adapt and if the model does capture this well (i.e., do they exhibit E-I balance? etc), which might be discussed as well. 

      Last, please provide some testable experimental predictions. By proposing an interesting experimental prediction, the model could become considerably more relevant to experimentalists. Also, are there potentially alternative models of stochastic sequence learning (e.g., Kappel et al)? How could they be distinguished? (especially, again, why not Hebbian/STDP learning?) 

      We have cited the Vogels paper to acknowledge the earlier work. We have also included additional paragraphs to discuss a possible biologically plausible implementation of our model and how our model differs from similar models proposed previously (e.g., Kappel et al.). Please see our response above.

      Other comments 

      As mentioned, a derivation of recurrent plasticity rules is missing, and parameters are chosen ad-hoc. This leaves the question of how much the results rely on the specific choice of parameters, and how robust they are to perturbations. As a robustness check, please clarify how the duration of the Markov states influences performance. It can be expected that this interacts with the timescale of recurrent connections, so having longer or shorter Markov states, as it would be in reality, should make a difference in learning that should be tested and discussed.

      We thank the reviewer for pointing this out. To address this point, we performed new simulations and asked to what extent the duration of Markov states affects performance. Interestingly, even when the network was trained with input states of half the duration, the distributions of the durations of assembly reactivations remained almost identical to those in the original case (Supplementary Figure 3a). Furthermore, the transition probabilities in the replay were still consistent with the true transition probabilities (Supplementary Figure 3b). We have also included the derivation of our plasticity rule in ll. 630-670 in the revised manuscript. 

      Similarly, inhibitory plasticity operates with the same plasticity timescale parameter as excitatory plasticity, but, as the authors discuss, lags behind excitatory plasticity in simulation as in experiment. Is this required or was the parameter chosen such that this behaviour emerges? Please clarify this in the methods section; moreover, it would be good to test if the same results appear with fast inhibitory plasticity. 

      We have performed a new simulation and showed that even when the learning rate of inhibitory plasticity was larger than that of excitatory plasticity, inhibitory plasticity still occurred on a slower timescale than excitatory plasticity. We have included this result in a new Supplementary Figure 2 in the revised manuscript.

      What is the justification (biologically and theoretically) for the memory trace h and its impact on neural spiking? Is it required for the results or can it be left away? Since this seems to be an important and unconventional component of the model, please discuss it in more detail. 

      In the model, it is assumed that each stimulus presentation drives a specific subset of network neurons with a fixed input strength, which avoids convergence to trivial solutions. Nevertheless, we choose to add this dynamic sigmoid function to facilitate stable replay by regulating neuron activity to prevent saturation. We have explained this point in ll.605-611 in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      I noticed a couple of minor typos: 

      Page 3 "underly"->"underlie" 

      Page 7 "assemblies decreased settled"->"assemblies decreased and settled"

      We have modified the text. We thank the reviewer for their careful review.

      I think Figure 1C is rather confusing and not intuitive. 

      We apologize that Figure 1C was confusing. In the revised figure, we have emphasized the flow of excitatory and inhibitory error for updating synapses.

      Reviewer #3 (Recommendations For The Authors): 

      One possible path to improve the paper would be to establish a relationship between the proposed learning rules and e.g. the ones derived by Barber. 

      When reading the paper, I was left with a number of more detailed questions I omitted from the public review: 

      (1) The authors introduce a dynamic sigmoidal function for excitatory neurons, Eq. 3. This point requires more discussion and analysis. How does this impact the results? 

      In the model, it is assumed that each stimulus presentation drives a specific subset of network neurons with a fixed input strength, which avoids convergence to trivial solutions. Nevertheless, we choose to add this dynamic sigmoid function to facilitate stable replay by regulating neuron activity to prevent saturation. We have explained this point in ll.605-611 in the revised manuscript.

      (2) For Poisson spiking neurons, it would be great to understand what cell assemblies bring (apart from biological realism, i.e., reproducing data where assemblies can be found), compared to self-connected single neurons. For example, how do the results shown in Figure 2 depend on assembly size? 

      We have varied the cell assembly size ratio and show how it affects learning performance in a new Supplementary Figure 4. Please see our reply above.

      (3) The authors focus on modeling spontaneous transitions, corresponding to a highly stochastic generative model (with most transition probabilities far from 1). A complementary question is that of learning to produce a set of stereotypical sequences, with probabilities close to 1. I wondered whether the learning rules and architecture of the model (in particular under the I-to-E rule) would also work in such a scenario. 

      We thank the reviewer for pointing this out. In fact, we had the same question, so we considered a situation in which the setting in Figure 2 includes both cases where the transition matrix is very stochastic (prob=0.5) and near deterministic (prob=0.9).

      (4) An analysis of what controls the time so that the network stays in a certain state would be welcome. 

      We trained the network model in two cases, one with a fast speed of plasticity and one with a slow speed of plasticity. As a result, we found that the duration of assembly becomes longer in the slow learning case than in the fast case. We have included these results as Supplementary Figure 5 in the revised manuscript.

      Regarding the presentation, given that this is a computational modeling paper, I wonder whether *all* the formulas belong in the Methods section. I found myself skipping back and forth to understand what the main text meant, mainly because I missed a few key equations. I understand that this is a style issue that is very much community-dependent, but I think readability would improve drastically if the main model and learning rule equations could be introduced in the main text, as they start being discussed. 

      We thank the reviewer for the suggestion. To cater to a wider audience, we try to explain the principle of the paper without using mathematical formulas as much as possible in the main text.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors aimed to quantify feral pig interactions in eastern Australia to inform disease transmission networks. They used GPS tracking data from 146 feral pigs across multiple locations to construct proximity-based social networks and analyze contact rates within and between pig social units.

      Strengths:

      (1) Addresses a critical knowledge gap in feral pig social dynamics in Australia.

      (2) Uses robust methodology combining GPS tracking and network analysis.

      (3) Provides valuable insights into sex-based and seasonal variations in contact rates.

      (4) Effectively contextualizes findings for disease transmission modeling and management.

      (5) Includes comprehensive ethical approval for animal research.

      (6) Utilizes data from multiple locations across eastern Australia, enhancing generalizability.

      Weaknesses:

      (1) Limited discussion of potential biases from varying sample sizes across populations

      This is a really good comment, and we will address this in the discussion as one of the limitations of the study.

      (2) Some key figures are in supplementary materials rather than the main text.

      We will move some of our supplementary material to the main text as suggested.

      (3) Economic impact figures are from the US rather than Australia-specific data.

      We included the impact figures that are available for Australia (for FDM), and we will include the estimated impact of ASF in Australia in the introduction.

      (4) Rationale for spatial and temporal thresholds for defining contacts could be clearer.

      We will improve the explanation of why we chose the spatial and temporal thresholds based on literature, the size of animals and GPS errors.

      (5) Limited discussion of ethical considerations beyond basic animal ethics approval.

      This research was conducted under an ethics committee's approval for collaring the feral pigs. This research is part of an ongoing pest management activity, and all the ethics approvals have been highlighted in the main manuscript.
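
      To make the contact definition discussed under point (4) concrete, a proximity-based contact can be counted whenever two animals have GPS fixes that fall within a spatial and a temporal threshold of each other. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the tuple layout of a fix, and the threshold values in the example are hypothetical placeholders, since the thresholds actually used are not given in this response.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def count_contacts(fixes_a, fixes_b, max_dist_m, max_dt_s):
    """Count pairs of fixes from two animals that are close in space and time.

    Each fix is a tuple (unix_time_s, latitude_deg, longitude_deg).
    """
    contacts = 0
    for t_a, lat_a, lon_a in fixes_a:
        for t_b, lat_b, lon_b in fixes_b:
            close_in_time = abs(t_a - t_b) <= max_dt_s
            if close_in_time and haversine_m(lat_a, lon_a, lat_b, lon_b) <= max_dist_m:
                contacts += 1
    return contacts


# Hypothetical fixes and placeholder thresholds (5 m, 5 min).
pig_a = [(0, -27.0, 151.0), (1800, -27.0, 151.001)]
pig_b = [(60, -27.0, 151.00001), (1800, -27.01, 151.0)]
n_contacts = count_contacts(pig_a, pig_b, max_dist_m=5.0, max_dt_s=300)
```

      Choosing the spatial threshold requires weighing GPS error against animal size, which is why the rationale requested by the reviewer matters for interpreting the resulting network.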

      The authors largely achieved their aims, with the results supporting their conclusions about the importance of sex and seasonality in feral pig contact networks. This work is likely to have a significant impact on feral pig management and disease control strategies in Australia, providing crucial data for refining disease transmission models.

      Reviewer #2 (Public review):

      Summary:

      The paper attempts to elucidate how feral (wild) pigs cause distortion of the environment in over 54 countries of the world, particularly Australia.

      The paper displays proof that over $120 billion worth of facilities were destroyed annually in the United States of America.

      The authors have tried to infer that the findings of their work were important and possess a convincing strength of evidence.

      Strengths:

      (1) Clearly stating feral (wild) pigs as a problem in the environment.

      (2) Stating how 54 countries were affected by the feral pigs.

      (3) Mentioning how $120 billion was lost in the US, annually, as a result of the activities of the feral pigs.

      (4) Amplifying the fact that 14 species of animals were being driven into extinction by the feral pigs.

      (5) Feral pigs possessing zoonotic abilities.

      (6) Feral pigs acting as reservoirs for endemic diseases like brucellosis and leptospirosis.

      (7) Understanding disease patterns by the social dynamics of feral pig interactions.

      (8) The use of 146 GPS-monitored feral pigs to establish their social interaction among themselves.

      Weaknesses:

      (1) Unclear explanation of the association of either the female or male feral pigs with each other, seasonally.

      This will be better explained in the methods.

      (2) The "abstract paragraph" was not justified.

      We have justified the abstract paragraph as requested by the reviewer.

      (3) Typographical errors in the abstract.

      Typographical errors have been corrected in the Abstract.

      Reviewer #3 (Public review):

      Summary:

      The authors sought to understand social interactions both within and between groups of feral pigs, with the intent of applying their findings to models of disease transmission. The authors analyzed GPS tracking data from across various populations to determine patterns of contact that could support the transmission of a range of zoonotic and livestock diseases. The analysis then focused on the effects of sex, group dynamics, and seasonal changes on contact rates that could be used to base targeted disease control strategies that would prioritize the removal of adult males for reducing intergroup disease transmission.

      Strengths:

      It utilized GPS tracking data from 146 feral pigs over several years, effectively capturing seasonal and spatial variation in the social behaviors of interest. Using proximity-based social network analysis, this work provides a highly resolved snapshot of contact rates and interactions both within and between groups, substantially improving research in wildlife disease transmission. Results were highly useful and provided practical guidance for disease management, showing that control targeted at adult males could reduce intergroup disease transmission, hence providing an approach for the control of zoonotic and livestock diseases.

      Weaknesses:

      Despite their reliability, populations can be skewed by small sample sizes and limited generalizability due to specific environmental and demographic characteristics. Further validation is needed to account for additional environmental factors influencing social dynamics and contact rates

      This is a good point, and we thank the reviewer for pointing out this issue. We will discuss the potential biases due to sample size in our discussion. We agree that environmental factors need to be incorporated and tested for their influence on social dynamics, and this will be added to the discussion, as we plan to expand this research and conduct the analysis to determine whether environmental factors are influencing social dynamics.

    1. Author response:

      Reviewer #1:

      (1) This concern is addressed in the ESM6, and partly in the ESM1. Indeed, many of the concerns raised by the reviewer later are already addressed in the multiple supplementary materials provided, so we kindly ask the reviewer to read them before moving forward into the discussion.

      (2) This concern is reasonable, but its solution is not "extremely easy", as the reviewer states. The reviewer points to the use of captive-based versus non-captive-based sources, singling out maximum lifespan, the main variable that is clearly expected to be systematically biased by the source of the data. Nevertheless, except for the ZIMS database, which includes only captive individuals, and some sources, such as the CNRS databases and EURING, which exclusively include wild populations, the remaining databases, which are where the vast majority of the data were collected from (i.e. Amniotes database, Birds of the World and AnAge), do not make any distinction. This means that they report just the maximum lifespan of the species as known by the authors of such databases' entries, regardless of provenance, which is also not usually made explicit by the database. Therefore, correcting for this would imply checking all the primary sources. Considering that these databases sometimes cite not the primary source but a secondary one, that on several occasions such a source is a specialized book that is not easily accessible, and that these referenced datasets may still not indicate the source of the data, tracing all of this information becomes an arduous task that would even render the usage of the databases themselves useless. We will include some details about the concerns of database usage in the discussion to address this.

      Furthermore, it remains relevant to indicate that what we discuss later about the possible effects of captivity concerns our usage of animals that come from both sources, not the provenance of the literature-extracted data used (i.e. captive or wild maximum lifespan, for example), which is an independent matter. We can test for the first in the next submission, but we could hardly test for the second (as the reviewer seems to be pointing to). In any case, as we never have the same species from both a captive and a wild source, it would be difficult to determine whether the effect tested comes from captivity or from species-specific differences.

      (3) We will add data on the replicability of the glycation measurement in the next manuscript version. The CV for several individuals of different species measured repeatedly is quite low (always below 2%).

      (4) The reviewer's remarks reported here are already addressed in the supplementary material (ESM6), given the lack of space in the main manuscript. We therefore kindly ask the reviewer to read the supplementary material added to the submission. If the editors agree, all or a considerable part of this could be transferred to the main text for clarity, but this would severely extend the length of a text that the reviewer already considered very long.
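
      For reference, the coefficient of variation (CV) mentioned in point (3) is the sample standard deviation of repeated measurements divided by their mean, usually expressed as a percentage. A minimal sketch follows; the function name and the example values are hypothetical and are not data from the study.

```python
import statistics


def coefficient_of_variation(measurements):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(measurements)
    return 100.0 * statistics.stdev(measurements) / mean


# Hypothetical repeated glycation measurements of one individual (arbitrary units).
repeats = [10.0, 10.1, 9.9]
cv_percent = coefficient_of_variation(repeats)
```

      Computing the CV per individual and reporting its range across species would document the replicability of the measurement.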

      Reviewer #2:

      Thanks for spotting this issue with the coefficient, as it is actually a drafting error. It is a remnant of a previous version of the manuscript in which a log-log relation was fitted instead. Previous reviewers raised concerns about the usage of log transformation for glycation, this variable being (theoretically) a proportion variable (to which we argue that it does not behave as such), which they considered should not be log-transformed. After this, we finally decided not to transform this variable. In this line, the transformations of variables were generally decided by preliminary data exploration. In this particular case, both approaches lead to the same conclusion of higher glycation resistance in the species with higher glucose. Nevertheless, we will consider exploring the comparison of different versions for the resubmission.

      About the issue related to handling time, this variable is not available, for the reasons already exposed in the answer to the other reviewer. Moreover, the Kruskal-Wallis test, by its nature, does not determine differences in medians between groups per se, as the reviewer claims, but just differences in rank sums. It can be equivalently used for that purpose when the groups' distributions are similar, but not when they differ, as we see here with a difference in variance. What a significant outcome in a Kruskal-Wallis test tells us, thus, is just that the groups differ (in their rank sums), which here is plausibly caused by the higher variance in the stressed individuals. Even if we conclude that the average is higher in those groups, mere comparisons of averages for groups with very different variances render different interpretations than when homoscedasticity is met, particularly when the distributions of the groups overlap. For example, in a case like this, where the data are bounded below (glucose levels cannot be lower than 0), most of this higher variance is related to many values in the stressed groups lying above all the baseline values. This, of course, would increase the average, but such a parameter would not mean the same as if the distributions did not overlap.

      Regarding the GVIFs, why the values are above 1.6 is not well known, but we do not consider this a major concern, as the values are never above 2.2, a level usually considered more worrying. We will include a brief explanation of this in the results section. Also, we explicitly calculated life history variables adjusted for body mass, which should eliminate their otherwise strong correlation. There exist other biological and interpretational reasons, justified in the ESM6, for using the residuals in the models instead of the raw values, despite previously raised concerns.

      Given the reviewer's assertion that credible intervals should not be used for post hoc comparisons, and since credible intervals are what the whiskers shown in Figure 4B represent, the claim that this graph suggests any difference between groups remains doubtful. New comparisons have now been made with the function HPDinterval() applied to the differences between each diet category calculated from the posterior values of each group, confirming that no significant differences exist.
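
      The comparison described above can be sketched as follows: the posterior samples of two groups are subtracted draw by draw, and the highest posterior density (HPD) interval of the differences is checked for whether it contains zero. This is a minimal Python sketch of the general idea, not the authors' code; it mimics the shortest-interval logic of an HPD computation, and the function names and the 95% level are assumptions of this example.

```python
def hpd_interval(samples, prob=0.95):
    """Shortest interval containing roughly `prob` of the posterior samples."""
    vals = sorted(samples)
    n = len(vals)
    # Number of steps between the lower and upper index of the interval.
    gap = max(1, min(n - 1, round(n * prob)))
    widths = [vals[i + gap] - vals[i] for i in range(n - gap)]
    start = widths.index(min(widths))
    return vals[start], vals[start + gap]


def groups_differ(posterior_a, posterior_b, prob=0.95):
    """True if the HPD interval of the paired differences excludes zero."""
    diffs = [a - b for a, b in zip(posterior_a, posterior_b)]
    lower, upper = hpd_interval(diffs, prob)
    return not (lower <= 0.0 <= upper)
```

      Here `posterior_a` and `posterior_b` stand for the per-iteration posterior values of two diet categories from the same fitted model, so that the draws can be paired.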

      We do not understand the suggestion made in relation to the model shown in Table 2. Removing glucose from the model could have two results, as the reviewer indicates: 1. Maximum lifespan (ML) relates with glycation, potentially spuriously through the effect of glucose (in this case not included) on both; 2. ML does not relate to glycation, and therefore "high glycation levels do not preclude the evolution of long lifespans", which is what we are already showing with the current model, which also controls for glucose, in an attempt to determine if not just raw glycation values, but glycation resistance, relates to longevity. This is intended to assess if long-lived species may show mechanisms that avoid glycation, by showing levels lower than expected for a non-enzymatic reaction.

    1. Author response:

      In this manuscript, we have addressed one of the possible modes of recruitment of Swi6 to the putative heterochromatin loci.

      Our investigation was guided by earlier work showing the ability of HP1a to bind to a class of RNAs and the role of this binding in recruitment of HP1a to heterochromatin loci in mouse cells (Muchardt et al). While there has been no clarity about the mechanism of Swi6 recruitment given the multiple pathways being involved, the issue is compounded by the overall lack of understanding as to how Swi6 recruitment occurs only at the repeat regions. At the same time, various observations suggested a causal role of RNAi in Swi6 recruitment.

      Thus, guided by the work of Muchardt et al we developed a heuristic approach to explore a possibly direct link between Swi6 and heterochromatin through RNAi pathway. Interestingly, we found that the lysine triplet found in the hinge domain in HP1, which influences its recruitment to heterochromatin in mouse cells, is also present in the hinge domain of Swi6, although we were cautious, keeping in mind the findings of Keller et al showing another role of Swi6 in binding to RNAs and channeling them to the exosome pathway. 

      Accordingly, we envisaged that a mode of recruitment of Swi6 through binding to siRNAs to cognate sites in the dg-dh repeats shared among mating type, centromere and telomere loci could explain specific recruitment as well as inheritance following DNA replication. Accordingly, we framed the main questions as follows: i) whether Swi6 binds specifically and with high affinity to the siRNAs and the cognate siRNA-DNA hybrids and whether the Swi63K-3A mutant is defective in this binding, ii) whether this lack of binding of Swi63K-3A affects its localization to heterochromatin, iii) whether this specificity is validated by binding of Swi6 but not Swi63K-3A to siRNAs and siRNA-DNA hybrids in vivo and iv) whether the binding mode was qualitatively and quantitatively different from that of Cen100 RNA or random RNAs, like GFP RNA.

      We think that our data provides answers to these lines of inquiry to support a model wherein the Swi6-siRNA mediated recruitment can explain a cis-controlled nucleation of heterochromatin at the cognate sites in the genome. We have also partially addressed the points raised by the study by Keller et al by invoking a dynamic balance between different modes of binding of Swi6 to different classes of RNA to exercise heterochromatin formation by Swi6 under normal conditions and RNA degradation under other conditions.

      While we stand by our hypothesis, we do acknowledge the need for more detailed investigation, both to buttress our hypothesis and to address the dynamics of siRNA binding and recruitment of Swi6 and how Swi6 functions fit in the context of other components of heterochromatin assembly, like the HDACs and Clr4 on one hand and the exosome pathway on the other. Our future studies will attempt to address these issues.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript explores the RNA binding activities of the fission yeast Swi6 (HP1) protein and proposes a new role for Swi6 in RNAi-mediated heterochromatin establishment. The authors claim that Swi6 has a specific and high affinity for short interfering RNAs (siRNAs) and recruits the Clr4 (Suv39h) H3K9 methyltransferases to siRNA-DNA hybrids to initiate heterochromatin formation. These claims are not in any way supported by the incomplete and preliminary RNA binding or the in vivo experiments that the authors present. The proposed model also lacks any mechanistic basis as it remains unclear (and unexplored) how Swi6 might bind to specific small RNA sequences or RNA-DNA hybrids. Work by several other groups in the field has led to a model in which siRNAs produced by the RNAi pathway load onto the Ago1-containing RITS complex, which then binds to nascent transcripts at pericentromeric DNA repeats and recruits Clr4 to initiate heterochromatin formation. Swi6 facilitates this process by promoting the recruitment of the RNA-dependent RNA polymerase leading to siRNA amplification.

      Weaknesses:

      (1) a) The claims that Swi6 binds to specific small RNAs or to RNA-DNA hybrids are not supported by the evidence that the authors present. Their experiments do not rule out non-specific charged-based interactions.

      We disagree. We have used synthetic siRNAs of 20-22 nt length to do EMSA assay, as mentioned in the manuscript. Further, we have sequenced the small RNAs obtained after RIP experiments to validate the enrichment of siRNA in Swi6 bound fraction as compared to the mutant Swi6-bound fraction. These results are internally consistent regardless of the mode of binding. In any case the binding occurs primarily through the chromodomain although it is influenced by the hinge domain (see below).

      Furthermore, we have carried out EMSA experiments using Swi6 mutants carrying all three possible double mutations of the K residues in the KKK triplet and found that there was no difference in the binding pattern as compared to the wt Swi6: only the triple mutant “3K-3A” showed the effect. These results suggest that the binding is not completely dependent on the basic residues. These results will be included in the revised version.

      We also have some preliminary data from a SAXS study showing that the CD of wt Swi6 changes its structure upon binding to the siRNA, while the “3K-3A” mutant of Swi6 has a compact, folded structure that occludes the binding site of Swi6 in the chromodomain. We propose to mention this preliminary finding in the revised version as unpublished data.

      b) Claims about different affinities of Swi6 for RNAs of different sizes are based on a comparison of KD values derived by the authors for a handful of S. pombe siRNAs with previous studies from the Buhler lab on Swi6 RNA binding. The authors need to compare binding affinities under identical conditions in their assays.

      Thus, the EMSA data do suggest sequence specificity in binding of Swi6 to specific siRNA sequences (Figure S5) and imply that specific residues in Swi6 are responsible for it. Identification of the residues in the Swi6 CD involved in siRNA binding would therefore be of definite interest, as would experimental confirmation of the consensus siRNA sequence. It may however be noted that, whereas the binding of Swi6 to siRNAs occurs through the CD, that of Cen100 or GFP RNA was shown to be through the hinge domain by Keller et al.

      The estimation of Kd by the Buhler group was based on an NMR study, which we are not in a position to perform in the near future. Nonetheless, we did carry out an EMSA study using the ‘Cen100’ RNA, the same as the one used in the Keller et al study. Surprisingly, in contrast with the EMSA result in agarose gel showing binding of Swi6 to “Cen100” RNA as reported by Keller et al, we failed to observe any binding in EMSA done in acrylamide gel. (The same is true of the RevCen 100.) While this raises the question of why Keller et al chose to do EMSA in agarose gel instead of the conventional approach of using acrylamide gel, it does lend support to our claim of stronger binding of Swi6 to siRNAs. Another relevant observation is the lack of binding of Swi6 to the “RevCen” precursor RNA but a detectable binding to the siRNAs denoted as VI-IX (as measured by competition experiments; Figure S4 and S7), which are derived by Dcr1 cleavage of the “RevCen” RNA.

      We also disagree that we carried out EMSA with a small bunch of siRNAs. As indicated in Figure 1 and S1, we synthesized nearly 12 siRNAs representing the dg-dh repeats at Cen, mat and tel loci and measured their specificity of binding to Swi6 using EMSA assay by labeling the ones labelled “D”, “E” and “V” directly and those of the remaining ones by the latter’s ability to compete against the binding (Figure 1, S4). These results point to presence of a consensus sequence in siRNAs that shows highly specific and strong binding to Swi6 in the low micromolar range.

      Further, our claim of binding of Swi6 and not Swi63K>3A to siRNA in vivo is validated by RIP experiments, as shown in Fig 2 and S9.

      c) The regions of Swi6 that bind to siRNAs need to be identified and evidence must be provided that Swi6 binds to RNAs of a specific length, 20-22 mers, to support the claim that Swi6 binds to siRNAs. This is critical for all the subsequent experiments and claims in the study.

      We have provided in vitro data, which are validated in vivo by RIP experiments, as mentioned above. We agree that it would be very interesting to identify the residues in the Swi6 chromodomain responsible for binding to siRNA. However, such an investigation is beyond the scope of the present study.

      (2) a) The in vivo results do not validate Swi6 binding to specific RNAs, as stated by the authors. Swi6 pulldowns have been shown to be enriched for all heterochromatic proteins including the RITS complex. The sRNA binding observed by the authors is therefore likely to be mediated by Ago1/RITS.

      We disagree with the first comment. Our RIP experiments do validate the in vitro results (Fig 1, 2, S4 and S9), as argued above. The observation alluded to by the reviewer “Swi6 pulldowns have been shown to be enriched for all heterochromatic proteins including the RITS complex” is not inconsistent with our observation; it is possible that the siRNA may be released from the RITS complex and transferred to Swi6, possibly due to its higher affinity.

      Thus, we would like to suggest that the role of Swi6 is likely to be coincidental or subsequent to that of Ago1/RITS (see below). We think that the binding by Swi6 to the siRNA could also be carried out in cis at the level of siRNA-DNA hybrids.

      This point needs to be addressed in future studies.

      b) Most of the binding in Figure S8C seems to be non-specific.

      We would like to point out that the result in Figure S8C needs to be examined together with the Figure S8B, which shows RNA bound by Swi6 but not Swi63K-3A to hybridize with dg, dh and dh-k probes.

      c) In Figure S8D, the authors' data shows that Swi6 deletion does not derepress the rev dh transcript while dcr1 delete cells do, which is consistent with previous reports but does not relate to the authors' conclusions.

      The purpose of results shown in Figure S8D is just to compare the results of Swi6 with that of Swi63K-3A.

      d) Previous results have shown that swi6 delete cells have 20-fold fewer dg and dh siRNAs than swi6+ cells due to decreased RNA-dependent RNA polymerase complex recruitment and reduced siRNA amplification.

      This result is consistent with our results invoking a role of Swi6 in binding to, protecting and recruiting siRNAs to homologous sites.

      To find out whether the overall production of siRNA is compromised in the swi6 3K->3A mutant, we i) calculated the RIP-Seq read counts for swi6 3K->3A, swi6+ and vector control in 200 bp genomic bins, ii) divided the swi6 3K->3A and swi6+ signals by that of the control, iii) removed the background using the criterion of signal value < 25% of max signal, and iv) counted the total reads (in excess of the control) in all peak regions in both samples. This revealed a total count of 10878 and 8994 for the swi6 3K->3A and swi6+ samples, respectively, possibly implying that the overall siRNA production is not compromised in the swi6 3K->3A mutant.
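
      The four steps above can be sketched as follows. This is only an illustrative reading of the description, not the authors' pipeline: the function names, the pseudocount used to avoid division by zero, and the interpretation of "reads in excess of the control" as the per-bin difference between sample and control counts are all assumptions.

```python
def bin_counts(read_positions, genome_length, bin_size=200):
    """Step i: count reads per fixed-width genomic bin (200 bp by default)."""
    n_bins = (genome_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_positions:
        counts[pos // bin_size] += 1
    return counts


def excess_reads_in_peaks(sample, control, background_fraction=0.25, pseudocount=1.0):
    """Steps ii-iv on per-bin counts.

    ii)  divide the sample signal by the control signal
         (pseudocount is an assumption, added to avoid division by zero);
    iii) treat bins whose ratio is below 25% of the maximum ratio as background;
    iv)  sum the reads in excess of the control over the remaining (peak) bins.
    """
    ratios = [s / (c + pseudocount) for s, c in zip(sample, control)]
    threshold = background_fraction * max(ratios)
    total = 0
    for s, c, r in zip(sample, control, ratios):
        if r >= threshold:
            total += max(s - c, 0)
    return total


# Hypothetical toy counts for four bins (not data from the study).
toy_sample = [10, 100, 5, 80]
toy_control = [9, 9, 4, 19]
toy_total = excess_reads_in_peaks(toy_sample, toy_control)
```

      Applying the same function to the mutant and wild-type tracks against the same control yields two totals that can be compared directly, which is the kind of comparison reported above.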

      (3) a) The RIP-seq data are difficult to interpret as presented. The size distribution of bound small RNAs, and where they map along the genome should be shown as for example presented in previous Ago1 sRNA-seq experiments.

      Please see the response to 2(d).

      b) It is also unclear whether the defects in sRNA binding observed by the authors represent direct sRNA binding to Swi6 or co-precipitation of Ago1-bound sRNAs.

      The correspondence between our in vivo and in vitro results suggests that the binding to Swi6 would be direct. We do not observe a complete correspondence between the Swi6- and Ago-bound siRNAs. We think Swi6 binding may be coincident with or following RITS complex formation.

      This point will be discussed in the Revision.

      The authors should also sequence total sRNAs to test whether Swi6-3A affects sRNA synthesis, as is the case in swi6 delete cells.

      Please see response to 2(d) above.

      (4) The authors examine the effects of Swi6-3A mutant by overexpression from the strong nmt1 promoter. Heterochromatin formation is sensitive to the dosage of Swi6. These experiments should be performed by introducing the 3A mutations at the endogenous Swi6 locus and effects on Swi6 protein levels should be tested.

      Although we agree, we think that heterochromatin formation occurs in the presence of nmt1-driven Swi6 but not Swi63K>3A, as indicated by the phenotype and Swi6 enrichment at otr1R::ade6, imr1::ura4 and his3-telo (Figure 3) and mating type (Fig. S10). Furthermore, both GFP-Swi6 and GFP-Swi63K>3A are expressed at similar levels (Fig. S8A).

      (5) The authors' data indicate an impairment of silencing in Swi6-3A mutant cells but whether this is due to a general lower affinity for nucleosomes, DNA, RNA, or as claimed by the authors, siRNAs is unclear. These experiments are consistent with previous findings suggesting an important role for basic residues in the HP1 hinge region in gene silencing but do not reveal how the hinge region enhances silencing.

      Our study aims to correlate the binding of Swi6 but not Swi63K-3A to siRNA with its localization to heterochromatin. A similar difference in binding of Swi6 but not Swi63K-3A to siRNA-DNA hybrid, together with sensitivity of silencing and Swi6 localization to heterochromatin to RNaseH support the above correlations as being causally connected.

      In terms of mechanism of binding, we need to clarify that the primary mode of binding is through the CD and not the hinge domain, although the hinge domain does influence this binding. This result is different from those of Keller et al.

      We have some structural data based on a preliminary SAXS experiment supporting binding of siRNA to the CD and the influence of the hinge domain on this binding. However, this line of investigation needs to be extended and will be the subject of future investigations.

      (6) RNase H1 overexpression may affect Swi6 localization and silencing indirectly as it would lead to a general reduction in R loops and RNA-DNA hybrids across the genome. RNaseH1 OE may also release chromatin-bound RNAs that act as scaffolds for siRNA-Ag1/RITS complexes that recruit Clr4 and ultimately Swi6.

      These are formal possibilities. However, the correlation between swi6 binding to siRNA-DNA hybrid and delocalization upon RNase H1 treatment argues for a more direct link.

      (7) Examples of inaccurate presentation of the literature.

      a) The authors state that "RNA binding by the murine HP1 through its hinge domains is required for heterochromatin assembly (Muchardt et al, 2002)." The cited reference provides no evidence that HP1 RNA binding is required for heterochromatin assembly. Only the hinge region of bacterially produced HP1 contributes to its localization to DAPI-stained heterochromatic regions in fixed NIH 3T3 cells.

      Noted. Statement will be corrected.

      b) "... This scenario is consistent with the loss of heterochromatin recruitment of Swi6 as well as siRNA generation in rnai mutants (Volpe et al, 2002)." Volpe et al. did not examine changes in siRNA levels in swi6 mutant cells. In fact, no siRNA analysis of any kind was reported in Volpe et al., 2002.

      Correct. We only say that Swi6 recruitment is reduced in rnai mutants and correlate it with the ability of Swi6 to bind to siRNA generated by RNAi and subsequently to the siRNA-DNA hybrid.

      Reviewer #2 (Public review):

      The aim of this study is to investigate the role of Swi6 binding to RNA in heterochromatin assembly in fission yeast. Using in vitro protein-RNA binding assays (EMSA) they showed that Swi6/HP1 binds centromere-derived siRNA (identified by Reinhardt and Bartel in 2002) via the chromodomain and hinge domains. They demonstrate that this binding is regulated by a lysine triplet in the conserved region of the Swi6 hinge domain and that wild-type Swi6 favours binding to DNA-RNA hybrids and siRNA, which then facilitates, rather than competes with, binding to H3K9me2 and to a lesser extent H3K9me3.

      However, the majority of the experiments are carried out in swi6 null cells overexpressing wild-type Swi6 or Swi63K-3A mutant from a very strong promoter (nmt1). Both swi6 null cells and overexpression of Swi6 are well known to exhibit phenotypes, some of which interfere with heterochromatin assembly. This is not made clear in the text.

      We think that the argument is not valid as we show that swi6 but not Swi63K-3A could restore silencing at imr1::ura4, otr1::ade6 and his3-telo (Fig 3) and mating type (Fig. S10), when transformed into a swi6D strain.

      Whilst the RNA binding experiments show that Swi6 can indeed bind RNA and that binding is decreased by Swi63K-3A mutation in vitro (confusingly, they only much later in the text explained that these 3 bands represent differential binding and that II is likely an isotherm). The gels showing these data are of poor quality and it is unclear which bands are used to calculate the Kd.

      We disagree with the comment about the quality of EMSA data. We think it is of similar quality or better than that of Keller et al, except in some cases, like Fig 1D, a shorter exposure shown to distinguish the slowest shifted band has caused the remaining bands to look fainter.

      RNA-seq data shows that overall fewer siRNAs are produced from regions of heterochromatin in the Swi63K-3A mutant so it is unsurprising that analysis of siRNA-associated motifs also shows lower enrichment (or indeed that they share some similarities, given that they originate from repeat regions).

      Please see response to comment 2(d) of the first reviewer above.

      It is not clear which bands are being alluded to. However, we'll rectify any gaps in information in the revision.

      The experiments are seemingly linked yet fail to substantiate their overall conclusions. For instance, the authors show that the Swi63K-3A mutant displays reduced siRNA binding in vitro (Figure 1D) and that H3K9me2 levels at heterochromatin loci are reduced in vivo (Figure 3C-D). They conclude that Swi6 siRNA binding is important for Swi6 heterochromatin localization, whilst it remains entirely possible that heterochromatin integrity is impaired by the Swi63K-3A mutation and hence fewer siRNAs are produced and available to bind. Their interpretation of the data is really confusing.

      Our argument is that the lack of binding by Swi63K>3A to siRNA can explain the loss of recruitment to heterochromatin loci and thus affect the integrity of heterochromatin; the recruitment of Swi6 can occur possibly by binding initially to siRNA and thereafter as siRNA-DNA hybrid. However, the overall level of siRNAs is not affected, as in 2(d) above. This interpretation is supported by the results of the ChIP assay and confocal experiments, as also by the effect of RNaseH1 on the recruitment of Swi6.

      The authors go on to show that Swi63K-3A cells have impaired silencing at all regions tested and the mutant protein itself has less association with regions of heterochromatin. They perform DNA-RNA hybrid IPs and show that Swi63K-3A cells which also overexpress RNAseH/rnh1 have reduced levels of dh DNA-RNA hybrids than wild-type Swi6 cells. They interpret this to mean that Swi6 binds and protects DNA-RNA hybrids, presumably to facilitate binding to H3K9me2. The final piece of data is an EMSA assay showing that "high-affinity binding of Swi6 to a dg-dh specific RNA/DNA hybrid facilitates the binding to Me2-K9-H3 rather than competing against it." This EMSA gel shown is of very poor quality, and this casts doubt on their overall conclusion.

      We do agree with the reviewer about the quality of EMSA (Fig. 5B). However, as may be noticed in the EMSA for siRNA-DNA hybrid binding  (Fig 4A), the bands of Swi6-bound siRNA-DNA hybrid are extremely retarded. Hence the EMSA for subsequent binding by H3-K9-Me peptides required a longer electrophoretic run, which led to reduction in the sharpness of the bands. Nevertheless, the data does indicate binding efficiency in the order H3K9-Me2> H3-K9-Me3 > H3-K9-Me0. Having said that, we plan to repeat the EMSA or address the question by other methods, like SPR.

      Unfortunately, the manuscript is generally poorly written and difficult to comprehend. The experimental setups and interpretations of the data are not fully explained, or, are explained in the wrong order leading to a lack of clarity. An example of this is the reasoning behind the use of the cid14 mutant which is not explained until the discussion of Figure 5C, but it is utilised at the outset in Figure 5A.

      We tend to agree somewhat and will attempt to submit a revised version with greater clarity, as also the explanation of experiment with cid14D strain.

      Another example of this lack of clarity/confusion is that the abstract states "Here we provide evidence in support of RNAi-independent recruitment of Swi6". Yet it then states "We show that...Swi6/HP1 displays a hierarchy of increasing binding affinity through its chromodomain to the siRNAs corresponding to specific dg-dh repeats, and even stronger binding to the cognate siRNA-DNA hybrids than to the siRNA precursors or general RNAs." RNAi is required to produce siRNAs, so their message is very unclear. Moreover, an entire section is titled "Heterochromatin recruitment of Swi6-HP1 depends on siRNA generation" so what is the author's message?

      The reviewer has correctly pointed out the error. Indeed, our results actually indicate an RNAi-dependent rather than independent mode of recruitment. Rather, we would like to suggest an H3-K9-Me2-independent recruitment of Swi6. We will rectify this error in our revised manuscript.

      The data presented, whilst sound in some parts is generally overinterpreted and does not fully support the author's confusing conclusions. The authors essentially characterise an overexpressed Swi6 mutant protein with a few other experiments on the side, that do not entirely support their conclusions. They make the point several times that the KD for their binding experiments is far higher than that previously reported (Keller et al Mol Cell 2012) but unfortunately the data provided here are of an inferior quality and thus their conclusions are neither fully supported nor convincing.

      We have used the method of Heffler et al (2012) to compute the Kd from EMSA data.
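
      For readers unfamiliar with how a Kd is extracted from EMSA band intensities, a generic sketch follows. It is not the cited method, whose details are not given here: it assumes a simple single-site binding isotherm, fraction bound = [P] / (Kd + [P]), with protein in excess over labelled RNA, and fits Kd by a log-spaced grid search. All names, concentrations and units are hypothetical.

```python
def fraction_bound(protein_conc, kd):
    """Single-site binding isotherm (assumed model): theta = [P] / (Kd + [P])."""
    return protein_conc / (kd + protein_conc)


def fit_kd(protein_concs, fractions, kd_min=1e-3, kd_max=1e3, steps=6001):
    """Least-squares fit of Kd by scanning a log-spaced grid of candidate values."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps):
        kd = kd_min * (kd_max / kd_min) ** (i / (steps - 1))
        sse = sum(
            (fraction_bound(p, kd) - f) ** 2
            for p, f in zip(protein_concs, fractions)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Hypothetical titration (arbitrary concentration units), generated from Kd = 2.0
# so that the fit can be checked against a known answer.
concs = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
observed = [fraction_bound(c, 2.0) for c in concs]
kd_estimate = fit_kd(concs, observed)
```

      In practice the fractions would come from quantified shifted and free bands, and stating which shifted bands enter that quantification would address the reviewer's question about how the Kd was calculated.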

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      (1) This work investigates numerically the propagation of subthreshold waves in a model neural network that is derived from the C. elegans connectome. Using a scattering formalism and tight-binding description of the network -- approximations which are commonplace in condensed matter physics -- this work attempts to show the relevance of interference phenomena, such as wavenumber-dependent propagation, for the dynamics of subthreshold waves propagating in a network of electrical synapses.

      (2) The primary strength of the work is in trying to use theoretical tools from a far-away corner of fundamental physics to shed light on the properties of a real neural system. While a system composed of neurons and synapses is classical in nature, there are occasions in which interference or localization effects are useful for understanding wave propagation in complex media [review, van Rossum & Nieuwenhuizen, 1999]. However, it is expected that localization effects only have an impact in some parameter regimes and with low phase dissipation. The authors should have addressed the existence of this validity regime in detail prior to assuming that interference effects are important.

      The theoretical concept and tool used in this study are not situated in a far-away corner of fundamental physics but hold one of the central positions in condensed matter physics and statistical physics. In fact, the non-scientific statement about where the theoretical concept and tool employed by the researchers are positioned within the realm of fundamental physics is irrelevant. The fundamental physics governs the foundations of all natural phenomena, and thus it provides indispensable principles for interpreting not only neural systems but also all life phenomena. One such principle explored in our study is the interference and localization of waves.

      Specifically, in the third paragraph of the Introduction, we introduced that the interference effect of subthreshold oscillating waves, beyond being a theoretical possibility, is a phenomenon actually observed in neural tissue (Chiang and Durand, 2023; Gupta et al., 2016). Moreover, according to Devor and Yarom (2002), the propagation of subthreshold oscillations observed in the inferior olivary nucleus extended beyond a distance of 0.2 mm. Therefore, considering the propagation of subthreshold waves and the resulting interference in the connectome of C. elegans, which has a total body length of less than 1 mm, a diameter of about 0.08 mm, and most neurons distributed in the ring structure near its neck, provides sufficient validity for the initiation of theoretical and computational studies.

      The primary objective of our study is to investigate which regimes of signal transmission/localization and interference phenomena are valid within the network of electrical synapses in C. elegans, the only system for which the neural connectome structure is perfectly known. As the Reviewer rightly pointed out in the question, this is exactly the issue that the Reviewer is curious about. Therefore, the existence of this validity regime cannot be addressed prior to conducting the study but can only be identified as a result of performing the research. And we have conducted such a study.

      (3) An additional approximation that was made without adequate justification is the use of a tight-binding Hamiltonian. This can be a reasonable approximation, even for classical waves, in particular in the presence of high-quality-factor resonators, where most of the wave amplitude is concentrated on the nodes of the network, and nodes are coupled evanescently with each other. Neither of these conditions were verified for this study.

      The tight-binding Anderson Hamiltonian we used in this study originally consisted of the on-site energy at each node and the hopping matrix between nodes. When the on-site energy is relatively much more stable (i.e., has a large negative value) compared to the hopping matrix, most of the wave amplitude becomes concentrated on the nodes as the Reviewer mentioned. However, as is well-known from reference papers (Anderson, 1958; Chang et al., 1995; Meir et al., 1989; Shapir et al., 1982; Thomas and Nakanishi, 2016), in this study, we also removed the on-site energy to prevent the waves from being concentrated on the nodes. Therefore, the tight-binding Hamiltonian we used in this study ensures that waves propagate through edges in the network where the values of the hopping matrix exist.

      To assist the Reviewer in better understanding the model used in this study, we provide additional explanations as follows. In the manuscript, we have already provided detailed descriptions of the setup using the tight-binding Anderson Hamiltonian in the Method section under “Construction of our circuit model” and the explanation of Figure 1. In the model we used, the edges represented by solid lines are perfect conductors, while the dotted lines representing gap junctions act as potential barriers (Fig. 1B). Therefore, when electric signals propagate, we are dealing with the phenomenon where signals transmitted through the edges encounter potential barriers, causing scattering or attenuation. The model described by the Reviewer is indeed a commonly used model in condensed matter physics, but we did not use the exact model mentioned by the Reviewer. Instead, as is common in well-known reference papers, we modified it to suit our purposes. We hope this explanation helps the Reviewer gain a better understanding.
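
      A minimal sketch of a tight-binding Hamiltonian with the on-site energy removed is given below for illustration. It is not the model code used in the study: the ring geometry, the uniform hopping value and the function names are assumptions, chosen only because a plane wave on a ring gives an exact, checkable result (eigenvalue 2t cos k), which makes the wavenumber dependence explicit.

```python
import cmath
import math


def tight_binding_hamiltonian(n_nodes, edges):
    """Build H with zero on-site energy; edges is an iterable of (i, j, hopping)."""
    H = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, t in edges:
        H[i][j] = t  # hopping exists only where an edge exists
        H[j][i] = t  # Hermitian (here real symmetric)
    return H


def apply(H, psi):
    """Matrix-vector product H @ psi without external libraries."""
    return [sum(h * p for h, p in zip(row, psi)) for row in H]


# Hypothetical example: a ring of 8 nodes with uniform hopping t = 1.
n, t = 8, 1.0
ring_edges = [(j, (j + 1) % n, t) for j in range(n)]
H = tight_binding_hamiltonian(n, ring_edges)

# A plane wave with wavenumber k = 2*pi*m/n is an eigenstate of the ring.
m = 1
k = 2 * math.pi * m / n
psi = [cmath.exp(1j * k * j) for j in range(n)]
H_psi = apply(H, psi)
energy = 2 * t * math.cos(k)  # expected eigenvalue for this wavenumber
```

      With the diagonal set to zero, the wave amplitude is carried entirely by the hopping terms, which corresponds to the statement above that waves propagate along the edges where the hopping matrix has nonzero entries.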

      (4) The motivation for this work is to understand the basic mechanisms underlying subthreshold intrinsic oscillations in the inferior olive, but detailed connectivity patterns in this brain area are not available. The connectome is known for C elegans, but sub-threshold oscillations have not been observed there, and the implications of this work for C elegans neuroscience remain unclear. The authors should also give more evidence for the claim that their study may give a mechanism for synchronized rhythmic activity in the mammalian inferior olive nucleus, or refrain from making this conclusion.

      We agree with the Reviewer's point. In this study, we do not provide additional analysis on the mammalian inferior olive nucleus beyond what is already known from previous research. What we intended to discuss in the Discussion section was to suggest that within our model, there is a “possibility” that a group of cells exchanging wave signals of a specific wavenumber with high transmittance may show synchronized rhythmic activity. Therefore, to avoid any misunderstanding for the reader, we have revised the corresponding sentence in the Discussion as follows.

      In the Discussion, “The plausible possibility according to our model study is that the constructive interference of subthreshold membrane potential waves with a specific wavenumber may generate the synchronized rhythmic activation.”

      (5) In the same vein, since the work emphasizes the dependence on the wavenumber for the propagation of subthreshold oscillations, they should make an attempt at estimating the wavenumber of subthreshold oscillations in C elegans if they were to exist and be observed. Next, the presence of two "mobility edges" in the transmission coefficient calculated in this work is unmistakably due to the discrete nature of the system, coming from the tight-binding approximation, and it is unclear if this approximation is justified in the current system.

      In this study, we modeled the propagation of subthreshold waves on the electrical synapse network of C. elegans, but we did not explain the generation of subthreshold oscillations themselves. Here, we simply injected wave signals with various wavenumber values into the network using a hypothetical device called an "Injector." As the Reviewer pointed out, estimating the wavenumbers of subthreshold oscillations that may exist or be observed in C. elegans would require a comprehensive investigation of the membrane potential dynamics occurring in the membranes of individual neurons. However, this is beyond the scope of this study and would require considerable effort to accomplish.

      As for the use of the tight-binding Hamiltonian, we have addressed that in our response to the third paragraph in the Joint Public Review above.

      (6) Similarly, it is possible that the wavenumber-dependent transmission observed depends strongly on the addition of a large number of virtual nodes (VNs) in the network, which the authors give little to no motivation for. As these nodes are not present in the C elegans connectome, the authors should explain the motivation for their inclusion in the model and should discuss their consequences on the transmission properties of the network.

      As mentioned in our response to the third paragraph in the Joint Public Review above, in our model, a node is simply a pathway for waves to pass through. Therefore, inserting virtual nodes between two neurons that are connected in the C. elegans connectome does not alter the actual connection structure. In other words, virtual nodes do not create new connections between cells that didn’t exist in the connectome. The virtual nodes we introduced are merely a way to divide the sections—axon, gap junction, dendrite—through which the wave passes when it is transmitted between two neurons. As we have already explained in Fig. 1B, the edge connected by two virtual nodes, represented by a dotted line, is motivated to depict the gap junction acting as a potential barrier. We hope this explanation helps the Reviewer better understand the model used in this study.
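
      The role of the virtual nodes can be illustrated with a short sketch. It assumes, following the description above, that each neuron-to-neuron connection is split into three segments (conductor, gap-junction barrier, conductor) by two virtual nodes; the neuron names, segment labels and function names are hypothetical. Contracting the virtual nodes recovers the original connectivity, which shows that no new neuron-to-neuron connections are created.

```python
def insert_virtual_nodes(neuron_edges):
    """Split every neuron-neuron edge into conductor / barrier / conductor segments."""
    segments = []
    for a, b in neuron_edges:
        v1 = ("VN", a, b, 1)  # virtual node on the side of neuron a
        v2 = ("VN", a, b, 2)  # virtual node on the side of neuron b
        segments.append((a, v1, "conductor"))
        segments.append((v1, v2, "barrier"))  # gap junction as a potential barrier
        segments.append((v2, b, "conductor"))
    return segments


def is_virtual(node):
    return isinstance(node, tuple) and node[0] == "VN"


def contract_virtual_nodes(segments):
    """Recover neuron-level connectivity by walking through the virtual nodes."""
    adjacency = {}
    for u, v, _ in segments:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    recovered = set()
    for start in adjacency:
        if is_virtual(start):
            continue
        for first in adjacency[start]:
            prev, node = start, first
            while is_virtual(node):  # each virtual node has exactly two neighbours
                nxt = next(x for x in adjacency[node] if x != prev)
                prev, node = node, nxt
            recovered.add(frozenset((start, node)))
    return recovered


# Hypothetical three-neuron connectome.
original = [("N1", "N2"), ("N2", "N3")]
segments = insert_virtual_nodes(original)
recovered = contract_virtual_nodes(segments)
```

      Whether the transmission spectrum is sensitive to the number of segments per connection is a separate question that this sketch does not address.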

      (7) As it stands, the work would only have a very limited impact on the understanding of subthreshold oscillations in the rat or in C elegans. Indeed, the preprint falls short of relating its numerical results to any phenomena which could be observed in the lab.

      In this study, we proposed a minimalistic model built using the currently available but limited C. elegans connectome information. Specifically, our model is not a phenomenological one that adjusts parameters to accurately predict experimental measurements, but rather an attempt at a novel conceptual approach to theoretically possible scenarios. While the model may not be satisfactory enough to explain experimental phenomena at present, it is a theoretical/computational study that someone needs to undertake. We believe this is the path of scientific progress. Therefore, as the Reviewer has expressed concern, it is entirely understandable that reproducing the numerical results measured in actual experiments is difficult in this study. Nevertheless, we believe that this study makes a basic contribution to the conceptual understanding of subthreshold signal propagation in C. elegans’ electric synapses.

      Rather than offering a stretched opinion, we maintain a positive hope that future researchers in this field will improve the model by incorporating more detailed and extensive biological data through follow-up studies, allowing us to get closer to describing real phenomena.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The word "Sensory" was misspelled in Figures 2, 4 and 5.

      We appreciate the feedback from Reviewer #1. We have corrected the mentioned typos in Figures 2, 4, and 5 of the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      What neurophysiological changes support the learning of new sensorimotor transformations is a key question in neuroscience. Many studies have attempted to answer this question at the neuronal population level - with varying degrees of success - but few, if any, have studied the change in activity of the apical dendrites of layer 5 cortical neurons. Neurons in layer 5 of the sensory cortex appear to play a key role in sensorimotor transformations, showing important decision and reward-related signals, and being the main source of cortical and subcortical projections from the cortex. In particular, pyramidal tract (PT) neurons project directly to subcortical regions related to motor activity, such as the striatum and brainstem, and could initiate rapid motor action in response to given sensory inputs. Additionally, layer 5 cortical neurons have large apical dendrites that extend to layer 1 where different neuromodulatory and long-range inputs converge, providing motor and contextual information that could be used to modulate layer 5 neuron output and/or to establish the synaptic plasticity required for learning a new association.

      In this study, the authors aimed to test whether the learning of a new sensorimotor transformation could be supported by a change in the evoked response of the apical dendrites of layer 5 neurons in the mouse whisker primary somatosensory cortex. To do this, they performed longitudinal functional calcium imaging of the apical dendrites of layer 5 neurons while mice learned to discriminate between two multi-whisker stimuli. The authors used a simple conditioning task in which one whisker stimulus (upward or backward air puff, CS+) is associated with a reward after a short delay, while the other whisker stimulus (CS-) is not. They found that task learning (measured by the probability of anticipatory licking just after the CS+) was not associated with a significant change in the average population response evoked by the CS+ or the CS-, nor a change in the average population selectivity. However, when considering individual dendritic tufts, they found interesting changes in selectivity, with approximately equal numbers of dendrites becoming more selective for CS+ and dendrites becoming more selective for CS-.

      One of the major challenges when assessing changes in neural representation during the learning of such Go/NoGo tasks is that the movements and rewards themselves may elicit strong neural responses that may be a confounding factor, that is, inexperienced mice do not lick in response to the CS+, while trained mice do. In this study, the authors addressed this issue in three ways: first, they carefully monitored the orofacial movements of mice and showed that task learning is not associated with changes in evoked whisker movements. Second, they showed that whisking or licking evokes very little activity in the dendritic tufts compared to whisker stimuli (CS+ and CS-). Finally, the authors introduced into the design of their task a post-conditioning session after the last conditioning session during which the CS+ and the CS- are presented but no reward is delivered. During this post-session, the mice gradually stopped licking in response to the CS+. A better design might have been to perform the pre-conditioning and post-conditioning sessions in non-water-restricted, unmotivated mice to completely exclude any lick response, but the fact that the change in selectivity persists after the mice stopped licking in the last blocks of the post-conditioning session (in mice relying only on their whiskers to perform the task) is convincing.

      The clever task design and careful data analysis provide compelling evidence that learning this whisker discrimination task does not result in a massive change in sensory representation in the apical dendritic tufts of layer 5 neurons in the primary somatosensory cortex on average. Nevertheless, individual dendritic tufts do increase their selectivity for one or the other sensory stimulus, likely enhancing the ability of S1 neurons to accurately discriminate the two stimuli and trigger the appropriate motor response (to lick or not to lick). 

      One limitation of the present study is the lack of evidence for the necessity of the primary somatosensory cortex in the learning and execution of the task. As the authors have strongly emphasized in their previous publications, the primary somatosensory cortex may not be necessary for the learning and execution of simple whisker detection tasks, especially when the stimulus is very salient. Although this new task requires the discrimination between two whisker stimuli, the simplicity and salience of the whisker stimuli used could make this task cortex-independent, especially considering that some mice seem not to rely entirely on their whiskers to execute the task.

      Nevertheless, this is an important result that shows for the first time changes in the selectivity to sensory stimuli at the level of individual apical dendritic tufts in correlation with the learning of a discrimination task. This study sheds new light on the cortical cellular substrates of reward-based learning and opens interesting perspectives for future research in this area. In future studies, it will be important to determine whether the change in selectivity of dendritic calcium spikes is causally involved in the learning of the task or whether it simply correlates with learning, as a consequence of changes in synaptic inputs caused by reward. The dendritic calcium spikes may be involved in the establishment of synaptic plasticity required for learning and impact the output of layer 5 pyramidal neurons to trigger the appropriate motor response. It would be important also to study the changes in selectivity in the apical dendrite of the identified projection neurons.  

      Reviewer #2 (Public Review):

      Summary: 

      The authors did not find an increased representation of CS+ throughout reinforcement learning in the tuft dendrites of Rbp4-positive neurons from layer 5B of the barrel cortex, as previously reported for soma from layer 2/3 of the visual cortex. 

      Alternatively, the authors observed an increased selectivity to both stimuli (CS+ and CS-) during reinforcement learning. This feature: 

      (1) was not present in repeated exposures (without reinforcement), 

      (2) was not explained by the animal's behaviour (choice, licking, and whisking), and 

      (3) was long-lasting, being present even when the mice disengaged from the task. 

      Importantly, increased selectivity was correlated with learning (% correct choices), and neural discriminability between stimuli increased with learning. 

      In conclusion, the authors show that tuft dendrites from layer 5B of the barrel cortex increase the representation of conditioned (CS+) and unconditioned stimuli (CS-) applied to the whiskers, during reinforcement learning. 

      Strengths: 

      The results presented are very consistent throughout the entire study, and therefore very convincing: 

      (1) The results observed are very similar using two different imaging techniques (2-photon planar imaging and SCAPE volumetric imaging; Figure 3 and Figure 4, respectively).

      (2) The results are similar using "different groups" of tuft dendrites for the analysis (e.g. initially unresponsive and responsive pre- and post-learning). Figure 5.

      (3) The results are similar from a specific set of trials (with the same sensory input, but different choices). Figure 7.

      (4) Additionally, the selectivity of tuft dendrites from layer 5B of the barrel cortex was higher in the mice that exclusively used the whisker to respond to the stimuli (CS+ and CS-).

      The results presented are controlled against a group of mice that received the same stimuli presentation, except for the reinforcement (reward).

      Additionally, the behaviour outputs, such as choice, whisking, and licking could not account for the results observed. 

      Although there are no causal experiments, the correlation between selectivity and learning (percentage of correct choices), as well as the increased neural discriminability with learning, but not in repeated exposure, are very convincing. 

      Weaknesses: 

      The biggest weakness is the absence of causality experiments. Although inhibiting specifically tuft dendritic activity in layer 1 from layer 5 pyramidal neurons is very challenging, tuft dendritic activity in layer 1 could be silenced through optogenetic experiments as in Abs et al. 2018. By manipulating NDNF-positive neurons the authors could specifically modify tuft dendritic activity in the barrel cortex during CS presentations, and test if silencing tuft dendritic activity in layer 1 would lead to the lack of selectivity and an impairment of reinforcement learning. Additionally, this experiment will test if the selectivity observed during reinforcement learning is due to changes in the local network, namely changes in local synaptic connectivity, or solely due to changes in the long-range inputs.    

      We agree that such causal manipulations are a logical next step. Such manipulations are unfortunately not specific to layer 5 apicals, so the results would be difficult to interpret. We now discuss the challenge of such manipulations in the Discussion section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Overall, the study is solid and the article is well and clearly written. I have no suggestion for other experiments that would fall within the scope of this article. I would like only to suggest some additional analyses and clarifications in the writing. 

      Additional analyses: 

      Obviously, the main confounding factor in this type of data comes from the acquired motor response which follows - with a short latency - the sensory stimulus. This is particularly problematic for functional calcium imaging which has very low temporal resolution. The authors have addressed this question to some extent by showing that motor-evoked activity does not account for the change in selectivity acquired with learning and through the use of a post-conditioning session during which no reward was delivered. Figures 8C-D show that mice gradually stop licking in response to CS+ in this session and that the distribution of the selectivity index remains similar in these last blocks. Perhaps a more convincing analysis would be to simply select Miss and Correct rejection trials in which mice did not lick in response to the CS+ and CS-, respectively. Ideally, if the number of trials is sufficient, one could even select trials devoid of any evoked movement (no licking and no whisking).  

      We agree it would be interesting to compare Miss and Correct rejection trials to further rule out effects of a motor response, but there were never enough Miss trials to conduct such an analysis. Even in very early learning, there are few Miss trials (see Figure 1, session 2). We found that in early learning, animals would lick in most trials. Then, over the course of conditioning, they would learn to withhold licks during CS- presentation. Thus, we were able to examine Hits, Correct rejections, and False alarms (Figure 7), but not Miss trials. We have added text suggesting a future experiment in which the stimulus strengths are substantially reduced to drastically increase the error rates.

      The fact that changes in selectivity occur in both directions overall is really interesting. However, in the way the data are presented currently, one may wonder whether this is a mouse/field-of-view effect or a single-cell effect, i.e., do different dendritic tufts in the same field of view show opposite changes in selectivity? If we were to replot Figure 3A for a single mouse, would we obtain the same picture?

      We appreciate this very good suggestion and have added scatter plots and selectivity index histograms for individual conditioned animals in Supplementary figure 2. These data demonstrate that different dendritic tufts in the same field of view exhibit opposite changes in selectivity.

      The authors point out that they observed no change in the mean response or selectivity during learning, but did find changes in selectivity at the level of individual dendritic tufts. This suggests that, at the population level, the ability to discriminate between the two stimuli should improve. A possible complementary analysis would be to show that the ability to decode stimulus identity from dendritic tuft population activity increases with learning.  

      Given the substantial change in individual tuft selectivity and that tuft events are not rare, the population result is guaranteed. If individual tufts increase selectivity, the population will also increase its selectivity on a trial-by-trial basis. We have nevertheless included a new supplementary figure with a population analysis using SVMs to demonstrate this.
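
      The argument above (opposite selectivity changes in individual tufts leave the population mean unchanged yet make the stimulus easier to decode trial by trial) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two simulated tufts, their response means, the noise level, and the function names `simulate`, `population_mean` and `loo_accuracy` are hypothetical, and a nearest-centroid decoder is swapped in for the SVM so that the sketch needs only the Python standard library. It is not the analysis used in the supplementary figure.

```python
import random


def simulate(selective, n_trials=200, noise=0.3, seed=1):
    """Toy single-trial responses of two tufts to CS+ (label 1) and CS- (label 0).

    All numbers are made-up illustration values, not data from the study.
    """
    rng = random.Random(seed)
    if selective:
        # Tuft A prefers CS+, tuft B prefers CS-: opposite selectivity changes.
        means = {1: (1.5, 0.5), 0: (0.5, 1.5)}
    else:
        # Neither tuft prefers either stimulus.
        means = {1: (1.0, 1.0), 0: (1.0, 1.0)}
    trials = []
    for label in (0, 1):
        for _ in range(n_trials):
            response = [m + rng.gauss(0, noise) for m in means[label]]
            trials.append((response, label))
    return trials


def population_mean(trials):
    """Mean response over all tufts and all trials."""
    values = [r for response, _ in trials for r in response]
    return sum(values) / len(values)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def loo_accuracy(trials):
    """Leave-one-out accuracy of a nearest-centroid decoder (stand-in for an SVM)."""
    correct = 0
    for i, (x, label) in enumerate(trials):
        rest = trials[:i] + trials[i + 1:]
        cents = {c: centroid([v for v, lab in rest if lab == c]) for c in (0, 1)}
        dist = {c: sum((a - b) ** 2 for a, b in zip(x, cents[c])) for c in (0, 1)}
        correct += min(dist, key=dist.get) == label
    return correct / len(trials)


before = simulate(selective=False)
after = simulate(selective=True)
print("population mean:", population_mean(before), population_mean(after))
print("decoding accuracy:", loo_accuracy(before), loo_accuracy(after))
```

      In this toy setting the two population means stay close to each other while the decoding accuracy rises from near chance to near ceiling, which is the qualitative point of the response; the actual values depend on the assumed means and noise.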

      Clarification: 

      The authors should make it clear from the beginning that mice are still water-restricted during the post-conditioning session and actually do keep licking for many CS+ trials. Therefore, this session is not devoid of motor response. 

      We have clarified this in the text.

      Did mice in the repeated exposure condition receive any reward during the recording sessions? If so when were rewards delivered? 

      We previously described in the Methods that these mice received water in their home cage, but we now additionally clarify this in the Results section.

      Minor: 

      Figure 2Aii, the labels of the Alpha and Beta barrels should be swapped.

      Fixed

      Line 218: I believe this sentence should read "Using SCAPE microscopy, ...". 

      Corrected.

      Line 665: 'Reconstruction from 50' does that refer to the single cell reconstruction on the left panel? 

      Yes – Clarified in legend

      Reviewer #2 (Recommendations For The Authors): 

      Minor suggestions: 

      The 'summary' should mention from which brain area the results were acquired. Otherwise, it is misleading, giving the idea that the results described a generic feature, which is still unknown.  

      Added to the text.

      Please correct sentence 219: "SCAPE microscopy, we image tuft activity of additional mice..." 

      Added to the text.

      In the same sentence (219) it would be good to provide the number of additional mice imaged (2). 

      Added to the text.

      Regarding Supplementary Figure 1, it would be interesting to correlate the second peak after reward and learning rate, to provide further support to the sentences 109 to 113. 

      We agree this would be interesting to examine, but only four animals exhibited this second peak, which is too small a sample to observe a meaningful correlation. We now clarify this in the text.

      In Figure 3, why not present the correlation between 'neural discriminability' and % of correct choices? 

      We appreciate the suggestion and have added this plot to Figure 3.

      The 'results' section will benefit tremendously if the authors consistently indicate the figures to which the results are being described, or 'data not shown' if it is the case. To give a few examples: 

      Sentence 108 - "averaged 28% ΔF/F" - From which figure is this result coming from?

      Sentence 123 - "(p = 0.62, 0.64, respectively)" - comparison not shown, but see Figures 2E and D respectively?

      Sentence 125 - "(CS+ responsive (...) across all sessions)" - From which figure is this result coming from? 

      Sentence 130 - "during pre-conditioning (p=0.66) or post-conditioning sessions (p=0.44)" - From which figure?

      Sentence 154 - "(Pre: p=0.20; last rewarded: p=0.43; Post: p=0.64, sign-rank test)" - From which figure? 

      Sentence 175 - "(-0.049, -0.001, and 0.003" - From which figure? Please show the graph that shows that the mean SI is not different. It can be supplementary. The distribution of SI will be strengthened by it.  

      We added this plot to supplementary figure 2.

      Sentence 244 - "(conditioned: 458/603; repeated exposure: 334/457)" - From Figure 5E.

      Sentence 256 - "(p=0.04, 2-sample t-test comparison mice)" - From Figure 5B.

      Sentence 258 - "(p=0.03, paired t-test)" - From Figure 5B.

      Sentences 370 to 378 - No reference to the figure.

      The 'discussion' section (sentences 459 to 494) refers to the differences between the current and previous studies (references 1,3,5), namely soma vs. dendrites and layer 2/3 vs. layer 5. However, it should also mention the difference between the nature of the stimuli and the brain area recorded (visual cortex vs. barrel cortex).

      We have addressed these issues in the text.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Authors reject the substance of Reviewer 1’s feedback, primarily due to a clear lack of understanding of typical parameterization practices used to avoid overfitting. To ensure the accuracy of the Spearman rank correlation, 70% of all data was withheld from the optimization process and used solely for testing to yield Figure 6. Data was withheld prior to model parameterization, which avoids Reviewer 1’s charge of “artificially forcing the correlation”. Authors did appreciate the request for clarification of additional definitions and minor reorganization suggestions. Below we provide specific responses to each numbered point (note: multiple responses are provided for some of the reviewer points).

      Point 1: Clarify Metrics Definition and Evaluation

      Authors clarified the description of biodiversity metrics. The metrics associated with manual methods are detailed in the third paragraph of the Materials and Methods: Data Analysis section, while the sensor-based metric is described in the second paragraph, and summarized in its last sentence.

      Text Additions:

      Authors added clarification to the introduction’s first paragraph defining biodiversity metrics, including species richness.

      Authors added detailed definitions of community metrics and their significance in community ecology in the Materials and Methods section (3rd paragraph of “Data Analysis” section). The discussion was updated to include a reference to community ecology and the benefits of big data, specifically highlighting the potential of autonomous optical sensors in entomology.

      Methods Reorganization

      We have reorganized the Methods section for clarity. The updated section clarifies the metrics studied, the location and dates, and the descriptions and methods for optical sensors, Malaise traps, and sweep netting.

      Text Additions:

      An overview paragraph was added to “Data analysis” (3rd paragraph) detailing key metrics used, specifying metrics such as abundance, richness, Shannon index, and Simpson index.

      Visualization methods for sensor data to deliver analogous metrics of abundance, richness, and diversity indices were added to the “Data analysis” section.

      Supplementary Table 1 and the first paragraph of the Materials and Methods section cover location, dates, and other general information.

      Detailed descriptions and methods for optical sensors, Malaise traps, and sweeping are provided.
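
      As a minimal sketch of the four community metrics named above (abundance, richness, Shannon index, Simpson index), the following computes them from a list of group labels, one per captured or detected insect. The function name `diversity_metrics`, the sample labels, the use of the natural logarithm, and the choice of the Gini-Simpson form (one minus the sum of squared proportions) are assumptions made for illustration; the response does not state which variants the manuscript uses.

```python
import math
from collections import Counter


def diversity_metrics(observations):
    """Return abundance, richness, Shannon index and Simpson index.

    `observations` is a list of group labels (e.g. taxa or sensor clusters),
    one entry per individual. Hypothetical helper, not the authors' code.
    """
    counts = Counter(observations)
    abundance = sum(counts.values())   # total number of individuals
    richness = len(counts)             # number of distinct groups
    proportions = [n / abundance for n in counts.values()]
    # Shannon index H' = -sum(p_i * ln p_i); natural log is an assumption.
    shannon = -sum(p * math.log(p) for p in proportions)
    # Gini-Simpson form 1 - sum(p_i^2); other Simpson variants exist.
    simpson = 1.0 - sum(p * p for p in proportions)
    return {
        "abundance": abundance,
        "richness": richness,
        "shannon": shannon,
        "simpson": simpson,
    }


# Made-up sample: four individuals from three groups.
print(diversity_metrics(["a", "a", "b", "c"]))
```

      The same function can be applied to trap identifications or to sensor-derived cluster labels, which is what makes the resulting metrics comparable across methods.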

      Integration of Metrics

      In the 3rd paragraph of the discussion, authors integrated two paragraphs explaining the fundamental differences between conventional methods and the presented method of biodiversity measurement.

      Point 2: Body-to-Wing Ratio Calculation

      The backscattered optical cross-section is now clearly defined as the value measured at the maximum point of the event. Specifically, we have added the word ‘maximum’ to our methods section for clarity.

      Point 3: Ecosystem Services Paragraph

      We have shortened and edited this paragraph for clarity. The revised text is now more straightforward and comprehensible.

      Point 4: Results Section Structure

      We believe restructuring the results section around each metric would result in redundancy. The value of our analysis is in the comparison of different methods; therefore, instead of talking about methods in isolation, we provide an integrated discussion and comparison of all three methods across all metrics. We have therefore maintained our current structure while ensuring that the metrics are consistently described and analyzed.

      Point 5: Abundance Correlation

      We agree that the lack of a correlation between methods for abundance remains an open question. However, we maintain that fitting a linear model would be inappropriate and potentially misleading in the absence of significant correlation. We have clarified this in our manuscript.

      Point 6: Richness and Diversity Evaluations

      The authors disagree with Reviewer 1's feedback, citing a clear misunderstanding of standard parameterization practices used to prevent overfitting. Specifically, authors implemented a 30/70 Training/Testing split. Therefore only 30% of the data was used to fit the model and 70% of the dataset was reserved for testing to ensure the validity and reliability of our clustering results. By validating with a 70% testing dataset, we ensure that the clustering model can accurately group new data points and is robust against overfitting. This process helps verify that the identified clusters are meaningful and consistent across different subsets of the data.  Spearman's rho converts the data values into ranks and does not assume a linear relationship between the variables or require the data to follow a normal distribution. Spearman's rank correlation offers robustness against non-linearity and outliers by focusing on ranks. This approach is explained in the 4th paragraph of the “Data Analysis” section.
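
      The two ingredients described above, holding out 70% of the data before fitting and scoring agreement with Spearman's rho, can be sketched as follows. The helper names (`split_30_70`, `ranks`, `pearson`, `spearman_rho`), the random seed, and the example values are hypothetical, and the clustering model itself is not reproduced here; this is only an illustration of rank correlation on held-out data.

```python
import random


def split_30_70(n_items, seed=0):
    """Shuffle indices and return (train, test) with a 30/70 split."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = round(0.3 * n_items)
    return indices[:cut], indices[cut:]


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up values: a monotonic but non-linear relationship.
trap = [1, 2, 3, 4, 5]
sensor = [1, 4, 9, 16, 25]
print(spearman_rho(trap, sensor))

train_idx, test_idx = split_30_70(10)
print(len(train_idx), len(test_idx))
```

      Because only ranks enter the calculation, the monotonic but non-linear pair in the example is scored as perfectly correlated, which is the robustness property the response refers to. In practice the model would be fitted on the `train_idx` samples only and rho reported on the `test_idx` samples.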

      Point 7: Clustering Method Credibility

      Authors acknowledge the variability in optical sensor features. However, the Law of Large Numbers supports the increased accuracy and stability of insect measurements from optical sensors, owing to the much larger number of observations they make compared to conventional methods. The manuscript now includes a detailed discussion of these aspects in the 3rd paragraph of the discussion, emphasizing the correlation observed despite variability.

      Reviewer 2:

      Authors appreciate Reviewer 2’s feedback especially regarding contextualization. While authors disagree with the need for more specific experimental questions in a methods paper and the suggested need for more complex analysis, we agree with the essence of the review and added additional text regarding potential questions, method applications, and ecosystem processes for contextualization.

      Point 1: Larger Question Framing

      We present this article as a methodological paper rather than asking a specific experimental question. This approach is justified by the generalizable nature of methods papers, akin to those describing ImageJ or mass spectrometers. The method is widely applicable to a range of scientific questions. 

      We provided a discussion on how this technology could be applied in community ecology, conservation, and managed ecological systems like agriculture.

      In the Conclusion section we provided elaboration on the potential research questions and applications.

      Point 2: Complex Analyses

      While complex analyses like NMDS are useful for specific questions, this paper aims to establish the method. Once established, this method can be applied to various research questions in future studies. Therefore, as we are not directly asking an experimental question, more complex analysis is unnecessary.

      Point 3: Ecosystem Process (Granivory) Assay

      We have improved the contextualization and explanation of the ecosystem process assay throughout the manuscript, ensuring it is well-integrated and clear to readers.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This paper explores how diverse forms of inhibition impact firing rates in models for cortical circuits. In particular, the paper studies how the network operating point affects the balance of direct inhibition from SOM inhibitory neurons to pyramidal cells, and disinhibition from SOM inhibitory input to PV inhibitory neurons. This is an important issue as these two inhibitory pathways have largely been studied in isolation. Support for the main conclusions is generally solid, but could be strengthened by additional analyses.

      Strengths:

      A major strength of the paper is the systematic exploration of how circuit architecture affects the impact of inhibition. This includes scans across parameter space to determine how firing rates and stability depend on effective connectivity. This is done through linearization of the circuit about an effective operating point, and then the study of how perturbations in input affect this linear approximation.

      Weaknesses:

      The linearization approach means that the conclusions of the paper are valid only in the linear regime of network behavior. The paper would be substantially strengthened with a test of whether the conclusions from the linearized circuit hold over a large range of network activity. Is it possible to simulate the full network and do some targeted tests of the conclusions from linearization? Those tests could be guided by the linearization to focus on specific parameter ranges of interest.

      We agree with the reviewer that it would be interesting to test if our results hold in a nonlinear regime of network behaviour (i.e. the chaotic regime, see also comment 1 by reviewer 2). As mentioned above, this requires a different type of model (either rate-based or spiking model with multiple neurons instead of modelling the mean population rate dynamics) which, in our opinion, exceeds the scope of this manuscript. Furthermore, the core measures of our study, network gain and stability, require linearization. In a chaotic regime where the linearization approach is impossible, we would need to consider/define new measures to characterize network response/activity. Therefore, while certainly being an interesting question to study, the broad scope of studying networks in a nonlinear regime is better tackled in a separate study. We now acknowledge in the discussion of our manuscript that the linearization approach is a limitation in our study and that it would be an interesting future direction to investigate chaotic dynamics.
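
      To make explicit why gain and stability both rest on linearization, here is a generic sketch for a Wilson-Cowan-type rate model. The notation is an assumption chosen for illustration and need not match the manuscript's equations: $\mathbf{r}$ collects the E, PV and SOM population rates, $W$ is the connectivity matrix, $\mathbf{I}$ the external input, $T$ a diagonal matrix of time constants, $f$ the transfer function applied element-wise, and $\mathbf{r}^{*}$ the operating point (fixed point).

```latex
\begin{align}
T \frac{d\mathbf{r}}{dt} &= -\mathbf{r} + f\!\left(W\mathbf{r} + \mathbf{I}\right), \\
T \frac{d\,\delta\mathbf{r}}{dt} &= -\left(\mathrm{Id} - F W\right)\delta\mathbf{r} + F\,\delta\mathbf{I},
\qquad F = \mathrm{diag}\!\left(f'\!\left(W\mathbf{r}^{*} + \mathbf{I}\right)\right), \\
\delta\mathbf{r}_{\infty} &= \left(\mathrm{Id} - F W\right)^{-1} F\,\delta\mathbf{I}, \\
J &= T^{-1}\left(F W - \mathrm{Id}\right), \qquad \max_{k}\,\operatorname{Re}\lambda_{k}(J) < 0 .
\end{align}
```

      Under these assumptions, the gain is an entry of the steady-state response matrix $(\mathrm{Id} - FW)^{-1}F$ (for example, the change in E rate per unit change in input to E), and stability is the condition that all eigenvalues $\lambda_k$ of the Jacobian $J$ have negative real part. Both quantities are defined only around a fixed point, which is why a chaotic regime would call for different measures.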

      The results illustrated in the figures are generally well described but there is very little intuition provided for them. Are there simplified examples or explanations that could be given to help the results make sense? Here are some places such intuition would be particularly helpful:

      page 6, paragraph starting ”In sum ...”

      Page 8, last paragraph

      Page 10, paragraph starting ”In summary ...”

      Page 11, sentence starting ”In sum ...”

      We agree with the reviewer that we didn’t provide enough intuition for our results. We have now extended the paragraphs listed by the reviewer with additional information, providing a more intuitive understanding of the results presented in the respective chapter.

      Reviewer #2 (Public Review):

      Summary:

      Bos and colleagues address the important question of how two major inhibitory interneuron classes in the neocortex differentially affect cortical dynamics. They address this question by studying Wilson-Cowan-type mathematical models. Using a linearized fixed point approach, they provide convincing evidence that the existence of multiple interneuron classes can explain the counterintuitive finding that inhibitory modulation can increase the gain of the excitatory cell population while also increasing the stability of the circuit’s state to minor perturbations. This effect depends on the connection strengths within their circuit model, providing valuable guidance as to when and why it arises.

      Overall, I find this study to have substantial merit. I have some suggestions on how to improve the clarity and completeness of the paper.

      Strengths:

      (1) The thorough investigation of how changes in the connectivity structure affect the gain-stability relationship is a major strength of this work. It provides an opportunity to understand when and why gain and stability will or will not both increase together. It also provides a nice bridge to the experimental literature, where different gain-stability relationships are reported from different studies.

      (2) The simplified and abstracted mathematical model has the benefit of facilitating our understanding of this puzzling phenomenon. (I have some suggestions for how the authors could push this understanding further.) It is not easy to find the right balance between biologically detailed models vs simple but mathematically tractable ones, and I think the authors struck an excellent balance in this study.

      Weaknesses:

      (1) The fixed-point analysis has potentially substantial limitations for understanding cortical computations away from the steady-state. I think the authors should have emphasized this limitation more strongly and possibly included some additional analyses to show that their conclusions extend to the chaotic dynamical regimes in which cortical circuits often live.

      We agree with the reviewer that it would be interesting to test if our results hold in a chaotic regime of network behaviour (see also comment by reviewer 1). As mentioned above, this requires a different type of model (either rate-based or spiking model with multiple neurons instead of modelling the mean population rate dynamics) which, in our opinion, exceeds the scope of this manuscript. Furthermore, the core measures of our study, network gain and stability, require linearization. In a chaotic regime where the linearization approach is impossible, we would need to consider/define new measures to characterize network response/activity. Therefore, while certainly being an interesting question to study, the broad scope of studying networks in a nonlinear regime is better tackled in a separate study. We now acknowledge in the discussion of our manuscript that the linearization approach is a limitation in our study and that it would be an interesting future direction to investigate chaotic dynamics.

      (2) The authors could have discussed – even somewhat speculatively – how SST interneurons fit into this picture. Their absence from this modelling framework stands out as a missed opportunity.

      We believe that the reviewer wanted us to speculate about VIP interneurons (and not SST interneurons, which we already do extensively in the manuscript). Previous models have included VIP neurons in the circuit (e.g. del Molino et al., 2017; Palmigiano et al., 2023; Waitzmann et al., 2024). While we do not model VIP cells explicitly, we implicitly assume that a possible source of modulation of SOM neurons comes from VIP cells. We have now added a short discussion on VIP cells in the last paragraph in our discussion section.

      (3) The analysis is limited to paths within this simple E,PV,SOM circuit. This misses more extended paths (like thalamocortical loops) that involve interactions between multiple brain areas. Including those paths in the expansion in Eqs. 11-14 (Fig. 1C) may be an important consideration.

      We agree with the reviewer that our framework can be extended to study many other paths, like thalamocortical loops, cortical layer-specific connectivity motifs, or circuits with VIP or L1 inhibitory neurons. Studying these questions, however, is beyond the scope of our work. In our discussion, we now mention the possibility of using our framework to study those questions.

      Reviewer #3 (Public Review):

      Summary:

      Bos et al study a computational model of cortical circuits with excitatory (E) and two subtypes of inhibition parvalbumin (PV) and somatostatin (SOM) expressing interneurons. They perform stability and gain analysis of simplified models with nonlinear transfer functions when SOM neurons are perturbed. Their analysis suggests that in a specific setup of connectivity, instability and gain can be untangled, such that SOM modulation leads to both increases in stability and gain. This is in contrast with the typical direction in neuronal networks where increased gain results in decreased stability.

      Strengths:

      - Analysis of the canonical circuit in response to SOM perturbations. Through numerical simulations and mathematical analysis, the authors have provided a rather comprehensive picture of how SOM modulation may affect response changes.

      - Shedding light on two opposing circuit motifs involved in the canonical E-PV-SOM circuitry - namely, direct inhibition (SOM → E) vs disinhibition (SOM → PV → E). These two pathways can lead to opposing effects, and it is often difficult to predict which one results from modulating SOM neurons. In simplified circuits, the authors show how these two motifs can emerge and depend on parameters like connection weights.

      - Suggesting potentially interesting consequences for cortical computation. The authors suggest that certain regimes of connectivity may lead to untangling of stability and gain, such that increases in network gain are not compromised by decreasing stability. They also link SOM modulation in different connectivity regimes to versatile computations in visual processing in simple models.

      Weaknesses:

      The computational analysis is not novel per se, and the link to biology is not direct/clear.

      Computationally, the analysis is solid, but it’s very similar to previous studies (del Molino et al, 2017). Many studies in the past few years have done the perturbation analysis of a similar circuitry with or without nonlinear transfer functions (some of them listed in the references). This study applies the same framework to SOM perturbations, which is a useful and interesting computational exercise, in view of the complexity of the high-dimensional parameter space. But the mathematical framework is not novel per se, undermining the claim of providing a new framework (or ”circuit theory”).

      In the introduction we acknowledge that our analysis method is not novel but is rather based on previous studies (del Molino et al., 2017; Kuchibhotla et al., 2017; Kumar et al., 2023; Litwin-Kumar et al., 2016; Mahrach et al., 2020; Palmigiano et al., 2023; Veit et al., 2023; Waitzmann et al., 2024). We have now rewritten parts of the introduction to make sure that it does not sound like the computational analysis has been developed by us, but that we rather use those previously developed frameworks to dissect stability and gain via SOM modulation.

      Link to biology: the most interesting result of the paper with regard to biology is the suggestion of a regime in which gain and stability can be modulated in an unconventional way - however, it is difficult to link the results to biological networks: - A general weakness of the paper is a lack of direct comparison to biological parameters or experiments. How different experiments can be reconciled by the results obtained here, and what new circuit mechanisms can be revealed? In its current form, the paper reads as a general suggestion that different combinations of gain modulation and stability can be achieved in a circuit model equipped with many parameters (12 parameters). This is potentially interesting but not surprising, given the high dimensional space of possible dynamical properties. A more interesting result would have been to relate this to biology, by providing reasoning why it might be relevant to certain circuits (and not others), or to provide some predictions or postdictions, which are currently missing in the manuscript.

      - For instance, a nice motivation for the paper at the beginning of the Results section is the different results of SOM modulation in different experiments - especially between L23 (inhibition) and L4 (disinhibition). But no further explanation is provided for why such a difference should exist, in view of their results and the insights obtained from their suggested circuit mechanisms. How do the parameters identified for the two regimes correspond to different properties of different layers?

      As pointed out by the reviewer, the main goal of our manuscript is to provide a general understanding of how gain and stability depend on different circuit motifs (ie different connectivity parameters), and how circuit modulations via SOM neurons affect those measures. However, we agree with the reviewer that it would be useful to provide some concrete predictions or postdictions following from our study.

      An interesting example of a postdiction of our model is that the firing rate change of excitatory neurons in response to a change in the stimulus (which we define as network gain, Eq. 2) depends on firing rates of the excitatory, PV, and SOM neurons at the moment of stimulus presentation (Fig. 3ii; Fig. 4Aii,Bii,Cii; Fig. 5Aii, Bii, Cii). Hence any change in input to the circuit can affect the response gain to a stimulus presentation, in line with experimental evidence which suggests that changes in inhibitory firing rates and changes in the behavioral state of the animal lead to gain modifications (Ferguson and Cardin 2020).

      Another recent concrete example is the study of Tobin et al., 2023, in which the authors show that optogenetically activating SOM cells in the mouse primary auditory cortex (A1) decreases the excitatory responses to auditory stimuli. In our framework, this corresponds to the case of decreases in network gain (gE) for positive SOM modulation, as seen in the circuit with PV to SOM feedback connectivity (Suppl. Fig. S1).

      Another example is the study by Phillips and Hasenstaub 2016, in which the authors study the effect of optogenetic perturbations of SOM (and PV) cells on tuning curves of pyramidal cells in mouse A1. While they find large heterogeneity in additive/subtractive or multiplicative/divisive tuning curve changes following SOM inactivation, most cells have a purely multiplicative or purely additive component (and none of the cells have a divisive component). In our study, we see that large multiplicative responses of the excitatory population follow from circuits with strong E to SOM feedback connectivity.

      We note that in future computational studies, it would be useful to apply our framework with a focus on a specific brain region and add all relevant cell types (at a minimum E, PV, SOM, and VIP) plus a dendritic compartment, in order to formulate much more precise experimental predictions.

      We have now added additional information to the discussion section.

      - Another caveat is the range of parameters needed to obtain the unintuitive untangling as a result of SOM modulation. From Figure 4, it appears that the ”interesting” regime (with increases in both gain and stability) is only feasible for a very narrow range of SOM firing rates (before 3 Hz). This can be a problem for the computational models if the sweet spot is a very narrow region (this analysis is by the way missing, so making it difficult to know how robust the result is in terms of parameter regions). In terms of biology, it is difficult to reconcile this with the realistic firing rates in the cortex: in the mouse cortex, for instance, we know that SOM neurons can be quite active (comparable to E neurons), especially in response to stimuli. It is therefore not clear if we should expect this mechanism to be a relevant one for cortical activity regimes.

      We agree with the reviewer that it’s important to test the robustness of our results. As suggested by the reviewer, we now include a new supplementary figure (Suppl. Fig. S2) which measures the percentage of data points in the respective quadrant Q1-Q4 when changing the SOM firing rates (as done in Fig. 5). We see that the percentage of data points in the quadrants in which the network gain and stability change in the same direction (Q2 and Q3) remains high in the case of E to SOM feedback (Suppl. Fig. S2A) for SOM rates ranging from 0 to 10 Hz (and likely beyond).

      - One of the key assumptions of the model is nonlinear transfer functions for all neuron types. In terms of modelling and computational analysis, a thorough analysis of how and when this is necessary is missing (an analysis similar to what has been attempted at in Figure 6 for synaptic weights, but for cellular gains). In terms of biology, the nonlinear transfer function has experimentally been reported for excitatory neurons, so it’s not clear to what extent this may hold for different inhibitory subtypes. A discussion of this, along with the former analysis to know which nonlinearities would be necessary for the results, is needed, but currently missing from the study. The nonlinearity is assumed for all subtypes because it seems to be needed to obtain the results, but it’s not clear how the model would behave in the presence or absence of them, and whether they are relevant to biological networks with inhibitory transfer functions.

      It is true that the nonlinear transfer function is a key component in our model. We chose identical transfer functions for E, PV, and SOM (Eq. 4) to simplify our analysis. If the transfer function of one of the neuron types were linear (β = 1), then the corresponding b terms (the slope of the nonlinearity at the steady state; b = dfX/dqX; Fig. 1B; Eq. 4) would be equal to α. Therefore, if neurons had a linear transfer function in our model, there would not be a dependence of network gain on E and PV firing rate as studied in Fig. 3-5. This is because the relationship between PV rates and their gain would be constant (bP = α) in Fig. 1B (bottom).

      If all the transfer functions were linear, changes in firing rates would not have an impact on network gain or stability. Changing the nonlinear transfer function by changing the α or β terms in Eq. 4 would only scale the way a change in the rates affects the b terms and hence the results presented in Fig. 3-5. It would be more interesting to study how different types of nonlinearities, like sigmoidal functions or sublinear nonlinearities (i.e. saturating nonlinearities), would change our results. However, we think that such an investigation is out of scope for this study. We have now added a comment to the Methods section.
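
      The argument above can be illustrated with a minimal sketch. It assumes that Eq. 4 is a threshold power law, f(q) = α·max(q, 0)^β; this form is inferred from the statement that β = 1 gives b = α and is not confirmed in this response. The function names and parameter values are hypothetical.

```python
def transfer(q, alpha, beta):
    """Assumed threshold power-law transfer function: alpha * max(q, 0) ** beta."""
    return alpha * max(q, 0.0) ** beta


def slope(q, alpha, beta):
    """Slope b = df/dq of the assumed transfer function at input q (zero at or below threshold)."""
    if q <= 0.0:
        return 0.0
    return alpha * beta * q ** (beta - 1.0)


def slope_at_rate(rate, alpha, beta):
    """Same slope, expressed through the steady-state firing rate r = f(q)."""
    if rate <= 0.0:
        return 0.0
    return alpha * beta * (rate / alpha) ** ((beta - 1.0) / beta)


# Illustrative values only (not taken from the manuscript).
ALPHA = 0.3
for q in (0.5, 1.0, 2.0):
    linear_b = slope(q, ALPHA, beta=1.0)     # constant, equal to ALPHA
    nonlinear_b = slope(q, ALPHA, beta=2.0)  # grows with the input
    print(q, linear_b, nonlinear_b)
```

      Under this assumption, the slope for β = 1 is the same at every operating point, so a change in firing rate cannot change b, whereas for β > 1 the slope grows with the rate.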

      Experimentally, F-I curves have been measured also for PV and SOM neurons. For example, Romero-Sosa et al., 2021 measure the F-I curve of pyramidal, PV and SOM neurons in mouse cortical slices. They find that similar to pyramidal neurons, PV and SOM neurons show a nonlinear F-I curve. We now added the citation of Romero-Sosa et al., 2021 to our manuscript.

      - Tuning curves are simulated for an individual orientation (same for all), not considering the heterogeneity of neuronal networks with multiple orientation selectivity (and other visual features) - making the model too simplistic.

      The reviewer is correct that we only study changes in tuning curves in a simplistic model. In our model, the excitatory and PV populations are tuned to a single orientation (in the case of Fig. 7, to θ = 90). While this is certainly an oversimplification, it allows us to understand how additive/subtractive and multiplicative/divisive changes in the tuning curves come about in networks with different connectivity motifs. Modelling heterogeneity of tuning responses within a network requires more complex models. A natural choice would be to extend a classical ring attractor model (Rubin et al., 2015) by splitting the inhibitory population into PV and SOM neurons, or to study the tuning curve heterogeneity that occurs in balanced networks (Hansel and van Vreeswijk 2012). However, such models have many more parameters, like the spatial connectivity profiles from and onto PV and SOM neurons. While highly valuable, we believe that studying such models exceeds the scope of our current manuscript. We have now added a paragraph in the discussion section, mentioning this as an interesting future direction.
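
      The distinction between additive/subtractive and multiplicative/divisive changes can be made concrete with a small sketch. It is not the analysis used in the manuscript: it assumes a Gaussian tuning curve centred on 90 degrees and uses an ordinary least-squares line fit of the modulated curve against the control curve, which is one common way to separate the two components. All names and values are hypothetical.

```python
import math


def tuning_curve(thetas, baseline=1.0, amplitude=4.0, pref=90.0, width=20.0):
    """Hypothetical Gaussian tuning curve over orientation (degrees)."""
    return [
        baseline + amplitude * math.exp(-((t - pref) ** 2) / (2.0 * width ** 2))
        for t in thetas
    ]


def fit_gain_and_offset(control, modulated):
    """Least-squares fit of modulated = m * control + a.

    m > 1 indicates a multiplicative change, m < 1 a divisive one;
    a > 0 indicates an additive change, a < 0 a subtractive one.
    """
    n = len(control)
    mean_c = sum(control) / n
    mean_m = sum(modulated) / n
    cov = sum((c - mean_c) * (m - mean_m) for c, m in zip(control, modulated))
    var = sum((c - mean_c) ** 2 for c in control)
    gain = cov / var
    offset = mean_m - gain * mean_c
    return gain, offset


thetas = list(range(0, 181, 15))
control = tuning_curve(thetas)
scaled = [1.5 * r for r in control]   # purely multiplicative change
shifted = [r + 2.0 for r in control]  # purely additive change

print(fit_gain_and_offset(control, scaled))
print(fit_gain_and_offset(control, shifted))
```

      A purely scaled curve returns a slope different from one with a near-zero offset, while a purely shifted curve returns a slope near one with a nonzero offset.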

      Reviewer #1 (Recommendations For The Authors):

      The last sentence of the abstract is hard to interpret before reading the rest of the paper - suggest replacing or rephrasing.

      We rephrased the sentence to make more clear what we mean.

      Page 3, last full paragraph: I think this assumes that phi is positive. What is the justification for that assumption? More generally, I think you could say a bit more about phi in the main text since it is a fairly complicated term.

      The reviewer is correct, for a stable system phi is always positive. We now clarify this and explain phi in more detail in the main text.

      Fig 1D: It would be helpful to identify when the stimulus comes on and be clearer about what the stimulus is. I assume it’s a step increase in S input at 0.05 s or so - but that should be immediately apparent looking at the figure.

      We agree with the reviewer and we added a dashed line at the time of stimulus onset in Fig. 1D.

      Page 5: ”To motivate our analysis we compare ... (Fig. 2A)” - Figure 2A does not show responses without modulation, so this sentence is confusing.

      The dashed lines in Fig. 2A (and Fig. 2C) represent the rate change without modulation.

      Page 6: sentence “The central goal of our study ...” seems out of place since this is pretty far into the results, and that goal should already be clear.

      We agree with the reviewer, hence we updated the sentence.

      Page 10, top: the green curve in panel Aii always has a negative slope - so I am confused by the statement that increasing wSE decreases both gain and stability.

      We thank the reviewer for pointing out this mistake. We now fixed it in the text.

      Figure 6: in general it is hard to see what is going on in this figure (the green and blue in particular are hard to distinguish). Some additional labels would be helpful, but I would also see if the color scheme can be improved.

      We added a zoom-in to the panels which were hard to distinguish.

      Reviewer #2 (Recommendations For The Authors):

      Major recommendations:

      (1) The authors should explain early on in the results section what the key factor(s) is that differentiates SOM from PV cells in their model. E.g., in Fig. 1A, the only obvious difference is that SOM cells don’t inhibit themselves. However, later on in the paper, the difference in external stimulus drive to these interneuron classes is more heavily emphasized. Given the importance of that difference (in external stim drive), I think this should be highlighted early on.

      We now mention the key factors that differentiate PV and SOM neurons already when describing Fig. 1A.

      (2) The result in Figs. 5,6 demonstrate that recurrent SOM connectivity is important for achieving increases in both gain and stability. This observation could benefit from some intuitive explanation. Perhaps the authors could find this explanation by looking at their series expansion (Eqs. 11-14, Fig. 1C) and determining which term(s) are most important for this effect. The corresponding paths through the circuit – the most important ones – could then be highlighted for the reader.

      We agree with the reviewer that our results benefit from more intuitive explanations. This has also been pointed out by reviewer 1 in their public review. We have now extended the concluding paragraphs in the context of Fig. 4-6 with additional information, providing a more intuitive understanding of the results presented in the respective chapter. While it is possible to gain an intuitive understanding of how the network gain depends on rate and weight parameters (Eq. 2), this understanding is unfortunately missing in the case of stability. The maximum eigenvalue of the system has a complex relationship with all the parameters, and often has nonlinear dependencies on changes of a parameter (e.g. as we show in Fig. 3iv or one can see in Fig. 6). We now discuss this difficulty at the end of the section “Influence of weight strength on network gain vs stability”.

      (3) I think the authors should consider including some analyses that do not rely on the system being at or near a fixed point. I admit that such analysis could be difficult, and this could of course be done in a future study. Nevertheless, I want to reiterate that this addition could add a lot of value to this body of work.

      As outlined above, we decided to not include additional analysis on network behaviour in nonlinear regimes but we now acknowledge in the discussion of our manuscript that the linearization approach is a limitation in our study and that it would be an interesting future direction to investigate chaotic dynamics.

      Minor recommendations:

      (1) At the top of P. 6, when the authors first discuss the stability criterion involving eigenvalues, they should address the question ”eigenvalues of what?”. I suggest introducing the idea of the Jacobian matrix, and explaining that the largest eigenvalue of that matrix determines how rapidly the system will return to the fixed point after a small perturbation.

      We included an additional sentence in the respective paragraph explaining the link between stability and negative eigenvalues, and we also added a sentence in the Methods section stating that the largest real eigenvalue dominates the behavior of the dynamical system.
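
      The link between the Jacobian's eigenvalues and the return to the fixed point can be sketched with a toy example. The two-dimensional Jacobians below are hypothetical and are not the E-PV-SOM model of the manuscript; they only illustrate that a perturbation decays when the largest real part of the eigenvalues is negative and grows when it is positive.

```python
import cmath


def eigenvalues_2x2(J):
    """Closed-form eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = J
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def max_real_eigenvalue(J):
    """Largest real part among the eigenvalues; negative means locally stable."""
    return max(ev.real for ev in eigenvalues_2x2(J))


def perturbation_norm(J, x0, t_end, dt=1e-3):
    """Forward-Euler integration of dx/dt = J x; returns the norm of x at t_end."""
    x, y = x0
    for _ in range(int(round(t_end / dt))):
        dx = J[0][0] * x + J[0][1] * y
        dy = J[1][0] * x + J[1][1] * y
        x, y = x + dt * dx, y + dt * dy
    return (x * x + y * y) ** 0.5


# Hypothetical Jacobians evaluated at a fixed point (illustrative values only).
J_stable = [[-1.0, -2.0], [1.5, -3.0]]
J_unstable = [[2.0, -2.0], [1.5, -3.0]]

for name, J in (("stable", J_stable), ("unstable", J_unstable)):
    print(name, max_real_eigenvalue(J), perturbation_norm(J, (1.0, 0.0), 5.0))
```

      In this sketch, the more negative the largest real part, the faster a small perturbation shrinks back towards the fixed point.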

      (2) The panel labelling in Fig. 3 is unnecessarily confusing. It would be simpler (and thus better) to simply label the panels A,B,C,D, or i,ii,iii,iv, instead of the current labelling: Ai, Aii, Aiii, Aiv. (There are currently no panels ”B” in Fig. 3).

      We updated the figure accordingly.

      Reviewer #3 (Recommendations For The Authors):

      • Suggestions for improved or additional experiments, data or analyses.

      Analysis of the effect of different nonlinear transfer functions is necessary.

      Please see our detailed answer to the reviewer’s comment in the public review above.

      Analysis of gain modulation in models with more realistic tuning properties.

      Please see our detailed answer to the reviewer’s comment in the public review above.

      Mathematical analysis of the conditions to obtain ”untangled” gain and stability:

      One of the promises of the paper is that it is offering a computational framework or circuit theory for understanding the effect of SOM perturbation. However, the main result, namely the untangling of gain and stability, has only been reported in numerical simulations (e.g. Fig. 6). Different parameters have been changed and the results of simulations have been reported for different conditions. Given the simplified model, which allows for rigorous mathematical analysis, isn’t it possible to treat this phenomenon more analytically? What would be the conditions for the emergence of the untangled regime? This is currently missing from the analyses and results.

      We agree with the reviewer that our results benefit from more intuitive explanations. This has also been pointed out by reviewer 1 in their public review. We have now extended the concluding paragraphs in the context of Fig. 4-6 with additional information, providing a more intuitive understanding of the results presented in the respective chapter. While it is possible to understand analytically how the network gain depends on rate and weight parameters (Eq. 2), this understanding is unfortunately missing in the case of stability. The maximum eigenvalue of the system has a complex relationship with all the parameters, and often has nonlinear dependencies on changes of a parameter (e.g. as we show in Fig. 3iv or one can see in Fig. 6). This does not allow for a deep analytical understanding of the entangling of gain and stability. We now discuss this difficulty at the end of the section “Influence of weight strength on network gain vs stability”.

      • Recommendations for improving the writing and presentation. The Results section is well written overall, but other parts, especially the Introduction and Discussion, would benefit from proof reading - there are many typos and problems with sentence structures and wording (some mentioned below).

      We have gone through the manuscript again and improved the writing.

      The presentation of the dependence on weight in Figure 6 can be improved. For instance, the authors talk about the optimal range of PV connectivity, but this is difficult to appreciate in the current illustration and with the current colour scheme.

      We added a zoom-in to the panels which were hard to distinguish.

      • Minor corrections to the text and figures. Text:

      We thank the reviewer for their thorough reading of our manuscript. We fixed all the issues from below in the manuscript.

      Some examples of bad structure or wording:

      From the Abstract:

      ”We show when E - PV networks recurrently connect with SOM neurons then an SOM mediated modulation that leads to increased neuronal gain can also yield increased network stability.” From Introduction:

      Sentence starting with ”This new circuit reality ...”

      ”Inhibition is been long identified as a physiological or circuit basis for how cortical activity changes depending upon processing or cognitive needs ...”

      Sentence starting with ”Cortical models with both ...”

      ”... allowing SOM neurons the freedom to ..”

      From Results:

      ”... affects of SOM neurons on E ..”

      ”seem in opposition to one another, with SOM neuron activity providing either a source or a relief of E neuron suppression”. The sentence after is also difficult to read and needs to be simplified.

      P. 7: ”We first remark that ...”

      Difficult to read/understand - long and badly structured sentence.

      P. 8: ”adding a recurrent connection onto SOM neurons from the E-PV subcircuit” It’s from E (and not PV) to be more precise (Fig. 5).

      Discussion:

      ”Firstly, E neurons and PV neurons experience very similar synaptic environments.” What does it mean?

      ”Fortunately, PV neurons target both the cell bodies and proximal dendrites” Fortunately for whom or what? ”in line with arge heterogeneity”

      Methods:

      Matrix B is never defined - the diagonal matrix of b (power law exponents) I assume.

      Some of the other notations too, e.g. bs, etc (it’s implicit, but should be explained).

      Structure of sentence:

      ”Network gain is defined as ...” (p. 17)

      Figure:

      The schematics in Figure 4 can be tweaked to highlight the effect of input (rather than other components of the network, which are the same and repetitive), to highlight the main difference for the reader.

    1. Author response:

      Reviewer #1:

      We thank the reviewer for recognizing the impact of our work on the pivotal roles of N-glycan-dependent ERQC in cellular fitness and pathogenicity and providing valuable comments to be considered to improve the manuscript. As suggested, we will rearrange data, reduce text volume, and discuss the possibility of how ERQC mutation decreases EV secretion without significant defect in conventional secretion. Regarding the proteomics data, we have already initiated a comparative analysis of total intracellular and EV-associated proteins to determine whether the reduced cargo loading in the Ugg1 mutant is specific to EV-associated proteins. Additionally, we may extend the analysis to include total secretion, enabling a clearer comparison between classical secretion and EV-mediated secretion to better evaluate the extent of classical secretion defects in the Ugg1 mutant.

      Reviewer #2:

      We sincerely thank the reviewer for the positive evaluation of our work. As recommended, we will reduce the text and reorganize the data to enhance the manuscript's readability.

      Reviewer #3:

      We sincerely thank the reviewer for the high appreciation of our work. As recommended, we will provide a more detailed explanation of the results with improved interpretation, strongly grounded on the obtained data.

    1. Author response:

      Reviewer #1 (Public review):

      Weaknesses:

      (1) The assertion that membrane trafficking is impaired by this variant could be bolstered by additional data.

      We agree with this comment and will perform additional analysis and experiments to support the assertion that membrane trafficking is impaired. As noted by the Reviewers, standard biochemical approaches to obtain such data may be challenging due to the fact that Kv3.1 is expressed in only a subset of cells and that we do not have a Kv3.1-A421V specific antibody.

      (2) In some experiments details such as the age of the mice or cortical layer are emphasized, but in others, these details are omitted.

      We appreciate that the Reviewer has noted this omission. We will include such details in the resubmission.

      (3) The impairments in PV neuron AP firing are quite large. This could be expected to lead to changes in PV neuron activity outside of the hypersynchronous discharges that could be detected in the 2-photon imaging experiments, however, a lack of an effect on PV neuron activity is only loosely alluded to in the text. A more formal analysis is lacking. An important question in trying to understand mechanisms underlying channelopathies like KCNC1 is how changes in membrane excitability recorded at the whole cell level manifest during ongoing activity in vivo. Thus, the significance of this work would be greatly improved if it could address this question.

      Yes, the impairments in neocortical PV-IN excitability are more marked than any other PV interneuronopathy that we have studied. We will include a more extensive analysis of the 2-photon imaging data in the resubmission. However, there are limitations to the inferences that can be made as to firing patterns based on 2-photon calcium imaging data, particularly for interneurons.

      (4) Myoclonic jerks and other types of more subtle epileptiform activity have been observed in control mice, but there is no mention of littermate control analyzed by EEG.

      We did not observe myoclonic jerks in control mice. This data will be included in the resubmission.

      Reviewer #2 (Public review):

      Weaknesses:

      In some experiments, the age of the animal in each experiment is not clearly stated. For example, the experiments in Figure 2 demonstrate impaired K+ conductance and membrane localization, but it is not clear whether they correlated with the excitability and synaptic defects shown in subsequent figures. Similarly, it is unclear at what age the mice underwent EEG recordings, and whether non-epileptic mice are younger than those with seizures.

      We will include explicit information as to the age of the animals used for each experiment in the resubmission.

      The trafficking defect of mutant Kv3.1 proposed in this study is based only on the fluorescence density analysis which showed a minor change in membrane/cytosol ratio. It is not very clear how the membrane component was determined (any control staining?). In addition to fluorescence imaging, an addition of biochemical analysis will make the conclusion more convincing (while it might be challenging if the Kv3.1 is expressed only in PV+ cells).

      We will include additional information in the Methods section as to how the membrane component was determined in a revised version of the manuscript. We agree with Reviewer #2 regarding the limitations in the ability to further evaluate this.

      While the study focused on the superficial layer because Kv3.1 is the major channel subunit, the PV+ cells in the deeper cortical layer also express Kv3.1 (Chow et al., 1999) and they may also contribute to the hyperexcitable phenotype via negative effect on Kv3.2; the mutant Kv3.1 may also block membrane trafficking of Kv3.1/Kv3.2 heteromers in the deeper layer PV cells and reduce their excitability. Such an additional effect on Kv3.2, if present, may explain why the heterozygous A421V KI mouse shows a more severe phenotype than the Kv3.1 KO mouse (and why they are more similar to Kv3.2 KO). Analyzing the membrane excitability differences in the deep-layer PV cells may address this possibility.

      We will include recordings from PV-INs in deeper layers of the neocortex in the revised version of the manuscript, as requested.

      In Table 1, the A421V PV+ cells show a more depolarized resting membrane potential than WT by ~5 mV, which seems a robust change and would influence the circuit excitability. The authors measured firing frequency after adjusting the membrane voltage to -65mV, but are the excitability differences less significant if the resting potential is not adjusted? It is also interesting that such a membrane potential difference is not detected in young adult mice (Table 2). This loss of potential compensation may be important for developmental changes in the circuit excitability. These issues can be more explicitly discussed.

      We will include a more thorough discussion of this finding in the revised version of the manuscript. However, we do not completely understand this finding. It could be compensatory, as suggested by the Reviewer; however, it is transient and seems to be an isolated finding (i.e., there does not appear to be parallel “compensation” in other properties). Alternatively, it could be that impaired excitability of the Kcnc1-A421V/+ PV-INs may reflect impaired/delayed development, which itself is known to be activity-dependent.

      Reviewer #3 (Public review):

      Weaknesses:

      The manuscript identifies a partial mechanism of disease that leaves several aspects unresolved including the possible role of the observed impairments in thalamic neurons in the seizure mechanism. Similarly, while the authors identify a reduction in potassium currents and a reduction in PV cell surface expression of Kv3.1 it is not clear why these impairments would lead to a more severe disease phenotype than other loss-of-function mutations which have been characterized previously. Lastly, additional analysis of video-EEG data would be helpful for interpreting the extent of the seizure burden and the nature of the seizure types caused by the mutation.

      We agree with this comment. We studied neurons in the reticular thalamus as these cells are known to express Kv3.1 and are linked to epilepsy pathogenesis. Yet, we focused on neocortical PV-INs over other Kv3.1-expressing neurons such as neurons of the reticular thalamus because we evaluated the impairments of intrinsic excitability to be more profound in neocortical PV-INs. Crossing Kcnc1-Flox(A421V)/+ mice to a cerebral cortex interneuron-specific driver that would avoid recombination in thalamus – such as Ppp1r2-Cre (RRID:IMSR_JAX:012686) – could assist in determining the relative contribution of thalamic reticular nucleus dysfunction to the overall phenotype, as performed by Makinson et al (2017) to address a similar question. There are of course other Kv3.1-expressing neurons in the brain, including GABAergic interneurons in hippocampus and amygdala. We will include additional discussion in a revised version of the manuscript as to why we think there is more severe impairment in our Kcnc1-Flox(A421V)/+ mice relative to Kv3.1 and Kv3.2 knockout mice. We will include additional data on the epilepsy phenotype in the revised version of the manuscript, as requested.

    1. Author response:

      We thank Dr. Ealand and the Reviewers for their thoughtful comments on our submitted manuscript. We are in the process of revising our manuscript in light of the comments received, outlined below.

      In addition to the requested revisions, we have new data with M. tuberculosis strain H37Rv +/- gidB deletion (and complementation), confirming that deletion of gidB sensitizes the strain to rifampicin, and extending our findings to pathogenic tuberculosis. This will also be incorporated into the revised manuscript.

      Reviewer #1:

      (1) The structural work at the end feels like both an afterthought in terms of the science and the writing. I would suggest re-writing that section to be clearer about what the figure says and does not say. For example, the caption of Figure 6 appears to be more informative than the text and refers to concepts not present in the main text. In general, I found this section to be the most difficult to understand.

      We are rewriting this section to make it more coherent with the rest of the manuscript.

      (2) "delta-gidB" is written out in the caption of Figure 6. Line 234: gidB not italics.

      Thank you, these changes will be incorporated in the revised manuscript.

      Reviewer #2:

      (1) It would be essential to provide information regarding the growth rate and, ideally, translation rates in the gidB KO and the isogenic WT. As translation balances accuracy and speed, only characterising the speed is not sufficient to understand the phenomenon.

      We are performing these assays and will incorporate them in the revised manuscript.

      (2) Cryo-EM analysis of vacant 70S ribosomes is not sufficient for understanding the mechanisms underlying the accuracy defects in the gidB KO. One should assemble and solve structurally near-cognate and non-cognate complexes. I believe the authors are over-interpreting the scant structural data they have. Furthermore, current representation makes it impossible to assess the resolution of the structure, especially in the areas of interest.

      While we agree with the Reviewer that structures of translating ribosomes will be most informative in elucidating the molecular mechanism(s) by which methylation (or not) by GidB contributes to mistranslation, those experiments are ongoing and beyond the scope of the current study. Unlike E. coli ribosomes, for which there are a plethora of structures for mutants available, there are very few structures of mycobacterial ribosomes beyond wild-type apo ribosomes. Therefore we feel that the structures of apo mycobacterial ribosomes +/- GidB-mediated methylation are still of value, and a necessary “first step” for the mechanistic work alluded to above. Secondly, the apo ribosome structures still hint at potential mechanisms by which mistranslation and 16S rRNA methylation may impact on each other – as in the comments to R#1 above, we are revising the text to increase clarity and coherence of this section.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their published observation that providing a lower glucose parenteral nutrition (PN) reduces sepsis from a common pathogen [Staphylococcus epidermidis (SE)] in preterm piglets. Here they found that a higher dose of glucose could thread the needle and get the protective effects of low glucose without incurring significant hypoglycemia. They then investigate whether the change in low glucose PN impacts metabolism to confer this benefit. The finding that lower glucose reduces sepsis is important as sepsis is a major cause of morbidity and mortality in preterm infants, and adjusting PN composition is a feasible intervention.

      Strengths:

      (1) They address a highly significant problem of neonatal sepsis in preterm infants using a preterm piglet model.

      (2) They have compelling data in this paper (and in a previous publication, ref 27) that low glucose PN confers a survival advantage. A downside of the low glucose PN is hypoglycemia which they mitigate in this paper by using a slightly higher amount of glucose in the PN.

      (3) The experiment where they change PN from high to low glucose after infection is very important to determine if this approach might be used clinically. Unfortunately, this did not show an ability to reduce sepsis risk with this approach. Perhaps this is due to the much lower mortality in the high glucose group (~20% vs 87% in the first figure).

      (4) They produce an impressive multiomics data set from this model of preterm piglet sepsis which is likely to provide additional insights into the pathogenesis of preterm neonatal sepsis.

      Weaknesses:

      (1) The high glucose control gives very high blood glucose levels (Figure 1C). Is this the best control for typical PN and glucose control in preterm neonates? Is the finding that low glucose is protective or high glucose is a risk factor for sepsis?

      This work is a follow-up to our previous work where we explored different PN glucose regimens. Taken together, our experiments heavily imply that glucose provision is associated with severity in a seemingly linear manner. In the clinical setting, there is no fixed glucose provision, but guidelines specify ranges that are acceptable. However, these guidelines do not take possible infections into account and are designed to optimize growth outcomes. Increased provision of glucose to preterm neonates may therefore increase their infection risk, but parenteral glucose cannot be entirely avoided as it would lead to hypoglycaemia and associated brain damage. In the present paper the reduced glucose PN reflects the lowest end of the recommended PN glucose intake. More work is needed to figure out the best glucose provision to infected preterm newborns, balancing positive and negative factors.

      (2) In Figure 1B, preterm piglets provided the high glucose PN have 13% survival while preterm piglets on the same nutrition in Figure 6B have ~80% survival. Were the conditions indeed the same? If so, this indicates a large amount of variation in the outcome of this model from experiment to experiment.

      In the follow-up experiment outlined in Figure 6 we reduced the follow-up time to 12 hours in an effort to minimize the suffering of the animals. We did this because we could detect relevant differences in the immune response between high and low glucose infected pigs at 12 hours. If we had extended the follow-up experiment to 22 hours we would likely have seen much higher mortality.

      (3) Piglets on the low glucose PN had consistently lower density of SE (~1 log) across all time points. This may be due to changes in immune response leading to better clearance or it could be due to slower growth in a lower glucose environment.

      We agree with this assessment and have adjusted our result section to reflect this.

      (4) Many differences in the different omics (transcriptomics, metabolomics, proteomics) were identified in the SE-LOW vs SE-HIGH comparison. Since the bacterial load is very different between these conditions, could the changes be due to bacterial load rather than metabolic reprogramming from the low glucose PN?

      We analyzed the relationship between bacterial burdens and mortality and found that it did not correlate within each of the treatment groups. We have now added this data to the results section as supplemental and report this fact in the section called “Reduced glucose supply increases hepatic OXPHOS and gluconeogenesis and attenuates inflammatory pathways”. This finding inspired us to further explore the relationship between bacterial burdens and infection responses in our model, which has resulted in our recent preprint: Wu et al. Regulation of host metabolism and defense strategies to survive neonatal infection. BioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate that a low parenteral glucose regimen can lead to improved bacterial clearance and survival from Staph epi sepsis in newborn pigs without inducing hypoglycemia, as compared to a high glucose regimen. Using RNA-seq, metabolomic, and proteomic data, the authors conclude that this is primarily mediated by altered hepatic metabolism.

      Strengths:

      Well-defined controls for every time point, with multiple time points and biological replicates. The authors used different experimental strategies to arrive at the same conclusion, which lends credibility to their findings. The authors have published the negative findings associated with their study, including the inability to reverse sepsis-related mortality after switching from SE-high to SE-low at 3h or 6h and after administration of hIAIP.

      Weaknesses:

      (1) The authors mention, and it is well-known, that Staph epi is primarily involved in late-onset sepsis. The model of S. epi sepsis used in this study clearly replicates early-onset sepsis, but S. epi is extremely rare in this time period. How do the authors justify the clinical relevance of this model?

      The distinction between early and late onset sepsis makes sense clinically because they are likely to be caused by different organisms and therefore require different empirical antibiotic regimes. Early onset sepsis is caused by organisms transferred perinatally, often following chorioamnionitis or urogenital maternal infections (Strep. agalactiae/E. coli), whereas late onset sepsis is likely caused by organisms from indwelling catheters or mucosal surfaces, most often coagulase negative staphylococci. Timing of an infection after birth of course plays a role, but the virulence factors of the pathogen probably play a large role in shaping the immune response. Therefore, even though the infection in our model is initiated on the first day after birth, the organism that we use, Staph epidermidis, makes it a better model for pathogenesis of late onset sepsis. However, it is also important to acknowledge that the pathophysiology of “sepsis” may be similar despite timing and pathogen and depends on the degree of immune activation and downstream effects on organs.

      (2) The authors find that the neutrophil subset of the leukocyte population is diminished significantly in the SE-low and SE-high populations. However, they conclude on page 10 that "modulations of hepatic, but not circulating immune cell metabolism, by reduced glucose supply..." and this is possible because the authors have looked at the entire leukocyte transcriptome. I am curious about why the authors did not sequence the neutrophil-specific transcriptome.

      We collected the whole blood transcriptome during the experiments, which reflects the transcription profile of all the circulating leucocytes. Since we did not do single cell RNA sequencing during the experiment there is no possibility of isolating the neutrophil transcriptome at this time. Your point however is valid and we will consider incorporating single cell transcriptomics in future experiments.

      (3) The authors use high (30g/k/d) and low (7.2g/k/d) glucose regimens. These translate into a GIR of 21 and 5 mg/k/min respectively. A normal GIR for a preterm infant is usually 5-8, and sometimes up to 10. Do the authors have a "safe GIR" or a threshold they think we cannot cross? Maybe a point where the metabolism switch takes place? They do not comment on this, especially as GIR and glucose levels are continuous variables and not categorical.

      Our reduced glucose PN was chosen as it corresponded with the low end of recommended guidelines for PN glucose intake. There likely is not a “safe GIR” as the clinical responses to glucose intake during infections do not seem binary but increase with glucose intake. It is also important to remember that the reduced glucose intervention still resulted in significant morbidity and a 25% mortality within 22 hours. There is therefore still vast room for improvement, but even though further reduction in PN glucose would probably provide further protection it would entail dangerous hypoglycaemia (as described in our previous paper). The findings in this current paper have prompted us to explore several strategies to replace parenteral glucose with alternative macronutrients. Thus, the optimal PN for infected newborns would probably differ from standard PN in all macronutrients and will require much more preclinical and clinical research.
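      The glucose infusion rates quoted by the Reviewer follow from the daily doses by a unit conversion. The sketch below reproduces that arithmetic, assuming the daily dose is infused continuously and evenly over 24 hours; the function name is hypothetical and chosen only for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # assumes a continuous, evenly spread 24 h infusion


def glucose_infusion_rate(grams_per_kg_per_day: float) -> float:
    """Convert a daily glucose dose (g/kg/day) to a GIR in mg/kg/min."""
    milligrams_per_kg_per_day = grams_per_kg_per_day * 1000
    return milligrams_per_kg_per_day / MINUTES_PER_DAY


# Doses taken from the Reviewer's comment: 30 and 7.2 g/kg/day.
for label, dose in (("high", 30.0), ("low", 7.2)):
    rate = glucose_infusion_rate(dose)
    print(f"{label}: {dose} g/kg/d = {rate:.1f} mg/kg/min")
```

      Under this assumption the high regimen works out to about 20.8 mg/kg/min and the low regimen to 5.0 mg/kg/min, in line with the rounded values of 21 and 5 given in the comment.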

      (4) In Figures 2B and C the authors show that SE-high and SE-low animals have differences in the oxphos, TCA, and glycolytic pathways. The authors themselves comment in the Supplementary Table S1B, E-F that these same metabolic pathways are also different in the Con-Low and Con-high animals, it is just the inflammatory pathways that are not different in the non-infected animals. How can they then justify that it is these metabolic pathways specifically which lead to altered inflammatory pathways, and not just the presence of infection along with some other unfound mechanism?

      It is to be expected that the inflammatory pathways do not differ between the Con-Low and Con-High groups as there is no infection to induce these pathways. The identified metabolic pathways that differ between SE-High and SE-Low animals seem to us the best explanation of the differences in clinical phenotype.

      (5) The authors mention in Figure 1F that SE-low animals had lower bacterial burdens than SE-high animals, but then go on to infer that the inflammatory cytokine differences are attributed to a rewiring of the immune response. However, they have not normalized the cytokine levels to the bacterial loads, as the differences in the cytokines might be attributed purely to a difference in bacterial proliferation/clearing.

      Please see our response to reviewer #1

      (6) The authors mention that switching from SE-high to SE-low at 3 or 6 h time points does not reduce mortality. Have the authors considered the reverse? Does hyperglycemia after euglycemia initially, worsen mortality? That would really conclude that there is some metabolic reprogramming happening at the very onset of sepsis and it is a lost battle after that.

      A very good point that we have not explored yet, we have added this consideration to the discussion and slightly amended our conclusions of this follow-up experiment.

      Reviewer #3 (Public Review):

      Summary:

      Baek and colleagues present important follow-up work on the role of serum glucose in the management of neonatal sepsis. The authors previously showed high glucose administration exacerbated neonatal sepsis, while strict glucose control improved outcomes but caused hypoglycemia. In the current report they examined the effect of a more tailored glucose management approach on outcomes and examined hepatic gene expression, plasma metabolome/proteome, blood transcriptome, as well as the therapeutic impact of hIAIP. The authors leverage multiple powerful approaches to provide robust descriptive accounts of the physiologic changes that occur with this model of sepsis in these various conditions.

      Strengths:

      (1) Use of preterm piglet model.

      (2) Robust, multi-pronged approach to address both hepatic and systemic implications of sepsis and glucose management.

      (3) Trial of therapeutic intervention - glucose management (Figure 6), hIAIP (Figure 7).

      Weaknesses:

      (1) The translational role of the model is in question. CONS is rarely if ever a cause of EOS in preterm neonates. The model uses preterm pigs exposed at 2 hours of age. This model most likely replicates EOS.

      Please see our response to Reviewer #2

      (2) Throughout the manuscript it is difficult to tell from which animals the data are derived. Given the ~90% mortality in the experimental CONS group, and 25% mortality in the intervention group, how are the data from animals "at euthanasia" considered? Meaning - are data from survivors and those euthanized grouped together? This should be clarified as biologically these may be very different populations (ie, natural survivor vs death).

      This is a very valid point. For all endpoints that are analyzed “at euthanasia” the age of the animal will vary. Some will have been euthanized early due to clinical deterioration and some will have survived all the way to the end of the experiment. This needs to be kept in mind when interpreting the results. We have further highlighted this point in the discussion and made it clear to the reader at what time-point each analysis was performed.

      (3) With limited time points (at euthanasia ) for hepatic transcriptomics (Figure 2), plasma metabolite (Figure 3) blood transcriptome (Figure 4), and plasma proteome (Figure 5) it is difficult to make conclusions regarding mechanisms preceding euthanasia. Per methods, animals were euthanized with acidosis or clinical decompensation. Are the reported findings demonstrative of end-organ failure and deterioration leading to death, or reflective of events prior?

      Yes, all organ specific endpoints are snapshots of the state of the animals at the time of euthanasia, pooling together animals that succumbed to sepsis and those that survived to 22 hours post infection. These results therefore reflect the end-state of the infection, and we cannot be sure when the differences between groups manifested themselves. However, given the stark differences in plasma lactate at 12 hours post infection it is likely that changes to metabolism occurred before most of the animals succumbed to sepsis.

      We agree this is a weakness in our model, but we have since published a pre-print where we have further explored how metabolic adaptations shape the fate of similarly infected preterm pigs: BioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      (4) Data are descriptive without corresponding "omics" from interventions (glucose management and/or hIAIP) or at least targeted assessment of key differences.

      We only did in-depth analysis of the glucose intervention as this showed the most promising clinical effects that warranted further in-depth investigation. It is possible that further insights could be gained from in-depth analysis of the other interventions but given that there were no obvious clinical benefits we refrained from that.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I am intrigued that mortality was not correlated to bacterial burden. Please provide the "data not shown" as this would help the reader understand better whether the difference in bacterial burden is driving the phenotypes and findings of the low glucose group.

      We have added this data to supplementary figure 1.  

      Reviewer #2 (Recommendations For The Authors):

      (1) I would urge the authors to consider a neutrophil-specific transcriptomic analysis. I understand that this would add significantly to the resubmission process. If the authors wish to include that as a future direction instead, they need to specifically mention the limitations of whole blood transcriptomics and how different immune cell types react differently to bacterial antigens.

      We agree with your considerations but we cannot include that data using the whole blood method applied in the experiment. We have added your consideration to the discussions.

      (2) I urge the authors to remove any impression that this is a model of late-onset sepsis, which is implied from the introduction, lines 3 and 4.

      Our intention was not to directly suggest that our model is a perfect reflection of late-onset sepsis but rather to highlight the relevance of using a pathogen commonly associated with LOS. We believe our model primarily captures the effects of intense pro-inflammatory immune activation, which may have parallels with various forms of sepsis, including LOS.

      Reviewer #3 (Recommendations For The Authors):

      Drawing on the robust nature of your "omics", identify key measures and test whether they are altered earlier in the development of clinical sepsis. Test whether these are altered by the intervention.

      A very valid point. At the moment it is not possible for us to explore this within the confines of these experiments. But, building upon these findings and the ones in our recent preprint, we are confident that shifts in the hepatic ratio of oxidative phosphorylation and gluconeogenesis vs glycolysis shape the immune response to infections in neonates. In our upcoming experiments we are planning to incorporate plasma metabolomics at earlier timepoints to monitor when shifts in metabolism occur. However, given the heterogeneity of pigs, as opposed to inbred rodent models, sacrificing animals at fixed timepoints to gauge their organ function will be hard to interpret as it is impossible to know what the end state of the particular animal would have been. Therefore, longitudinal sampling of liver tissue during the course of infection would be challenging.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In "Drift in Individual Behavioral Phenotype as a Strategy for Unpredictable Worlds," Maloy et al. (2024) investigate changes in individual responses over time, referred to as behavioral drift within the lifespan of an animal. Drift, as defined in the paper, complements stable behavioral variation (animal individuality/personality within a lifetime) over shorter timeframes, which the authors associate with an underlying bet-hedging strategy. The third timeframe of behavioral variability that the authors discuss occurs within seasons (across several generations of some insects), termed "adaptive tracking." This division of "adaptive" behavioral variability over different timeframes is intuitively logical and adds valuable depth to the theoretical framework concerning the ecological role of individual behavioral differences in animals.

      Strengths:

      While the theoretical foundations of the study are strong, the connection between the experimental data (Figure 1) and the modeling work (Figure 2-4) is less convincing.

      Weaknesses:

      In the experimental data (Figure 1), the authors describe the changes in behavioral preferences over time. While generally plausible, I identify three significant issues with the experiments:

      (1) All of the subsequent theoretical/simulation data is based on changing environments, yet all the experiments are conducted in unchanging environments. While this may suffice to demonstrate the phenomenon of behavioral instability (drift) over time, it does not properly link to the theory-driven work in changing environments. An experiment conducted in a changing environment and its effects on behavioral drift would improve the manuscript's internal consistency and clarify some points related to (3) below.

      In our framework, we posit that the amount of drift has been shaped by evolution to maximize fitness in the environments that the population has experienced, and this drift is observed independent of environment. While we agree that exploring the role of changing environments on the measure of drift would be interesting, we would anticipate the effects may be nuanced and beyond the scope of the current paper (and the scope of our theoretical work, which assumes that the individual phenotype is unaffected by change of environment except as mediated by death due to fitness effects). For example, it would be difficult to differentiate drift from idiosyncratic differences in learning (Smith et al., 2022), and non-adaptive plasticity to unrelated cues has been posited as a method of producing diverse phenotypes (Maxwell and Magwene, 2017), so “learning” to uncorrelated stimuli could conceivably be a mechanism for drift. Given the scope of the current study, we prioritized eliminating potential confounds for measuring drift, but remain interested in the interaction between learning and drift.

      (2) The temporal aspect of behavioral instability. While the analysis demonstrates behavioral instability, the temporal dynamics remain unclear. It would be helpful for the authors to clarify (based on graphs and text) whether the behavioral changes occur randomly over time or follow a pattern (e.g., initially more right turns, then more left turns). A proper temporal analysis and clearer explanations are currently missing from the manuscript.

      We agree it would be helpful to have more description of the dynamics over time aside from the power spectrum and autoregressive model fits. We hope to address this in more detail to provide more description of the changes over time in a revision.

      (3) The temporal dimension leads directly into the third issue: distinguishing between drift and learning (e.g., line 56). In the neutral stimuli used in the experimental data, changes should either occur randomly (drift) or purposefully, as in a neutral environment, previous strategies do not yield a favorable outcome. For instance, the animal might initially employ strategy A, but if no improvement in the food situation occurs, it later adopts strategy B (learning). In changing environments, this distinction between drift and learning should be even more pronounced (e.g., if bananas are available, I prefer bananas; once they are gone, I either change my preference or face negative consequences). Alternatively, is my random choice of grapes the substrate for the learning process towards grapes in a changing environment? Further clarification is needed to resolve these potential conflicts.

      As in our response to point 1, we believe this is a crucial distinction, and we intend to further highlight it in the discussion in the revision and further expand our discussion of how the two strategies may interact.

      Reviewer #2 (Public review):

      Summary:

      This is an inspired study that merges the concept of individuality with evolutionary processes to uncover a new strategy that diversifies individual behavior that is also potentially evolutionarily adaptive.

      The authors use a time-resolved measurement of spontaneous, innate behavior, namely handedness or turn bias in individual, isogenic flies, across several genetic backgrounds.

      They find that an individual's behavior changes over time, or drifts. This has been observed before, but what is interesting here is that by looking at multiple genotypes, the authors find the amount of drift is consistent within genotype i.e., genetically regulated, and thus not entirely stochastic. This is not in line with what is known about innate, spontaneous behaviors. Normally, fluctuations in behavior would be ascribed to a response to environmental noise. However, here, the authors go on to find what is the pattern or rule that determines the rate of change of the behavior over time within individuals. Using modeling of behavior and environment in the context of evolutionarily important timeframes such as lifespan or reproductive age, they could show when drift is favored over bet-hedging and that there is an evolutionary purpose to behavioral drift. Namely, drift diversifies behaviors across individuals of the same genotype within the timescale of lifespan, so that the genotype's chance for expressing beneficial behavior is optimally matched with potential variation of environment experienced prior to reproduction. This ultimately increases the fitness of the genotype. Because they find that behavioral drift is genetically variable, they argue it can also evolve.

      Strengths:

      Unlike most studies of individuality, in this study, the authors consider the impact of individuality on evolution. This is enabled by the use of multiple natural genetic backgrounds and an appropriately large number of individuals to come to the conclusions presented in the study. I thought it was really creative to study how individual behavior evolves over multiple timescales. And indeed this approach yielded interesting and important insight into individuality. Unlike most studies so far, this one highlights that behavioral individuality is not a static property of an individual, but it dynamically changes. Also, placing these findings in the evolutionary context was beneficial. The conclusion that individual drift and bet-hedging are differently favored over different timescales is, I think, a significant and exciting finding.

      Overall, I think this study highlights how little we know about the fundamental, general concepts behind individuality and why behavioral individuality is an important trait. They also show that with simple but elegant behavioral experiments and appropriate modeling, we could uncover fundamental rules underlying the emergence of individual behavior. These rules may not at all be apparent using classical approaches to studying individuality, using individual variation within a single genotype or within a single timeframe.

      Weaknesses:

      I am unconvinced by the claim that serotonin neuron circuits regulate behavioral drift, especially because of its bidirectional effect and lack of relative results for other neuromodulators. Without testing other neuromodulators, it will remain unclear if serotonin intervention increases behavioral noise within individuals, or if any other pharmacological or genetic intervention would do the same. Another issue is that the amount of drugs that the individuals ingested was not tracked. Variable amounts can result in variable changes in behavior that are more consistent with the interpretation of environmental plasticity, rather than behavioral drift. With the current evidence presented, individual behavior may change upon serotonin perturbation, but this does not necessarily mean that it changes or regulates drift.

      However, I think for the scope of this study, finding out whether serotonin regulates drift or not is less important. I understand that today there is a strong push to find molecular and circuit mechanisms of any behavior, and other peers may have asked for such experiments, perhaps even simply out of habit. Fortunately, the main conclusions derived from behavioral data across multiple genetic backgrounds and the modeling are anyway novel, interesting, and in fact more fundamental than showing if it is serotonin that does it or not.

      We agree that our data do not support a strong conclusion that serotonin plays a privileged role in regulating drift. Based on previous literature (e.g. Kain et al., 2014, where identical pharmacological manipulations had an effect on variability while dopaminergic and octopaminergic manipulations did not), we think that the large global perturbations in serotonin that we observe are likely to influence plasticity that might be involved in drift (and thus find the results we observe not particularly surprising). Nonetheless, we agree that the mechanism by which serotonin may affect drift could be indirect, and it is similarly plausible that many global perturbations could lead to some shift in the amount of drift. We intend to further discuss these issues in the revision.

      To this point, one thing that was unclear from the methods section is whether genotypes that were tested were raised in replicate vials and how was replication accounted for in the analyses. This is a crucial point - the conclusion that genotypes have different amounts of behavioral drift cannot be drawn without showing that the difference in behavioral drift does not stem from differences in developmental environment.

      While a cursory inspection suggests that batch effects between different replicates were small, we intend to clarify this and more explicitly address the effects of replicates in revision.

      Reviewer #3 (Public review):

      Summary:

      The paper begins by analyzing the drift in individual behavior over time. Specifically, it quantifies the circling direction of freely walking flies in an arena. The main takeaway from this dataset is that while flies exhibit an individual turning bias (when averaged over time), their preferences fluctuate over slow timescales.

      To understand whether genetic or neuromodulatory mechanisms influence the drift in individual preference, the authors test different fly strains concluding that both genetic background and the neuromodulator serotonin contribute to the degree of drift.

      Finally, the authors use theoretical approaches to identify the range of environmental conditions under which drift in individual bias supports population growth.

      Strengths:

      The model provides a clear prediction of the environmental fluctuations under which a drift in bias should be beneficial for population growth.

      The approach attempts to identify genetic and neurophysiological mechanisms underlying drift in bias.

      Weaknesses:

      Different behavioral assays are used and are differently analysed, with little discussion on how these behaviors and analyses compare to each other.

      We intend to address this in a revision of the discussion.

      Some of the model assumptions should be made more explicit to better understand which aspects of the behaviors are covered.

      We will further clarify the assumptions of the model in revision.

  2. Nov 2024
    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Urination requires precise coordination between the bladder and external urethral sphincter (EUS), while the neural substrates controlling this coordination remain poorly understood. In this study, Li et al. identify estrogen receptor 1-expressing neurons (ESR1+) in Barrington's nucleus as key regulators that faithfully initiate or suspend urination. Results from peripheral nerve lesions suggest that BarEsr1 neurons play independent roles in controlling bladder contraction and relaxation of the EUS. Finally, the authors performed region-specific retrograde tracing, claiming that distinct populations of BarEsr1 neurons target specific spinal nuclei involved in regulating the bladder and EUS, respectively.

      Strength:

      Overall, the work is of high quality. The authors integrate several cutting-edge technologies and sophisticated, thorough analyses, including opto-tagged single unit recordings, combined optogenetics, and urodynamics, particularly those following distinct peripheral nerve lesions.

      Weakness:

      (1) My major concern is the novelty of this study. Keller et al. 2018 have shown that BarEsr1 neurons are active during urination and play an essential role in relaxing the external urethral sphincter (EUS). Minimally, substantial content that merely confirms previous findings (e.g. Figures 1A-E; Figures 3A-E) should be moved to the supplementary datasets.

      Indeed, we are aware of and have carefully studied the work of Keller et al. Our manuscript presents novel experiments beyond the scope of that paper. In response to this comment, we will substantially revise our manuscript to enhance the visibility of the novel data while moving the data that agree with previous findings to the supplementary material.

      (2) I also have concerns regarding the results showing that the inactivation of BarEsr1 neurons led to the cessation of EUS muscle firing (Figures 2G and S5C). As shown in the cartoon illustration of Figure 8, spinal projections of BarEsr1 neurons contact interneurons (presumably inhibitory) that innervate motor neurons, which in turn excite the EUS. I would therefore expect that the inactivation of BarEsr1 should shift the EUS firing pattern from phasic (as relaxation) to tonic (removal of relaxation), rather than stopping their firing entirely. Could the authors comment on this and provide potential reasons or mechanisms for this finding?

      We agree with this point. We meant that the phasic bursting pattern of the EUS stopped rapidly upon BarEsr1 photoinhibition, not that all firing stopped instantaneously. According to previous studies (Chang et al., 2007, de Groat, 2009, de Groat and Yoshimura, 2015, Kadekawa et al., 2016), the voiding physiology of rodents probably differs from that of humans: in rodents, urine is pumped out stepwise in the gaps between consecutive EUS phasic bursting epochs, whereas in humans, urine is pumped out continuously once EUS firing is almost fully inhibited for a period of time. That is, in mice the EUS displays sustained tonic activity following phasic bursting, while in humans the EUS keeps firing tonically until the moment of voiding onset (complete inhibition, muscle relaxed). Despite these prominent differences in basic physiological properties, we assume that the logic of the circuits from the brainstem to the urethra in this pathway is evolutionarily conserved; thus the logic of brainstem coordination of voiding could also be the same for both species, which is the main interest of our study (using an animal model to address concerns of human health). To make our data accessible to a broader audience, we used a simplified and inaccurate expression. We apologize for the inaccuracy and will correct the description in the revised manuscript.

      (3) Current evidence is insufficient to support the claim that the majority of BarEsr1 neurons innervate the SPN but not DGC. The current spinal images are uninformative, as the fluorescence reflects the distribution of Esr1- or Crh-expressing neurons in the spinal cord, along with descending BarEsr1 or BarCrh axons. Given the close anatomical proximity of these two nuclei, a more thorough histological analysis is required to demonstrate that the spinal injections were accurately confined to either the SPN or the DGC.

      We agree that current evidence is insufficient to support the current claim. To address this concern and strengthen our claim, we will repeat the retrograde viral tracing experiments, combined with CTB647 injections to label the injection site, to validate specific targeting of SPN or DGC populations. We will also add higher-magnification imaging to distinguish BarESR1 axonal projections targeting SPN versus DGC. Results from these ongoing experiments will be incorporated into the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      The authors have performed a rigorous study to assess the role of ESR1+ neurons in the PMC to control the coordination of bladder and sphincter muscles during urination. This is an important extension of previous work defining the role of these brainstem neurons, and convincingly adds to the understanding of their role as master regulators of urination. This is a thorough, well-done study that clarifies how the Pontine micturition center coordinates different muscle groups for efficient urination, but there are some questions and considerations that remain.

      Strengths:

      These data are thorough and convincing in showing that ESR1+PMC neurons exert coordinated control over both the bladder and sphincter activity, which is essential for efficient urination. The anatomical distinctions in pelvic versus pudendal control are clear, and it's an advance to understand how this coordination occurs. This work offers a clearer picture of how micturition is driven.

      Weaknesses:

      The dynamics of how this population of ESR1+ neurons is engaged in natural urination events remains unclear. Not all ESR1+neurons are always engaged, and it is not measured whether this is simply variation in population activity, or if more neurons are engaged during more intense starting bladder pressures, for instance. In particular, the response dynamics of single and doubly-projecting neurons are not defined. Additionally, the model for how these neurons coordinate with CRH+ neuron activity in the PMC is not addressed, although these cell types seem to be engaged at the same time. Lastly, it would be interesting to know how sensory input can likely modulate the activity of these neurons, but this is perhaps a future direction.

      In response to the reviewer’s comments, we will attempt to perform the following revisions for this round:

      (1) Engagement of ESR1+ neurons in natural urination events:

      We agree that probably not all ESR1+ neurons are consistently engaged during urination. To address this, we will perform a detailed analysis of the opto-tagged single unit recordings data.

      (2) Response dynamics of single- and doubly-projecting neurons:

      (a) We will use retrograde labelling combined with Ca2+ photometry recordings to differentiate the response dynamics of SPN- and DGC-projecting neurons during urination.

      (b) We will perform functional validations to assess the specific roles of single- and doubly-projecting neurons in coordinating bladder and EUS activity.

      (3) Coordination with CRH+ neurons in the PMC:

      We appreciate the suggestion to include CRH+ neurons in our model. We will expand our model to incorporate CRH+ neurons and their potential interactions with ESR1+ neurons.

      (4) Sensory modulation of ESR1+ neurons:

      The reviewer raises an excellent point regarding sensory input modulation of ESR1+ neuron activity. Although this is beyond the scope of our current study, we recognize its importance and propose to include this as a future direction.

      Reviewer #3 (Public review):

      Summary:

      The paper by Li et al explored the role of Estrogen receptor 1 (Esr1) expressing neurons in the pontine micturition center (PMC), a brainstem region also known as Barrington's nucleus (Hou et al 2016, Keller et al 2018). First, the authors conducted bulk Ca2+ imaging/unit recording from PMCESR1 to investigate the correlations of PMCESR1 neural activity to voiding behavior in conscious mice and to bladder pressure/external urethral muscle activity in urethane-anesthetized mice. Next, the authors conducted optogenetic inactivation/activation of PMCESR1 to confirm the contribution to voiding behavior, and also conducted peripheral nerve transection together with optogenetic activation to confirm the independent control of bladder pressure and urethral sphincter muscle.

      Weaknesses:

      (1) The study demonstrates that pelvic nerve transection reduces urinary volume triggered by PMCESR1+ cell photoactivation in freely moving mice. Could the role of pudendal nerve transection also be examined in awake mice to provide a more comprehensive understanding of neural involvement?

      Thank you for the suggestion. Pudendal nerve transection in awake mice is indeed a challenging experiment that we have not yet performed. We will attempt it for the revision.

      (2) While the paper primarily focuses on PMCESR1+ cells in bladder-sphincter coordination, the analysis of PMCESR1+-DGC/SPN neural circuits - given their distinct anatomical projections in the sacral spinal cord - feels underexplored. How do these circuits influence bladder and sphincter function when activated or inhibited? Also, do you have any tracing data to confirm whether bladder-sphincter innervation comes from distinct spinal nuclei?

      Thank you for this great comment. The projection-specific analysis of neuronal function, also suggested by Reviewer 2 in a similar comment (#8), is missing from our first submission. These experiments are challenging and were not completed in our first round of tests, but we have decided to pursue this goal again. Specifically, we will perform photometry recordings of PMC neurons projecting to the DGC/SPN while measuring bladder pressure and urethral sphincter EMG activity. Additionally, while our study does not include direct tracing data to confirm distinct spinal nuclei for bladder and sphincter innervation, this has been well documented in the classic literature (Yao et al., 2018, Karnup and De Groat, 2020, Karnup, 2021). Specifically, anatomical studies have shown that the SPN primarily innervates the bladder, while the DGC is associated with the innervation of the urethral sphincter. We will cite these references to provide context and support for our interpretations.

      (3) Although the paper successfully identifies the physiological role of PMCESR1+ cells in bladder-sphincter coordination, the study falls short in examining the electrophysiological properties of PMCESR1+-DGC/SPN cells. A deeper investigation here would strengthen the findings.

      While our study primarily focuses on the functional role of PMCESR1+ neurons in bladder-sphincter coordination, we acknowledge that understanding their intrinsic electrophysiological characteristics could further strengthen our findings. However, this aspect falls beyond the scope of the current study. Nevertheless, we recognize the significance of this direction and are excited to pursue it in future research. We appreciate the reviewer’s suggestion, as it highlights an important avenue for expanding upon our current findings.

      (4) The parameters for photoactivation (blue light pulses delivered at 25 Hz for 15 ms, every 30 s) and photoinhibition (pulses at 50 Hz for 20 ms) vary. What drove the selection of these specific parameters? Moreover, for photoactivation experiments, the change in pressure (ΔP = P5 sec - P0 sec) is calculated differently from photoinhibition (Δpressure = Ppeak - Pmin). Can you clarify the reasoning behind these differing approaches?

      We sincerely thank the reviewer for raising these important points and for the opportunity to clarify our experimental design and data analysis methods.

      Photoactivation versus photoinhibition parameters: The differences in photoactivation (25 Hz, 15 ms pulses) and photoinhibition (50 Hz, 20 ms pulses) protocols are based on the distinct physiological and technical requirements for activating versus inhibiting PMCESR1+ neurons. For photoactivation, 25 Hz stimulation aligns with the natural firing patterns of central neurons, allowing for intermittent activation without exceeding the neuronal refractory period. The shorter pulse duration (15 ms) minimizes phototoxicity and avoids overstimulation, as performed in previous studies (Keller et al., 2018). In contrast, photoinhibition requires sustained suppression of neuronal activity, achieved through higher frequencies (50 Hz) and longer pulses (20 ms) to ensure continuous coverage of neuronal activity.
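
      As a rough check on the two protocols, the duty cycle (the fraction of each stimulation period during which the light is on) follows from frequency times pulse width. The sketch below is illustrative only: it assumes rectangular pulses, uses just the frequencies and pulse widths quoted above, and the function name is hypothetical.

```python
def duty_cycle(frequency_hz: float, pulse_width_ms: float) -> float:
    """Fraction of each stimulation period during which the light is on."""
    period_ms = 1000.0 / frequency_hz
    return pulse_width_ms / period_ms


# Values quoted in the response; rectangular pulses are an assumption.
activation = duty_cycle(25, 15)  # 40 ms period, 15 ms on
inhibition = duty_cycle(50, 20)  # 20 ms period, 20 ms on

print(f"photoactivation duty cycle: {activation:.3f}")
print(f"photoinhibition duty cycle: {inhibition:.3f}")
```

      Under these assumptions the photoinhibition protocol has a duty cycle of 1, that is, effectively continuous illumination, which is consistent with the stated aim of continuous coverage.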

      Calculation of pressure changes (ΔP) for photoactivation and photoinhibition: The differing methods for calculating pressure changes reflect the distinct physiological effects we aimed to capture. In photoactivation experiments (ΔP = P5 sec - P0 sec), the pressures before (P0 sec) and 5 seconds after (P5 sec) light delivery were compared to capture the immediate effect of light activation on bladder pressure, focusing on the onset and early dynamics of activation. In contrast, photoinhibition experiments assessed the immediate impact of light-induced suppression on bladder pressure during an ongoing voiding event. Here, Δpressure was calculated as Ppeak – Pmin to measure the rapid drop in pressure directly attributable to neuronal inhibition.
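
      The two definitions can be written out as a minimal sketch. The function names, the nearest-sample lookup, the analysis window for photoinhibition, and the synthetic traces are all assumptions made for illustration; they are not taken from the manuscript's analysis code.

```python
from typing import Sequence


def _nearest_index(times_s: Sequence[float], t: float) -> int:
    """Index of the sample whose timestamp is closest to t."""
    return min(range(len(times_s)), key=lambda i: abs(times_s[i] - t))


def delta_p_activation(times_s, pressure, light_on_s, window_s=5.0):
    """Photoactivation: pressure 5 s after light onset minus pressure at onset."""
    p0 = pressure[_nearest_index(times_s, light_on_s)]
    p5 = pressure[_nearest_index(times_s, light_on_s + window_s)]
    return p5 - p0


def delta_p_inhibition(times_s, pressure, start_s, end_s):
    """Photoinhibition: Ppeak - Pmin inside the window [start_s, end_s].

    Assumption: Pmin is searched only at or after the peak, so the value
    reflects a drop in pressure rather than the rise leading up to the peak.
    """
    idx = [i for i, t in enumerate(times_s) if start_s <= t <= end_s]
    peak_i = max(idx, key=lambda i: pressure[i])
    p_min = min(pressure[i] for i in idx if i >= peak_i)
    return pressure[peak_i] - p_min


# Synthetic traces in arbitrary pressure units, sampled at 1 Hz.
t = [float(i) for i in range(11)]
activation_trace = [5, 5, 5, 8, 12, 16, 20, 22, 22, 21, 20]
inhibition_trace = [5, 10, 20, 30, 28, 15, 8, 6, 6, 7, 8]

dp_act = delta_p_activation(t, activation_trace, light_on_s=2.0)
dp_inh = delta_p_inhibition(t, inhibition_trace, start_s=3.0, end_s=8.0)
print(dp_act, dp_inh)
```

      The first measure is anchored to light onset, whereas the second is anchored to the pressure peak of an ongoing void, which is why the two cannot share a single formula.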

      We will expand these details in the methods section of the revised manuscript to provide greater transparency.

      (5) The discussion could further emphasize how PMCESR1+ cells coordinate bladder contraction and sphincter relaxation to control urination, highlighting their central role in the initiation and suspension of this process.

      We fully agree with this point. Additionally, in response to your and other reviewers’ suggestions, we are preparing a new round of experiments with projection-specific recording, and thus our discussion and conclusion will also be updated according to the newly obtained data.

      (6) In Figure 8, the authors analyze the temporal sequence of bladder pressure and EUS bursting during natural voiding and PMC activation-induced voiding. It would be acceptable to consider the existence of a lower spinal reflex circuit; however, the interpretation of the data contains speculation. It is hard to say that bladder pressure measurement reflects efferent pelvic nerve activity in real time. (As a biological system, bladder contraction is mediated by smooth muscle, and does not reflect real-time efferent pelvic nerve activity. As an experimental set-up, the bladder pressure measurement lags the actual bladder pressure because of the tubing, but EUS bursting has no delay.) Especially for the inactivation experiment, these factors would contribute to the interpretation of data. This reviewer recommends a rewrite of the section considering these limitations. Most of the section is suitable for the results.

      Thank you for mentioning the possibility of a delay in the bladder pressure measurement. We would prefer to perform a physical control test to quantify how large this delay is under our experimental conditions. We will use a small balloon to mimic the bladder and two identical pressure sensors, one with a very short tube inserted into the balloon and one with an extended tube, the same as in our animal experiments. We will then mimic both contraction initiation and halting, and quantify the delay between the two sensors.
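
      One simple way to quantify such a delay is to compare the time at which each sensor crosses the midpoint between its baseline and its peak. The sketch below is only an illustration of that idea: the half-maximum criterion, the function names, the sampling rate, and the 40-sample shift in the synthetic data are all assumptions, not measured values.

```python
import math


def half_max_index(trace):
    """Index of the first sample above the midpoint between baseline and peak.

    Assumption: the first sample is taken as the resting baseline.
    """
    baseline = trace[0]
    threshold = baseline + 0.5 * (max(trace) - baseline)
    for i, value in enumerate(trace):
        if value > threshold:
            return i
    raise ValueError("trace never crosses its half-maximum")


def onset_delay_ms(short_tube, long_tube, sampling_hz):
    """Delay of the long-tube sensor relative to the short-tube sensor."""
    lag_samples = half_max_index(long_tube) - half_max_index(short_tube)
    return lag_samples * 1000.0 / sampling_hz


# Synthetic example: a sigmoidal pressure rise sampled at 1 kHz, and a copy
# shifted by 40 samples to stand in for the tubing delay (hypothetical value).
fs = 1000
short_tube = [1.0 / (1.0 + math.exp(-(i - 300) / 10.0)) for i in range(1000)]
long_tube = [short_tube[max(i - 40, 0)] for i in range(1000)]

delay = onset_delay_ms(short_tube, long_tube, fs)
print(f"estimated onset delay: {delay:.1f} ms")
```

      The same comparison could be applied to the falling phase to estimate the delay at contraction halting. Real tubing may also smooth the waveform rather than simply shift it, in which case a single delay value is only an approximation.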

      References

      • Chang HY, Cheng CL, Chen JJJ, de Groat WC. 2007. Serotonergic drugs and spinal cord transections indicate that different spinal circuits are involved in external urethral sphincter activity in rats. American Journal of Physiology-Renal Physiology 292: F1044-F1053. DOI: 10.1152/ajprenal.00175.2006

      • de Groat WC. 2009. Integrative control of the lower urinary tract: preclinical perspective. British Journal of Pharmacology 147. DOI: 10.1038/sj.bjp.0706604

      • de Groat WC, Yoshimura N. 2015. Anatomy and physiology of the lower urinary tract. Handb Clin Neurol 130: 61-108. DOI: 10.1016/B978-0-444-63247-0.00005-5

      • Kadekawa K, Yoshimura N, Majima T, Wada N, Shimizu T, Birder LA, Kanai AJ, de Groat WC, Sugaya K, Yoshiyama M. 2016. Characterization of bladder and external urethral activity in mice with or without spinal cord injury—a comparison study with rats. American Journal of Physiology-Regulatory, Integrative and Comparative Physiology 310: R752-R758. DOI: 10.1152/ajpregu.00450.2015

      • Karnup S. 2021. Spinal interneurons of the lower urinary tract circuits. Autonomic Neuroscience 235. DOI: 10.1016/j.autneu.2021.102861

      • Karnup SV, De Groat WC. 2020. Mapping of spinal interneurons involved in regulation of the lower urinary tract in juvenile male rats. IBRO Rep 9: 115-131. DOI: 10.1016/j.ibror.2020.07.002

      • Keller JA, Chen J, Simpson S, Wang EH-J, Lilascharoen V, George O, Lim BK, Stowers L. 2018. Voluntary urination control by brainstem neurons that relax the urethral sphincter. Nature Neuroscience 21: 1229-1238. DOI: 10.1038/s41593-018-0204-3

      • Yao J, Zhang Q, Liao X, Li Q, Liang S, Li X, Zhang Y, Li X, Wang H, Qin H, Wang M, Li J, Zhang J, He W, Zhang W, Li T, Xu F, Gong H, Jia H, Xu X, Yan J, Chen X. 2018. A corticopontine circuit for initiation of urination. Nature Neuroscience 21: 1541-1550. DOI: 10.1038/s41593-018-0256-4

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study aimed to better understand the role of the H3 protein of the Monkeypox virus (MPXV) in host cell adhesion, identifying a crucial α-helical domain for interaction with heparan sulfate (HS). Using a combination of advanced computational simulations and experimental validations, the authors discovered that this domain is essential for viral adhesion and potentially a new target for developing antiviral therapies.

      Strengths:

      The study's main strengths include the use of cutting-edge computational tools such as AlphaFold2 and molecular dynamics simulations, combined with robust experimental techniques like single-molecule force spectroscopy and flow cytometry. These methods provided a detailed and reliable view of the interactions between the H3 protein and HS. The study also highlighted the importance of the α-helical domain's electric charge and the influence of the Mg(II) ion in stabilizing this interaction. The work's impact on the field is significant, offering new perspectives for developing antiviral treatments for MPXV and potentially other viruses with similar adhesion mechanisms. The provided methods and data are highly useful for researchers working with viral proteins and protein-polysaccharide interactions, offering a solid foundation for future investigations and therapeutic innovations.

      Weaknesses:

      However, some limitations are notable. Despite the robust use of computational methodologies, the limitations of this approach are not discussed, such as potential sources of error, standard deviation rates, and known controls for the H3 protein to justify the claims. Additionally, validations with methodologies like X-ray crystallography would further benefit the visualization of the H3 and HS interaction.

      Thank you very much for the evaluation and appreciation of our work. In response to the identified weakness, we have conducted additional analyses to further assess the limitations of the computational methodologies used. Specifically, we predicted the MPXV H3 structure using two other AI-based protein structure prediction models, ESMFold and RoseTTAFold2. Both models also predicted an α-helical structure, which supports our conclusion. However, they yielded lower pLDDT scores (Figure S1A-C in the revised SI), indicating that some error may be present.

      We agree with this reviewer, as well as the other reviewers, that X-ray crystallography data for the H3 structure would be highly valuable. Unfortunately, we lack the expertise in structural biology to obtain these results at this stage. To complement this, we performed molecular dynamics (MD) simulations, which suggest that the helical domain is connected to the main domain via a flexible linker. This flexibility may help explain the challenges in obtaining a high-resolution X-ray structure. In fact, to date, the only structural data available for H3 are from VACV, and they exclude the helical domain (the helical domain was cleaved for the X-ray studies). We have added this point to the discussion and hope that experts in structural biology will be able to resolve the structure of this domain in the future.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript presenting the discovery of a heparan-sulfate (HS) binding domain in monkeypox virus (MPXV) H3 protein as a new anti-poxviral drug target, presented by Bin Zhen and co-workers, is of interest, given that it offers a potentially broad antiviral substance to be used against poxviruses. Using new computational biology techniques, the authors identified a new alpha-helical domain in the H3 protein, which interacts with cell surface HS, and this domain seems to be crucial for H3-HS interaction. Given that this domain is conserved across orthopoxviruses, authors designed protein inhibitors. One of these inhibitors, AI-PoxBlock723, effectively disrupted the H3-HS interaction and inhibited infection with Monkeypox virus and Vaccinia virus. The presented data should be of interest to a diverse audience, given the possibility of an effective anti-poxviral drug.

      Strengths:

      In my opinion, the experiments done in this work were well-planned and executed. The authors put together several computational methods, to design poxvirus inhibitor molecules, and then they test these molecules for infection inhibition.

      Weaknesses:

      One thing that could be improved, is the presentation of results, to make them more easily understandable to readers, who may not be experts in protein modeling programs. For example, figures should be self-explanatory and understood on their own, without the need to revise text. Therefore, the figure legend should be more informative as to how the experiments were done.

      Thank you very much for your appreciation of our work and your support. In response to the identified weakness, we have carefully reviewed all the figure legends to ensure they are more informative.

      Reviewer #3 (Public Review):

      Summary:

      The article is an interesting approach to determining the MPOX receptor using "in silico" tools. The results show the presence of two regions of the H3 protein with a high probability of being involved in the interaction with the HS cell receptor. However, the α-helical region seems to be the most probable, since modifications in this region affect the virus binding to the HS receptor.

      Strengths:

      In my opinion, it is an informative article with interesting results, generated by a combination of "in silico" and wet science to test the theoretical results. This is a strong point of the article.

      Weaknesses:

      Has a crystal structure of the H3 protein been reported?

      The following text is in line 104: "which may represent a novel binding site for HS". It is unclear whether this means this "new binding site" is an alternative site to an old one or whether it is the true binding site that had not been previously elucidated.

      Thank you very much for your thoughtful evaluation and appreciation of our work.

      We agree with this reviewer, as well as the other reviewers, that X-ray crystallography data for the H3 structure would be highly valuable. Unfortunately, we are not experts in structural biology, and we have not yet been able to obtain these structural results. To date, the only structure available for H3 is the one from VACV, which does not include the helical domain. We have included this point in the discussion and hope that experts in structural biology will be able to resolve the structure of this domain in the future.

      Regarding the "novel binding site," this term refers to "the true binding site that had not been previously elucidated." Previous research identified that H3 binds to heparan sulfate (HS), but the exact binding site had not been determined.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Validation of Results with Other Experimental Methods: While single-molecule force spectroscopy and flow cytometry provide valuable data, including complementary methods such as X-ray crystallography could offer additional insights into the H3-HS interaction and the effectiveness of the inhibitors.

      Discussion of Computational Model Limitations: Although the use of AlphaFold2 and other advanced tools is a strength, it is important to discuss the limitations of these models in more detail, including potential sources of error and how they may impact the interpretation of the results.

      During the manuscript evaluation, the protein localization (transmembrane?) was not clear, since the protein's end is very close to the virus membrane surface. All experiments used the protein without membrane anchoring, leaving the interaction site always exposed. If the protein is linked to the membrane, how would the site be exposed given the limited space between it and the virus structure?

      Thank you for these insightful comments. As you pointed out, the H3 protein, particularly the helical domain at the C-terminal, is indeed located close to the membrane, which could limit the available space for H3 binding. To investigate this further, we modeled the full-length H3 protein in the context of the membrane and performed molecular dynamics (MD) simulations to assess the available space. Our results show that there is more than 1 nm of space between the helical domain and the membrane, which should be sufficient for potential heparan sulfate (HS) binding (see Figure 1E, and Figure S1D&E in the revised manuscript).

      Minor corrections:

      Line 31: "is an emerging zoonotic pathogen" should be revised to reflect that Mpox is a re-emerging virus, given its history of causing outbreaks, such as in 2003.

      Line 71 and Line 75: Adding an explanation of "Mg binding sites" and "GAG motifs" would enhance reader understanding, as these represent important points in the study. The current positioning of Figure 1 causes some confusion for the reader.

      Line 111: High score? What controls were used for the protein? Are there known inhibitors of H3? If so, why weren't they tested for structure comparison? Additionally, what about other molecules that H3 binds to, such as UDP-Glucose, as demonstrated in the base article for the Vaccinia virus H3 protein available in the PDB?

      Figure 2B: Improve the legend, as the colors of the lines are not clear.

      Thank you for your instructive comments. We have addressed most of them in the revised manuscript.

      Regarding the "high score," AlphaFold2 provides a confidence score for its protein structure predictions, with a maximum score of 100. A score above 80 indicates a high level of confidence in the prediction.

      There are known inhibitors (such as antibodies) of H3, and while the sequence is available, no structure has been reported so far. Previous NMR titration measurements have shown that UDP-glucose binds to H3, but no structural data for the complex exist. To date, the only available crystal structure is of a truncated H3 from VACV, which does not include the helical domain we identified.

      Reviewer #2 (Recommendations For The Authors):

      The text described in the result section does not match the text presented in Figures. So, it is not easy to see what are the authors referring to when they mention the Figure. For example, the text referring to Figure S8 mentions the GB1 domain and the Cohesin module, but these are not mentioned in Figure S8.

      I do not understand the results presented in Figure 5B. It is not clear to me, from the Figure legend nor after reading the Material and Methods, how this experiment was done. Specifically, what is plotted on X, is it the amount of inhibitor or the amount of protein? These things have to be checked through the manuscript.

      It would be interesting to confirm if the inhibition of infection is based on the inhibition of viral binding to the cells. This should not be complicated to realize, and it could provide evidence for the mechanism of action.

      Extensive use of terms like "this domain", as in lines 207 and 211, is not good in this type of article. It is not always clear which domain the authors are referring to, so it may be much better to mention the domain in question by its exact name.

      Line 337, If I am not mistaken dilutions are serial not series.

      Line 613, in methods. Please use g force instead of rpm, it is more informative. Even if it is just to pellet cells.

      Thank you very much for your instructive comments. We have addressed most of them in the revised manuscript. For instance, the immobilization of the GB1 domain and the cohesin module is now mentioned in Figure S9. Additionally, in the previous Figure 5B, the "x" represents the concentration of the inhibitor. Both "serial" and the g force have been updated.

      Reviewer #3 (Recommendations For The Authors):

      Line 190

      Did you mutate all the amino acids at the same time? What was the impact of all these mutations on the structure of the helical region? Or if you modeled the protein again after replacing these 7 amino acids, did you find that there was no difference? Regardless of your answer, you must include a superposition of the mutated structure and the wt.

      Thank you for the insightful comment. We have now also predicted the structure of the serine mutant using AlphaFold2 (AF2). As expected, the helical domain structure remains largely preserved with only minor differences. We have included these results in Figure S6, as suggested.

      Figure 2D

      In this graph, the authors should indicate the ΔG as a negative value. In fact, the graph does not match the text.

      Thank you for the reminder. This has been corrected in the graph.

      Figure 4B

      Is the difference in binding force significantly different? 28.8 vs 33.7 pN

      The absolute difference in binding force is not large (~5 pN). However, for a system with a relatively low binding force, this difference is significant. Specifically, the 5 pN difference accounts for approximately a 14% reduction in binding force. We have included this percentage in the revised manuscript.
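
      The percentage can be reproduced from the two reported forces. This is a minimal sketch, assuming the reduction is expressed relative to the larger of the two values (33.7 pN); the variable names are illustrative.

```python
higher_pN = 33.7  # larger of the two reported binding forces
lower_pN = 28.8   # smaller of the two reported binding forces

# Absolute difference and reduction relative to the larger force.
difference_pN = higher_pN - lower_pN
relative_reduction = difference_pN / higher_pN * 100

print(f"{difference_pN:.1f} pN difference, {relative_reduction:.1f}% reduction")
```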

      Figure 5

      If AI-PoxBlocks723 was the only peptide effective in inhibiting viral infection of MPOX and other related viruses but not with 100% effectiveness, do you think this could be a consequence of a low interaction efficiency or the existence of a different receptor? Or a secondary region of binding in the H3? Can you argue about this?

      It has been proposed that there are other adhesion proteins for MPXV, such as D8, in addition to H3. We believe this accounts for the observed less-than-100% effectiveness.

      The use of peptides as "inhibitory tools" could have an interesting effect in vitro, however, in vivo the immunological response against the peptide will reduce/eliminate it, how you may optimize the "drug" development with this system, as you state in line 387.

      Thank you for your thoughtful comment. You are correct that the use of peptides as inhibitory tools could induce an immune response in vivo, which might limit their effectiveness over time. To optimize this approach for drug development, the peptides could be conjugated with carrier molecules, such as liposomes, nanoparticles, or dendrimers, which can protect the peptides from immune detection and improve their delivery to target cells. This could allow for more controlled and sustained release of the peptide in vivo, reducing the chances of immune clearance. We have added this discussion to the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Author Response

      Reviewer #1 (Public Review): 

      Weaknesses: 

      - Having demonstrated that NK cell IFNgamma is important for recruiting and activating DCs and T cells in their model, one is left to wonder whether it is important for the therapeutic effect, which was not tested. 

      We conducted a preliminary study to compare the pro-survival effect of WT NK and Ifng-/- NK cell therapies. We found that, in the 95-500 mg day-21 tumor group, the overall survival (OS) of mice receiving Ifng-/- NK cell therapy significantly decreased (p = 0.045) compared to mice receiving WT NK cell therapy up to 60 days after tumor inoculation, but there was no difference in OS beyond 65 days after tumor inoculation. Therefore, we have added the following sentences at the end of the second paragraph in our Discussion (Page 32):

      “However, although Ifng-/- NK cells induced less cDC activation compared to WT NK cells, the levels of CD86 on cDCs of mice that received Ifng-/- NK cells were higher than those of mice not subjected to NK cell transfer (Figure 4B). This outcome indicates the presence of IFN-g-independent or/and compensatory mechanism(s) for cDC activation by the transferred NK cells, which is in line with our preliminary result that Ifng-/- NK cell therapy does not significantly diminish the pro-survival effect in comparison to WT NK cell therapy beyond 60 days after tumor cell inoculation (data not shown).”

      - It was somewhat difficult to gauge the clinical trial results because the trial was early stage and therefore not controlled. Evaluation of the results therefore relies on historical comparisons. To evaluate how encouraging the results are, it would be valuable for the authors to provide some context on the prognoses and likely disease progression of these patients at the time of treatment. 

      We had already indicated in our Results that all six patients had an ECOG performance status of 0 (Page 25 and Table). We have now added in the Results that they had “a predicted survival of >3 months” (Page 25).

      Reviewer #1 (Recommendations For The Authors):

      Minor points: 

      (1) It would be helpful if the authors provided a rationale for why they derived their NK cell product from bone marrow cells instead of the more common source, spleen cells. 

      We have now added the clarification “We used BM cells instead of splenocytes for NK cell culture because removal of T cells from BM cells before culturing is not necessary” (Page 35) to the section Ex vivo expansion of murine and human NK cells in our Materials and Methods.

      (2) It would have been helpful to provide summary results from replicates of the cytokine production data shown in Figure 1F. 

      We have now added a graphical panel on the relative ΔMFI of two independent experiments to Figure 1F and revised the figure legend accordingly (Page 7—8).

      (3) The role of conventional CD4+ T cells is a little unclear. The authors state in the discussion that they contribute to the antitumor response, which is consistent with their finding that depleting both CD4 T cells and CD8 T cells has a greater effect than depleting CD8 T cells. Depleting CD4 T cells alone trended towards improving the response, however. Probably Tregs are the culprit in the latter effect but a sentence or two would be helpful if the claim for a protective role for CD4 T cells is to remain.  

      We have now re-analyzed the data of Figure 3D by separating mice into two groups according to day 21 tumor weight, i.e., 95-600 mg and >600 mg (Page 13—14). We have revised our explanation of the Figure 3D data in the Results (Page 11—12) as follows:

      “Accordingly, we examined the role of T cells in NK cell therapy by depleting T cell subsets with anti-CD4 and/or anti-CD8 antibodies two days before primary tumor resection (Figure 3D Schema and Figure 3-figure supplement 1). In the 95-600 mg tumor group, depletion of CD8+ cells alone or both CD4+ and CD8+ cells diminished the effect of NK cell therapy, whereas depletion of CD4+ cells alone did not affect OS (Figure 3D). This result indicates that CD8+ T cells are essential for the effect of NK cell therapy. In contrast, the >600 mg tumor group displayed a limited NK-cell treatment effect as expected, but did exhibit improved OS upon depleting CD4+ cells alone (Figure 3D). As the proportion of lung Foxp3+CD4+ T cells in CD45+ cells positively correlated with day 21 tumor weight (data not shown), depletion of Foxp3+CD4+ T cells by anti-CD4 antibody likely has a stronger effect in augmenting the immune response for the >600 mg tumor group than the 95-600 mg tumor group. Moreover, both tumor groups showed lower OS upon depletion of both CD4+ and CD8+ cells than upon depletion of CD8+ cells alone, indicating a CD8+ T cell-independent anti-tumor effect of CD4+ T cells (Figure 3D).”

      (4) The schema in Figure 3E states that mice were inoculated with either EO771 tumor cells or B16F10 tumor cells, but it appears that the data only show EO771 tumor challenges. This should be corrected. 

      Corrected according to the reviewer’s comment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper reports fossil soft-tissue structures (tail vanes) of pterosaurs, and attempts to relate this to flight performance and other proposed functions for the tail.

      Strengths:

      The paper presents new evidence for soft-tissue strengthening of vanes using exciting new methods.

      We thank Reviewer #1 for the positive assessment of our work.

      Weaknesses:

      There seems to be no discussion of bias in the sample selection method - even a simple consideration of whether discarded specimens were likely not to have had the cross-linking lattice, or if it was not visible.

      There seems to be no supporting evidence or theory to show how the lattice could have functioned, other than a narrative description. Moreover, there is no comparison to extant organisms where a comparison of function might be drawn.

      We note these weaknesses and have addressed them as part of the consensus of suggested edits given below (‘first option’). We thank the reviewer for this feedback.

      Reviewer #2 (Public review):

      Summary:

      The authors have set out to investigate and explain how early members of the Pterosauria were able to maintain stiffness in the vane of their tails. This stiffness, it is said, was crucial for flight in early members of this clade. Through the use of Laser-Stimulated Fluorescence imaging, the authors have revealed that certain pterosaurs had a sophisticated dynamic tensioning system that has previously been unappreciated.

      Strengths:

      The choice of method of investigation for the key question is sound enough, and the execution of the same is excellent. Overall the paper is well written and well presented, and provides a very succinct, accessible and clear conclusion.

      We thank Reviewer #2 for their positive assessment of our work.

      Weaknesses:

      None

      We thank Reviewer #2 for their positive assessment of our work.

      Recommendations for the authors:

      The consensus between the reviewers and reviewing board is that this manuscript can be substantially strengthened and this can be achieved in two ways that are presented in order of preference.

      First option; resolve the following weaknesses:

      - Include a rigorous discussion of possible bias in the sample selection method with consideration of discarded specimens in relation to cross-linking lattice observation.

      - Include published biomechanics theory, supported by citations or a self-derived biomechanical model, to show how the lattice could have functioned biomechanically.

      - Discuss whether you found similar mechanisms in extant organisms for comparative functional interpretation.

      We thank the reviewers and reviewing board for taking the time to discuss the review and propose two consensus options for how to substantially strengthen the manuscript. We carefully considered both proposed options and decided to implement the first option in full. We have therefore made main text edits relating to all three points of the first option. The marked up article file shows exactly which parts of the text were edited in relation to the points.

      Second option; rewrite the manuscript so no mechanistic claims are made that are not supported by the information presented:

      - Accept the possibility of sampling bias and its limitation in the presentation of cross-linking lattice observation, outlining future work needed to address this.

      - Discuss biomechanics theory needs to be developed to show how the lattice could have functioned biomechanically and remove unsupported speculation about this. It is acceptable to present a new hypothesis, clearly outline the motivation for the hypothesis and how it can be tested with future biomechanical and comparative studies. Remove and replace all current speculative sections and phrasing accordingly and replace this with the framework supporting the idea of a new hypothesis.

      The first option was implemented instead of the second option.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Previous work has shown that the evolutionarily-conserved division-orienting protein LGN/Pins (vertebrates/flies) participates in division orientation across a variety of cell types, perhaps most importantly those that undergo asymmetric divisions. Micromere formation in echinoids relies on asymmetric cell division at the 16-cell stage, and these authors previously demonstrated a role for the LGN/Pins homolog AGS in that ACD process. Here they extend that work by investigating and exploiting the question of why echinoids but not other echinoderms form micromeres. Starting with a phylogenetics approach, they determine that much of the difference in ACD and micromere formation in echinoids can be attributed to differences in the AGS C-terminus, in particular a GoLoco domain (GL1) that is missing in most other echinoderms.

      Thank you for the summary.

      Strengths: 

      There is a lot to like about this paper. It represents a superlative match of the problem with the model system and the findings it reports are a valuable addition to the literature. It is also an impressively thorough study; the authors should be commended for using a combination of experimental approaches (and consequently generating a mountain of data). 

      Thank you.

      Weaknesses: 

      There is an intriguing finding described in Figure 1. AGS in sea cucumbers looks identical to AGS in the pencil urchin, at least at the C terminus (including the GL1 domain). Nevertheless, there are no micromeres in sea cucumbers. Therefore another mechanism besides GL motif organization has arisen to support micromere formation. It is a consequential finding and an important consideration in interpreting the data, but I could not find any mention of it in the text. That is a missed opportunity and should be remedied, ideally not only through discussion but also experimentation. Specifically: does sea cucumber AGS (SbAGS) ever localize to the vegetal cortex in sea cucumbers? Can it do so in echinoids? Will that support micromere formation? 

      Thank you for pointing this out. 

      To respond to the Reviewer’s request, we synthesized sea cucumber (Sb) AGS based on the sequence available in the database and tested it in sea urchin (Sp) embryos; the results are included in Fig. S3. We performed this experiment as a proof of principle to confirm that SbAGS localizes less at the vegetal cortex than SpAGS. However, we hesitate to conduct further studies using the synthetic sequence in this study. Sea cucumbers are an emerging yet understudied model. This species is not readily available or established as a model system for embryology. Even for the two species (A. japonicus in Japan and P. parvimensis in the USA) that were previously used for embryonic studies, their gametes are typically available for only 1-2 months a year. Since some echinoderm researchers are aiming to establish sea cucumbers as a model system in the near future (see 2024 review: PMID: 38368336), we hope to have better access to their embryos in the future. Yet, it may require a few more years to reach that condition.

      In this revised manuscript, we explained the above details and further added the discussion described below. All of the experimental models used in this study are wild animals obtained from the ocean, raising the standard for reproducibility. However, handling wild animals could come with challenges. We hope that the reviewer understands the unique benefits and challenges of this study.

      Discussion:

      Previous studies (PMIDs: 17726110; 21855794) suggest that GL1 is not involved in intramolecular interaction with TPR domains. This allows GL1 to interact independently with Gαi for cortical recruitment yet without influencing other GLs for AGS activation. To ensure GL1's independence, GL1 is typically located distantly from other GLs in Pins (flies), LGN (humans), and AGS (sea urchins). Based on this prior knowledge, we propose three scenarios that could explain why sea cucumber (Sb) AGS is unable to localize or function during asymmetric cell division (ACD): 1) GL1 and GL2 are located too close to each other, compromising GL1's independence for recruitment. 2) A lack of GL4 loosens the autoinhibition state. 3) The GL1 sequence of SbAGS is quite different from that of echinoids’ AGS (Figure S2), compromising its recruiting efficacy. 

      For 1), we tested this possibility by making the SpAGS-GL1GL2 mutant that has GL1 and GL2 next to each other (Fig. 4G). This mutant indeed showed compromised cortical localization and function in ACD. For 2), we showed that the lack of GL4 partially compromised ACD in SpAGS (Fig. 3F), suggesting that GL4 supports ACD. For 3), the results in Figure 4 indicate that the position but not the sequence of GL1 is critical for ACD. Based on these observations, we speculate that a combination of 1) and 2) compromised SbAGS's ACD function. However, it is still possible that a significant difference in the GL1 sequence abolished its function as a GL motif entirely. Future studies should address these remaining questions directly in sea cucumber embryos once they are established as a model system (PMID: 38368336).

      The authors point out that AGS-PmGL demonstrates enrichment at the vegetal cortex (arrow in 5G, quantifications in 5H), unlike PmAGS. AGS-PmGL does not however support ACD. They interpret this result to indicate "that other elements of SpAGS outside of its C-terminus can drive its vegetal cortical localization but not function." This is a critical finding and deserves more attention. Put succinctly: Vegetal cortical localization of AGS is insufficient to promote ACD, even in echinoids. Why should this be?  

      Thank you for the suggestion. We revised our wording to be more succinct. As noted in the text, AGS-PmGL has only two GL domains, which are unlikely to provide the full force needed to control ACD and therefore result in insufficient ACD function.

      The authors did perform experiments to address this problem, hypothesizing that the difference might be explained by the linker region, which includes a conserved phosphorylation site that mediates binding to Dlg. They write "To test if this serine is essential for SpAGS localization, we mutated it to alanine (AGS-S389A in Fig. S3A). Compared to the Full AGS control, the mutant AGS-S389A showed reduced vegetal cortical localization (Fig. S3B-C) and function (Fig. S3D-E). Furthermore, we replaced the linker region of PmAGS with that of SpAGS (PmAGSSpLinker in Fig. S4A-B). However, this mutant did not show any cortical localization nor proper function in ACD (Fig. S4C-F). Therefore, the SpAGS C-terminus is the primary element that drives ACD, while the linker region serves as the secondary element to help cortical localization of AGS." 

      The experiments performed only make sense if the AGS-PmGL chimeric protein used in Figure 5 starts the PmGL sequence only after the Sp linker, or at least after the Sp phosphorylation site. I can't tell from the paper (Figure S3 indicates that it does, whereas S5 suggests otherwise), but it's a critical piece of information for the argument. 

      Thank you for the pointer, and we apologize for the confusion. AGS-PmGL contains the SpAGS linker domain. To clarify this point, we added the amino acid position at the junction of each chimeric construct diagram in Figs. 5 and S4. To clarify, Figure S5 is about the GL domain mutations (not about the Linker).  

      Another piece of missing information is whether the PmAGS can be phosphorylated at its own conserved phosphorylation site. The authors don't test this, which they could at least try using a phosphosite prediction algorithm, but they do show that the candidate phosphorylation site has a slightly different sequence in Pm than in Et and Sp (Fig. S4A). With impressive rigor, the authors go on to mutate the PmAGS phosphorylation site to make it identical to Sp. Nothing happens. Vegetal cortical localization does not increase over AGS-PmGL alone. Micromere formation is unrescued. 

      There is therefore a logic problem in the text, or at least in the way the text is written. The paragraph begins "Additionally, AGS-PmGL unexpectedly showed cortical localization (Figure 5G), while PmAGS showed no cortical localization (Figure 5B)." We want to understand why this is true, but the explanation provided in the remainder of the paragraph doesn't match the question: according to quite a bit of their own data, the phosphorylation site in the linker does not explain the difference. It might explain why AGS-PmGL fails to promote micromere formation, but only if the AGS-PmGL chimeric protein uses the Pm linker domain (see above).

      Thank you for the insightful suggestion. As suggested, we performed the phosphosite predictions using GPS 6.0 (PMID: 37158278) and included the results in Fig. S4A (replacing the old Fig. S3A). The software predicts that SpAGS and EtAGS have an AuroraA phosphorylation site (RRRSMEN in Supplemental Figure S4A) in their linker domain, while PmAGS does not. Sp and Et AGS also have an additional 5-7 predicted phosphorylation sites, while PmAGS has only three sites with low scores. Therefore, the linker domain is not conserved in PmAGS. 

      The PmAGS+SpLinker mutant does restore the predicted AuroraA phosphorylation site on the software, yet it does not restore the cortical localization or ACD function in the embryo. Therefore, other sites in the Linker region might also be necessary for cortical localization and ACD function of AGS. In this study, we did not perform further manipulations in the Linker domain. As the reviewer rightfully pointed out, even if we identify the Linker regions essential for AGS localization and function, it will be difficult to interpret the result unless we know what proteins interact with the Linker domain of AGS. Therefore, this is beyond the scope of the current manuscript. We discussed these remaining matters in the discussion section. 

      Another concern that is potentially related is the measurement of cortical signal. For example, in the control panel of Figure 5C, there is certainly a substantial amount of "non-cortical" signal that I believe is nuclear. I did not see a discussion of this signal or its implications. My impression of the pictures generally is that the nuclear signal and cortical signal are inversely correlated, which makes sense if they are derived from the same pool of total protein at different points of the cell cycle. If that's the case (and it might not be) I would expect some quantifications to be impacted. For example, the authors show in Figure S3B that AGS-S389A mutant does not localize to the cortex. However, this mutant shows a radically different localization pattern to the accompanying control picture (AGS), namely strong enrichment in what I assume to be the nucleus. Is the S389 mutant preventing AGS from making it to the cortex? Or are these pictures instead temporally distinct, meaning that AGS hasn't yet made it out of the nucleus? Notably, the work of Johnston et al. (Cell 2009), cited in the text, does not show or claim that the linker domain impacts Pins localization. Their model is rather that Pins is anchored at the cortex by Gαi, not Dlg, and that is the same model described in this manuscript.

      In agreement with that model and the results of Johnston et al., a later study (Neville et al. EMBO Reports 2023) failed to find a role for Dlg or the conserved phosphorylation site in Pins localization. 

      In the sea urchin embryo, the dye or GFP often appears in the nucleus randomly on top of the cytoplasm (for example, see Fig. S2b of PMID: 35444184). Further, embryos tend to incorporate exogenous genomic fragments more efficiently during early embryogenesis (PMID: 3165895). It is proposed that early embryos may have a loosened or incomplete nuclear envelope compared to adult cells as they divide rapidly (every 40 minutes). Therefore, any excess protein with no specific localization signal may randomly appear in the nucleus as it serves as an available space in the cell. As the Reviewer rightfully pointed out, we consider that the nuclear AGS signal is due to the lack of a specific destination since this signal pattern is not consistent across embryos. In contrast, the proteins that have nuclear localization (e.g., transcription factors) usually show a consistent nuclear signal across cells and embryos with less cytoplasmic signal. To avoid confusion, we replaced the S389A image in Fig. S3B (which is now Fig. S4C) as well as any other images that may create similar confusion.

      Reviewer #2 (Public Review): 

      This study from Dr. Emura and colleagues addresses the relevance of AGS3 mutations in the execution of asymmetric cell divisions promoting the formation of the micromere during sea urchin development. To this aim, the authors use quantitative imaging approaches to evaluate the localisation of AGS3 mutants truncated at the N-terminal region or at the C-terminal region, and correlate these distributions with the formation of micromere and correct development of embryos to the pluteus stage. The authors also analyse the capacity of these mutated proteins to rescue developmental defects observed upon AGS3 depletion by morpholino antisense nucleotides (MO). Collectively these experiments revealed that the C-terminus of AGS3, coding for four GoLoco motifs binding to cortical Galphai proteins, is the molecular determinant for cortical localisation of AGS3 at the micromeres and correct pluteus development. Further genetic dissections and expression of chimeric AGS3 mutants carrying shuffled copies of the GoLoco motifs or four copies of the same motifs revealed that the position of GoLoco1 is essential for AGS3 functioning. To understand whether the AGS3-GoLoco1 evolved specifically to promote asymmetric cell divisions, the authors analyse chimeric AGS3 variants in which they replaced the sea urchin GoLoco region with orthologs from other echinoids that do not form micromeres, or from Drosophila Pins or human LGN. These analyses corroborate the notion that the GoLoco1 position is crucial for asymmetric AGS3 functions. In the last part of the manuscript, the authors explore whether SpAGS3 interacts with the molecular machinery described to promote asymmetric cell division in eukaryotes, including Insc, NuMA, Par3, and Galphai, and show that all these proteins colocalize at the nascent micromere, together with the fate determinant Vasa. 
      Collectively this evidence highlighted how evolutionarily selected AGS3 modifications are essential to sustain asymmetric divisions and specific developmental programs associated with them. 

      Thank you for the useful summary.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      The quantifications of "vegetal cortical localization" are somewhat incomplete. As measured, "vegetal cortical localization" does not demonstrate particular enrichment at the vegetal cortex, only that some signal appears there. In other words, we can't tell for sure that there is any more signal at the vegetal cortex than anywhere else along the cortex, and in fact that's plainly true and even described for the AGS1111 and AGS2222 constructs. One solution would be to measure signal strength around the cell perimeter and see where it is strongest. 

      As suggested by the Reviewer, we added new measurements that compare the signals at the animal versus vegetal cortices (Figs. 2C, 3D, 4C, 5C & H, 9D & F, S3D, S4D & I). 

      A related issue is that the strength of cortical enrichment is indicated in this paper by the ratio of cortical to "non-cortical" signal, but "non-cortical" is not defined. Does it include the nuclear signal? 

      As described above, we replaced all measurements using the above animal vs. vegetal cortices to avoid confusion. The nuclear signal is thus not measured in these analyses.

      I'm enthusiastic about the results in Figure 7, but I can't really see them very well. Could you please consider changing the color scheme? For single-color figures, it would be helpful to view them as black on white rather than (for example) blue on black. That change is easily achieved with Fiji. 

      We revised the Figure as suggested.

      Page 3 Results section: "At the time of ACD, Insc recruits Pins/LGN to the cortex through Gαi": I understand this sentence to mean that Gαi is an intermediary protein that Insc uses to recruit Pins/LGN. I think the point should be made more clear. As shown in Figure 1, Insc binds to Pins/LGN directly and interacts with cortical polarity proteins directly. Recruitment therefore doesn't appear to require Gαi, but stable association with the membrane (a subsequent step) probably does. That model is shown and described in Figure 6A.

      Thank you for the pointer. We clarified our explanations as suggested.

      Reviewer #2 (Recommendations For The Authors): 

      The manuscript addresses an interesting question, and uses elegant genetic approaches associated with imaging analyses to elucidate the molecular mechanisms whereby AGS3 and spindle orientation proteins promote asymmetric divisions and specific developmental programs. This considered, it might be worth clarifying a few aspects of the reported findings. 

      (1) In some experimental settings, the presence of AGS3 mutants exacerbates the AGS3 deletion by MO (Figure 4F). Can the author speculate on what can be the molecular explanation? 

      Thank you for pointing this out. We speculate that AGS1111 and AGS2222 are unable to maintain the auto-inhibited form since they lack GL3 and GL4, as modeled in Figure 6. AGS-MO reduces the endogenous AGS, which compromises the vegetal polarity. In this embryo, constitutively active AGS likely further randomizes the polarity, as evidenced by the AGS-OE results in Fig. S7, resulting in an even worse outcome. We elaborated on this part in the text.

      (2) Imaging analyses of Figure 4B-C suggest that the mutant AGS1111 does not localise at the vegetal cortex while AGS2222 does (Fig. 4C). However these mutants induce similar developmental defects (Figure 4F). What could be the reason? 

      We apologize for the confusion in Fig. 4C. The majority of embryos from both the AGS1111 and AGS2222 groups failed to form micromeres and showed AGS localization across the cortex. Among the dozens we examined, 0 embryos from the AGS1111 group and 8 embryos from the AGS2222 group developed micromeres. Those 8 embryos still showed vegetal cortical localization, so the proportion appears high in Fig. 4B, yet it reflects the minority in the group. In contrast, development was scored for all embryos (including those that failed to form micromeres), so the graph represents the majority of embryos. To avoid this confusion, we replaced the old Fig. 4C with a new graph that analyzes the cortical signal levels at the vegetal versus animal cortices.

      (3) Figure 7 shows the crosstalk between AGS3 and other asymmetry players including NuMA. Vertebrate and Drosophila NuMA are ubiquitously present in tissues and localise to the spindle poles in mitosis. However, in Figures 7A and 7E NuMA seems expressed only in a subset of sea urchin embryonic cells. Is this the case? 

      As the Reviewer rightfully pointed out, sea urchin NuMA is also present in all cells and localizes to the spindle (please see Fig. 2 of our previous paper PMID: 31439829). AGS is also slightly localized on the spindles of all cells. However, the PLA signal of AGS and NuMA mostly appeared at the vegetal cortex in this study, suggesting that major crosstalk may occur at the vegetal cortex. This does not rule out the possibility that minor interactions may also occur on the spindle or elsewhere in the cell, which was not quantifiable in this study. We clarified this point in the text.

    1. Author response:

      Reviewer #1 (Recommendations for the authors):

      (1) Storyline and Narrative Flow:

      Consider revising the manuscript to create a more coherent and consistent narrative. Clarify how each section of the study, particularly the transition from multi-omics data integration to single-cell RNA-seq validation, contributes to the overall research question. This will help readers better understand the logical flow of the study.

      In the upcoming revisions, we will optimize the logical connections between sections of the manuscript to clarify the role each part plays in the overall research question, making it easier for readers to follow.

      (2) Immune Cell Activity Analysis:

      Reevaluate the methods used to assess immune cell activities within the context of the tumor microenvironment. Consider providing additional justification for the relevance of using the cancer cell model for this analysis. If necessary, explore alternative methods or models that might offer more meaningful insights into immune-tumor interactions.

      We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

      (3) Single-Cell RNA-Seq Validation:

      Expand the validation of your findings using single-cell RNA-seq data. This could include more in-depth analyses that explore the heterogeneity within the subtypes and confirm the robustness of your classification method at the single-cell level. This would strengthen the support for your claims about the relevance of the identified subtypes.

      In the current study, we have applied the obtained multi-omics profiling features to single-cell sequencing data to classify malignant cells. We analyzed the metabolic and cell communication differences between different subtypes of malignant cells and explored potential reasons for these differences. Next, we plan to conduct further analysis of the differences between malignant cell subtypes to identify additional clues and mechanisms underlying these variations.

      (4) Methodological Justification:

      Provide a more detailed rationale for the selection of machine learning algorithms and integration strategies used in the study. Explain why the chosen methods are particularly well-suited for this research, and discuss any potential limitations they might have.

      In the revised manuscript, we will include descriptions of the principles of these analytical methods, as well as examples of their application in other studies, to discuss the rationale and limitations of applying these methods in this research.

      (5) Figures and Visualizations:

      Improve the clarity of your figures by addressing the following:

      a) Figure 3A: Cluster the pathways to make the comparisons clearer and more meaningful.

      b) Figure 4A: Clearly explain the significance of the blue bar.

      c) Figure 4B: Ensure this figure is discussed in the main text to justify its inclusion.

      d) Figure 7C: Enhance the figure legend to provide more informative details.

      Additionally, ensure that figure descriptions go beyond the captions and provide detailed explanations that help the reader understand the significance of each figure.

      We fully agree with the reviewer’s suggestions regarding these figures, and we will make the necessary revisions in the revised manuscript.

      (6) Supplementary Materials:

      Consider including more detailed supplementary materials that provide additional validation data, extended methodological descriptions, and any other information that would support the robustness of your findings.

      When we submit the revised manuscript, we will include supplementary materials, such as figures or tables, that make the manuscript more complete.

      (7) Recent Literature:

      a) Incorporate more recent studies in your discussion, especially those related to HCC subtypes and the application of machine learning in oncology. This will provide a more current context for your work and help position your findings within the broader field.

      We appreciate the reviewer's suggestion. We will incorporate more recent studies into the discussion section and optimize its content.

      (8) Data and Code Availability:

      Ensure that all data, code, and materials used in your study are made available in line with eLife's policies. Provide clear links to repositories where readers can access the data and code used in your analyses.

      We have indicated the sources of the data and tools used in the analysis process within the text, and these data and tools can be accessed through the websites or literature we have cited.

      Reviewer #2 (Recommendations for the authors):

      (1) While the computational findings are robust, further experimental validation of the two subtypes, particularly the role of the MIF signaling pathway, would strengthen the biological relevance of the findings. In vitro or in vivo validation could confirm the proposed mechanisms and their influence on patient prognosis.

      We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

      (2) Consider testing the model on additional independent cohorts beyond the TCGA and ICGC datasets to further demonstrate its generalizability and applicability across different patient populations.

      We are considering looking for independent external datasets in the GEO database or other databases to validate our model.

      (3) Review the manuscript for long or complex sentences, which can be broken down into shorter, more readable parts.

      In the revised manuscript, we will address any grammatical issues present in the manuscript and modify long and complex sentences that may hinder reader comprehension.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Thank you for your assessment and constructive critique, which helped us to improve the manuscript and its clarity. Upon carefully reading through the comments, we noticed that, based on the Reviewer's questions, some of our answers were already available but “hidden” as supplementary data. Thus, we changed the following two figures and text accordingly to showcase our results to the reader better:

      A) To highlight how mobile service data can indicate the spread of highly prevalent variants, we added a high-prevalence subcluster to Figure 2 (previously shown in Supplementary Figures S4 and S5) and, in exchange, moved one low-prevalence subcluster from Figure 2 back into the supplement. The figure now shows one low-prevalence and one high-prevalence subcluster instead of two low-prevalence subclusters.

      B) Based on Reviewer 1’s question about where samples were taken in regards to the mobility data from the community of the first identification (negative controls), we now highlight all the mobility data that was available to us in Figure 3 (as triangles) instead of just a few top mobility hits, for both mobility-guided and random surveillance (the latter serving as a negative control for the former). This way, we think, it is clearer how random sampling was also performed in some regions where mobility was coming from the community of origin (as asked by Reviewer 1). The detailed trips and sampling are now part of the supplement for data transparency reasons. We also noticed a typo in the GPS coordinates that misaligned one of the arrows, which is corrected in the improved Figure 3.

      We have also included the R-Scripts used to generate all the figures in the manuscript in an OSF repository (we updated the “Data sharing statement”). We also updated Figure 1 slightly and extended the supplemental material. The remaining comments to reviewers are addressed point-by-point below.

      Reviewer 1 (Public Review):

      In "Exploring the Spatial Distribution of Persistent SARS-CoV-2 Mutations - Leveraging mobility data for targeted sampling" Spott et al. combine SARS-CoV-2 genomic data alongside granular mobility data to retrospectively evaluate the spread of SARS-CoV-2 alpha lineages throughout Germany and specifically Thuringia. They further prospectively identified districts with strong mobility links to the first district in which BQ.1.1 was observed to direct additional surveillance efforts to these districts. The additional surveillance effort resulted in the earlier identification of BQ.1.1 in districts with strong links to the district in which BQ.1.1 was first observed.

      Thank you for taking the time to review our work.

      (1) It seems the mobility-guided increased surveillance included only districts with significant mobility links to the origin district and did not include any "control" districts (those without strong mobility links). As such, you can only conclude that increasing sampling depth increased the rate of detection for BQ.1.1, not necessarily that doing so in a mobility-guided fashion provided an additional benefit. I absolutely understand the challenges of doing this in a real-world setting and think that the work remains valuable even with this limitation, but I would like the lack of control districts to be more explicitly discussed.

      Thank you for the critical assessment of our work. We agree that a control is essential for interpreting the results. In our case, randomized surveillance (“the gold standard”) served as a control with a total sampling depth seven times higher than the mobility-guided sampling. To better reflect the sampling in regards to the available mobility data, we revisited Figure 3 and added all the mobility information from the origin that was available to us. We also added this information to the random surveillance to provide a clearer picture to the reader. This now clearly shows how randomized surveillance covered communities with varying degrees of incoming mobility from the community of first occurrence, thereby underlining its role as a negative control. We updated the manuscript to reflect these changes and included the October 2020 and June 2021 mobility datasets in Supplementary Table S6. We agree that greater sampling depth increases detection; this is precisely the point of guided sampling, which increases sampling specifically in areas where mobility points towards a possible spread. In regards to the negative control: Random surveillance (not mobility-guided) in October covered 40 samples in the northwest region of Thuringia (mobility-guided sampling covered 19 samples). Thus, random surveillance also contained 31 out of 132 samples with a mobility link towards the first occurrence of BQ.1.1 but with varying amounts of mobility (low to high).

      We added this information to the main text:

      Line 270 to 293:

      Following its first Thuringian identification, we utilized the latest available dataset of the past two years of mobile service data (October 2020 and June 2021) to investigate the residential movements for the community of first detection. Considering the highest incoming mobility from both datasets, we identified 18 communities with high (> 10,000), 34 with medium (2,001-10,000), and 82 with low (30-2,000) number of incoming one-way trips from the originating community (purple triangles in Figure 3a). As a result, we specifically requested all the available samples from the eight communities with the highest incoming mobility. Still, we were restricted to the submission of third parties over whom we had no influence. This led to the inclusion of the following eight communities with the most residential movement from the originating community: four in central and three in NW of Thuringia, one in NW-neighboring state Saxony-Anhalt. The samples requested from central Thuringia were also due to their geographic arrangement as a “belt” in central Thuringia, linking three major cities (see Supplementary Figure S1). Subsequently, we collected 19 additional samples (isolated between the 17th and 25th of October 2022; see “Guided Sampling” for October 2022, Figure 3a) besides the randomized sampling strategy. Thus, the sampling depth was increased in communities with high incoming mobility from the first origin.

      As part of the general Thuringian surveillance, we collected 132 samples for October (covering dates between the 5th and 31st) and 69 samples in November (covering dates between the 1st and 25th; see Figure 3b and c). Randomized sampling was not influenced or adjusted based on the mobility-guided sample collection. Thus, it also contains samples from communities with a mobility link towards the first occurrence of BQ.1.1, as they were part of the regular random collection (see gray triangles in Figure 3b). A complete overview of all samples is provided in Supplementary Table S5. The mobility datasets from October 2020 and June 2021 for all sampled communities are provided in Supplementary Table S6.
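
      The mobility tiers described in the passage above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' R code: the function name `mobility_tier` is hypothetical, the thresholds are the ones quoted in the text (high > 10,000; medium 2,001-10,000; low 30-2,000 incoming one-way trips), and the handling of counts below 30 is an assumption, since the text does not say how such communities were treated.

```python
from typing import Optional


def mobility_tier(incoming_trips: int) -> Optional[str]:
    """Bin a community by incoming one-way trips from the community of first detection.

    Thresholds follow the manuscript text quoted above. Returning None for
    counts below 30 is an assumption made for this sketch only.
    """
    if incoming_trips > 10_000:
        return "high"
    if incoming_trips >= 2_001:
        return "medium"
    if incoming_trips >= 30:
        return "low"
    return None  # below the lowest reported bin (assumed: not classified)


if __name__ == "__main__":
    # Hypothetical trip counts, chosen only to exercise each bin.
    for trips in (25, 30, 2_000, 2_001, 10_000, 10_001):
        print(trips, mobility_tier(trips))
```

      Under these thresholds, a community with more than 10,000 incoming trips falls into the "high" tier from which the guided samples were requested.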

      Line 305 to 313:

      Among the 19 samples specifically collected based on mobile service data, we identified one additional sample of the specific Omicron sublineage BQ.1.1 in a community with high incoming mobility (n = 14, number of trips = 37,499) with a distance of approximately 16 km between both towns. Our randomly sampled routine surveillance strategy did not detect another sample during the same period. This was despite a seven times higher overall sample rate, which included 31 samples from communities with an identified incoming mobility from the community of the first occurrence (October 2022, Figure 3b). Only in the one-month follow-up were four other samples identified across Thuringia through routine surveillance (November 2022, Figure 3c).
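
      The sample counts quoted above can be put side by side with simple arithmetic. The sketch below is illustrative only: the counts are taken from the quoted text (19 guided samples with one additional BQ.1.1 detection; 132 random October samples, 31 of them from communities with a mobility link, and no additional detection), the variable names are hypothetical, and no statistical test is implied.

```python
# Counts as stated in the quoted manuscript text (October 2022).
guided_samples = 19          # mobility-guided samples
guided_hits = 1              # additional BQ.1.1 sample found by guided sampling
random_samples = 132         # routine randomized surveillance samples
random_linked_samples = 31   # random samples from communities with a mobility link
random_hits = 0              # additional BQ.1.1 samples found by random surveillance

# How much deeper the randomized sampling was than the guided sampling.
depth_ratio = random_samples / guided_samples

# Fraction of samples in each arm that turned out to be BQ.1.1.
guided_fraction = guided_hits / guided_samples
random_fraction = random_hits / random_samples

# Share of random samples that came from mobility-linked communities.
linked_share = random_linked_samples / random_samples

print(f"random/guided sampling depth: {depth_ratio:.1f}x")
print(f"guided detection fraction:    {guided_fraction:.3f}")
print(f"random detection fraction:    {random_fraction:.3f}")
print(f"mobility-linked random share: {linked_share:.2f}")
```

      The depth ratio of roughly 6.9 corresponds to the "seven times higher" sampling rate mentioned in the text. With a single detection, these fractions describe the data rather than establish a difference between the two strategies.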

      Line 325 to 333:

      In summary, increasing the sampling depth in the suspected regions successfully identified the specified lineage using only a fraction of the samples from the randomized sampling. Conversely, randomized surveillance, the “gold standard” acting as our negative control, did not identify additional samples with similar sampling depths in regions with no or low incoming mobility or even in high mobility regions with less sampling depth. Implementing such an approach effectively under pandemic conditions poses difficult challenges due to the fluctuating sampling sizes. Although the finding of the sample may have been coincidental, our proof of concept demonstrated how we can leverage the potential of mobile service data for targeted surveillance sampling.

      (2) Line 313: While this work has reliably shown that the spread of Alpha was slower in Thuringia, I don't think there have been sufficient analyses to conclude that this is due to the lack of transportation hubs. My understanding is that only mobility within Thuringia has been evaluated here and not between Thuringia and other parts of Germany.

      Thank you for pointing this out. We noticed that the original sentence lacked the necessary clarity. The statement in line 313 was based on the observation that Alpha first occurred in federal states with major transport hubs, such as international airports and ports, which Thuringia lacks, as demonstrated in the Microreact dataset. For clarification, we adjusted the sentence as follows:

      Line 340 and following:

      A plausible explanation for the delayed spread of the Alpha lineage in Thuringia is the lack of major transport hubs, as Alpha first occurred in federal states with such hubs. Previous studies have already highlighted the impact of major transportation hubs in the spread of SARS-CoV-2.

      (3) Line 333 (and elsewhere): I'm not convinced, based on the results presented in Figure 2, that the authors have reliably identified a sampling bias here. This is only true if you assume (as in line 235) that the variant was in these districts, but that hasn't actually been demonstrated here. While I recognize that for high-prevalence variants, there is a strong correlation between inflow and variant prevalence, low-prevalence variants by definition spread less and may genuinely be missing from some districts. To support this conclusion that they identified a bias, I'd like to see some type of statistical model that is based e.g. on the number of sequences, prevalence of a given variant in other districts, etc. Alternatively, the language can be softened ("putative sampling bias").

      Thank you for addressing this legitimate point of criticism in our interpretation. Due to the retrospective nature of the analysis and the fact that we found no additional samples of the clusters after the specified timeframes, we were limited to the samples in our dataset. Therefore, it is impossible to demonstrate whether a variant was present in the relevant districts afterward. We agree that the variants’ low prevalence means they may genuinely not have spread to some districts. For clarification, we added the following statements and changed the wording accordingly:

      Additional statement in line 248:

      However, due to their low prevalence, it is also possible that these subclusters have not spread to the indicated districts.

      Adjusted wording in line 361:

      We exemplified this approach with the Alpha lineage, where mobile service data indicated a putative sampling bias and partially predicted the spread of our Thuringian subclusters.

      Recommendations:

      (1) I applaud the use of the microreact page to make the data public, however, I don't see any reference to a GitHub or Zenodo repository with the analysis code. The NextStrain code is certainly appreciated but there is presumably additional code used to identify the clusters, generate figures, etc. I generally prefer this code be made public and it is recommended by eLife.

      Thank you for your appreciation. We have now included the R-scripts in the manuscript’s OSF repository. These were used to create the figures in the manuscript and supplement utilizing the supplementary tables 1-6, which are also stored in the repository. To clearly communicate which data is provided, we changed lines 513 and 514 of the “Data sharing statement” as follows:

      Line 513 and following:

      Supplementary tables and the R-scripts used to generate all figures are also provided in the repository under https://osf.io/n5qj6/. These include the mobile service data used in this study, which is available in processed and anonymized form.

      The subcluster identification was performed manually. By adding each sample's mutation profile to the Microreact metadata file, we visually screened the phylogenetic time tree for all non-Alpha specific mutations present in at least 20 Thuringian genomes. We then applied the criteria described in the Methods section to identify the nine Alpha subclusters. For clarification, we changed line 436:

      Line 436:

      We then manually screened for mutations present in at least 20 genomes with a small phylogenetic distance and a time occurrence of at least two months.

      Reviewer 2 (Public Review):

      In the manuscript, the authors combine SARS-CoV-2 sequence data from a state in Germany and mobility data to help in understanding the movement of the virus and the potential to help decide where to focus sequencing. The global expansion in sequencing capability is a key outcome of the public health response. However, there remains uncertainty about how to maximise the insights the sequence data can give. Improved ability to predict the movement of emergent variants would be a useful public health outcome. Knowing where to focus sequencing to maximise insights is also key. The presented case study from one State in Germany is therefore a useful addition to the literature. Nevertheless, I have a few comments.

      Thank you for taking the time to review our work.

      (1) One of the key goals of the paper is to explore whether mobile phone data can help predict the spread of lineages. However, it appears unclear whether this was actually addressed in the analyses. To do this, the authors could hold out data from a period of time, and see whether they can predict where the variants end up being found.

      Based on your feedback, we noticed that the results of the other seven clusters presented in the supplement were not appropriately highlighted, causing them to be overlooked. We indeed demonstrated that predicting viral spread based on mobility data is possible, as shown for the high-prevalence subcluster 7 (Cluster “ORF1b:A520V”, 811 samples). This was briefly mentioned in lines 240-242, but the cluster was only shown in Supplementary Figures S4 and S5. Instead, we focused more on the putative sampling bias that the mobility for low-prevalence subclusters could indicate as an interesting use case of mobility data. This addresses a concrete problem of every surveillance: successfully identifying low-prevalence targets. However, based on your feedback, we revisited Figure 2, adding the plots of the high-prevalence subcluster: “ORF1b:A520V” from Supplementary Figures S4 and S5 while moving the low-prevalence subcluster “S:N185D” from Figure 2 into the Supplementary Figures S4 and S5. Additionally, we changed line 229 to highlight this result properly.

      line 229 and following:

      The mobile service data-based prediction of a subcluster’s spread aligned well with the subsequent regional coverage of fast-spreading, highly prevalent subclusters, such as subcluster 7, which covered 811 samples (see Figure 2). In contrast, the predicted spread for the low-prevalence subclusters did not correspond well with the actual occurrence.

      (2) The abstract presents the mobility-guided sampling as a success, however, the results provide a much more mixed result. Ultimately, it's unclear what having this strategy really achieved. In a quickly moving pandemic, it is unclear what hunting for extra sequences of a specific, already identified, variant really does. I'm not sure what public health action would result, especially given the variant has already been identified.

      Thank you for your critical assessment of the presented results and their interpretation.

      Here, we aimed to provide an alternative to the standard randomized surveillance strategy. Through mobility-guided sampling, we sought to increase identification chances while necessitating fewer samples and decreasing costs, ultimately enhancing surveillance efficiency. The Omicron-lineage BQ.1.1 was the perfect example to prove this concept under actual pandemic conditions. Yet, the strategy is not limited to low-prevalence sublineages but can be applied to virtually any surveillance case. However, from your question, we recognize that this conclusion was unclear from the text. Therefore, we adapted the conclusion to better communicate the real implications of our proof of concept. Additionally, we altered line 42 in the abstract for clarification.

      However, we did not assess the benefits of surveillance itself, as the German Robert Koch Institute (RKI) had already outlined its importance for tracking different viral variants. This tracking served several purposes, such as monitoring vaccine escape and mutational progress and assessing available antibodies for treatment.

      Line 42:

      The latter concept was successfully implemented as a proof-of-concept for a mobility-guided sampling strategy in response to the surveillance of Omicron sublineage BQ.1.1.

      Line 364 to 374:

      Another approach is actively guiding the sampling process through mobile service data, which we demonstrated with our proof of principle focusing on the Omicron-lineage BQ.1.1 as a real-life example. This approach could allow for a flexible allocation of surveillance resources, enabling adaptation to specific circumstances and increasing sampling depth in regions where a variant is anticipated. By incorporating guided sampling, much fewer resources may be needed for unguided or random sampling, thereby reducing overall surveillance costs.

      Additionally, while this approach is particularly useful for identifying low-prevalence variants, it is not limited to such variants. Still, it can provide a guided, more cost-efficient, low-sampling alternative to general randomized surveillance that can also be applied to other viruses or lineages.

      (3) Relatedly, it is unclear to me whether simply relying on spatial distance would not be an alternative simpler approach than mobile phone data. From Figure 2, it seems clear that a simple proximity matrix would work well at reconstructing viral flow. The authors could compare the correlation of spatial, spatial proximity, and CDR data.

      Thank you for pointing this out. While proximity data might appear to be an obvious choice, it has significant limitations compared to mobility data, especially in the context of our study. Proximity data assumes that spatial distance alone can accurately represent movement patterns, which would only be true in a normally distributed traffic network. Geographic features such as mountains, cities, and highways affect traffic flows, leading to variability over distance and time, which is beyond the scope of spatial proximity but efficiently captured by mobility data. In Figure 2, we presented a simplified view of the mobility data. Hence, proximity and mobility data appear to provide the same insights. However, as shown in the updated Figure 3, a detailed overview of the available mobility data reveals obvious and non-obvious spatial connections that proximity data cannot capture. Incorporating such a level of detail in Figure 2 would have cluttered the figure and reduced its clarity (e.g., adding triangles for each Thuringian community).

      While a comparison between proximity data and mobility data would indeed be informative, it is beyond the scope of our current study, as our primary focus was to examine the usability of mobility data in explaining our subclusters’ spread in the first place. However, we agree it would be a valuable direction for future research. We summarized our thoughts from above in the following additional sentence:

      Line 374:

      Pre-generated mobility networks automatically tailored to each state's unique infrastructure and population dynamics could provide better-targeted sampling guidance rather than simple geographical proximity.

      Recommendations:

      (1) Line 128: What do these percentages mean - the proportion of States with at least one Alpha variant? Please clarify.

      We clarified the values at their first appearance in the text:

      Line 127:

      By March, Alpha had spread to nearly all states and districts (districts are similar to counties or provinces) in Germany (Median: 76·47 % Alpha samples among a federal state’s total sequenced samples compared to 36·03 % in February, excluding Thuringia) and Thuringia (Median: 85·29 %, up from 50·00 % in February).

      (2) Line 134: It's a little strange to compare the dynamics of a state with that of the whole country. Did it lag as compared to all other States?

      Line 134: “In summary, the spread of the Alpha lineage in Thuringia lagged roughly two weeks behind the general spread in the rest of Germany but showed similar proportions.”

      Thank you for the feedback. The statement refers to the comparison of Alpha-lineage proportions across federal states, excluding Thuringia, in lines 118 to 130. To simplify, we collectively referred to these federal states as “Germany” in the text. However, we recognize that this formulation is misleading, so we adjusted line 135 for clarification:

      Line 135:

      In summary, the spread of the Alpha lineage in Thuringia lagged roughly two weeks behind the general spread of other German federal states but showed similar proportions.

    1. Author response:

      Reviewer #1 (Public review)

      Weaknesses:

      The main weakness of the manuscript is that to a large degree, one of its main conclusions (MAP symmetry underlies differences in regenerative capacity) relies mainly on a correlation, without firmly establishing a causal link. However, this weakness is relatively minor because (1) it is partially addressed with the Spastin KO and (2) there isn't a trivial way to show a causal relationship in this case.

      We thank Reviewer #1 for their positive assessment of our manuscript. To further strengthen the claim that MAP asymmetry underlies differences in regenerative capacity, we could investigate the effect of depleting other MAPs that lose asymmetry after conditioning lesion (CRMP5 and katanin). One expects that similarly to spastin, this would disrupt the physiological asymmetry of DRG axons and impair axon regeneration. We will further discuss this issue in the revised version of the manuscript.

      Reviewer #2 (Public review):

      Weaknesses:

      In order for the method to be used it needs to be better described. For instance what proportion of neurons develop just two axonal branches, one of which is different? How selective are the researchers in finding appropriate neurons?

      We thank Reviewer #2 for their positive assessment of our manuscript. As suggested, we will include further methodological details on the in vitro system in the revised version of the manuscript. We have evaluated the percentage of DRG neurons exhibiting different morphologies in our cultures: multipolar (4%), bipolar (35%), bell-shaped (17%), and pseudo-unipolar neurons (43%). This will be included in the revised manuscript. All the pseudo-unipolar neurons analysed had distinct axonal branches in terms of diameter and microtubule dynamics. For imaging purposes, we selected pseudo-unipolar neurons with axons unobstructed by other cells or neurites within a distance of at least 20–30 μm from the bifurcation point, to ensure optimal imaging. In the case of laser axotomy experiments, this distance was increased to 100–200 μm to ensure clear analysis of regeneration. These selection criteria will be detailed in the Methods of the revised manuscript.

      Reviewer #3 (Public review):

      Weaknesses:

      While some of the data are compelling, experimental evidence only partially supports the main claims. In its current form, the study is primarily descriptive and lacks convincing mechanistic insights. It misses important controls and further validation using 3D in vitro models.

      We recognize the importance of further exploring the contribution of other MAPs to microtubule asymmetry and regenerative capacity of DRG axons. In future work, we plan to investigate this issue by using knockout mice for katanin and CRMP5. To understand the mechanisms underlying the differential localization of MAPs in DRG axons, we performed in-situ hybridization to assess the availability of axonal mRNA but no differences were found between central and peripheral DRG axons (Figure 4 – figure supplement 2). To address whether differences in protein transport exist, we attempted to transduce DRG neurons with GFP-tagged spastin both in vitro and in vivo. However, these experiments were inconclusive as very low levels of spastin-GFP were detected. We are actively optimizing these approaches and will address this challenge in future studies. This will be further discussed in the revised manuscript.

      Given the heterogeneity of dorsal root ganglion (DRG) neurons, it is unclear whether the in vitro model described in this study can be applied to all major classes of DRG neurons.

      We acknowledge the diversity of DRG neurons and agree that assessing the presence of different DRG subtypes in our culture system will enrich its future use. Despite this heterogeneity, we focused on DRG neuron features that are common to all subtypes, i.e., pseudo-unipolarization and higher regenerative capacity of peripheral branches. This will be further discussed in the revised version of the manuscript.

      Also unclear is the inconsistency with embryonic DRG cultures with embryonic (E)16 from rats and E13 from mice (spastin knockout and wild-type controls).

      Given our previous experience in establishing DRG neuron cultures from Wistar rats and C57BL/6 mice, these developmental stages are equivalent, yielding cultures of DRG neurons with similar percentages of different morphologies. Of note, in our colonies, gestation length is ~19 days in C57BL/6 mice (background of the spastin knockout line) and ~22 days in Wistar Han rats. This will be further clarified in the Methods.

      Furthermore, the authors stated (line 393) that only a small subset of cultured DRG neurons exhibited a pseudo-unipolar morphology. The authors should include the percentage of the neurons that exhibit a pseudo-unipolar morphology.

      We have previously evaluated the percentage of DRG neurons exhibiting different morphologies in our cultures: multipolar (4%), bipolar (35%), bell-shaped (17%), and pseudo-unipolar neurons (43%). This will be included in the revised manuscript. In line 393, we referred specifically to an experimental setup in which DRG neurons were transduced and 30 transduced neurons were randomly selected for longitudinal imaging. Of these, the number of viable pseudo-unipolar DRG neurons was limited both by the random nature of viral transduction and by light-induced toxicity, as imaging was performed at hourly intervals over seven consecutive days. This will be clarified in the revised manuscript.

      The significance of studying microtubule polymerization to DRG asymmetry in vitro is questionable, especially considering the model's validity. The authors might consider eliminating the in vitro data and instead focus on characterizing DRG asymmetry in vivo both before and after a conditioning lesion. If the authors choose to retain the in vitro data, classifying the central and peripheral-like branches in cultured DRG neurons will require further in-depth characterization. Additional validation should be performed in adult DRG neuron cultures not aged in vitro.

      The in vitro system presented here reliably reproduces several key features of DRG neurons observed in vivo, including asymmetry in axon diameter, regenerative capacity, axonal transport, and microtubule dynamics. Of note, most studies in the field were developed using multipolar DRG neurons that do not recapitulate in vivo morphology and asymmetries. Thus, the current in vitro system serves as a versatile tool for advancing our understanding of DRG biology and associated diseases. This system is particularly suited to studying axon regeneration, and enables research on mechanisms occurring at the stem axon bifurcation, which are challenging to examine in vivo due to the length of the stem axon and the difficulty of locating the DRG T-junction. Optimizing similar cultures using adult DRG neurons comes with challenges, such as lower cell viability and a decreased percentage of pseudo-unipolarization. This is the case for multiple other neuron types, for which the vast majority of cultures are obtained from embryonic tissue. These embryonic cultures (as is the case with cortical and hippocampal neurons) are widely used to understand neuronal polarization, axon growth and/or regeneration. This will be further addressed in the revised manuscript.

      The comparison of asymmetry associated with a regenerative response between in vitro and in vivo paradigms has significant limitations due to the nature of the in vitro culture system. When cultured in isolation, DRG neurons fail to form functional connections with appropriate postsynaptic target neurons (the central branch) or to differentiate the peripheral domains associated with the innervation of target organs. Rather than growing neurons on a flat, hard surface like glass, more physiologically relevant substrates and/or culturing conditions should be considered. This approach could help eliminate potential artifacts caused by plating adult DRG neurons on a flat surface. Additionally, the authors should consider replicating their findings in a 3D culture model or using dorsal root ganglia explants, where both centrally and peripherally projecting axons are present.

      We agree that a more sophisticated system, such as a compartmentalized culture, holds great potential for future research. In this respect, we are currently engaged in developing such models. A compartmentalized system would enable the separation of three compartments: central nervous system neurons, DRG neurons, and peripheral targets. While previous efforts to create compartmentalized DRG cultures have been reported, these systems have not demonstrated the development of pseudo-unipolar morphology. Incorporating non-neuronal DRG cells into the DRG neuron compartment may successfully support the development of a pseudo-unipolar morphology.

      We also recognize the importance of dimensionality in fostering pseudo-unipolar morphology. Of note, our model provides a 3D-like environment, as DRG glial cells are continuously replicating over the 21 days in culture. In relation to DRG explants, we attempted their use but encountered limitations with confocal microscopy, as the axial resolution was insufficient to adequately resolve processes at the DRG T-junction or within individual branches. While tissue clearing could improve resolution, it would be incompatible with live imaging, which is essential for our experiments.

      The above issues will be further discussed in the revised manuscript.

      Panels 5H-J require additional processing with astrocyte markers to accurately define the lesion borders. Furthermore, including a lower magnification would facilitate a direct comparison of the lesion site.

      In our study, we relied on the alignment of nuclei to delineate the lesion site because, in our accumulated experience, this provides an accurate definition of the lesion border. Outside the lesion, the nuclei are well-aligned, while at the lesion site, they become randomly distributed. Additionally, CTB staining further supports the identification of the rostral border of the lesion, as most injured central DRG axons stop their growth at the injury site. This will be further detailed in the Methods.

      The use of cholera toxin subunit B (CTB) to trace dorsal column sensory axons is prone to misinterpretation, as the tracer accumulates at the axon's tip. This limitation makes it extremely challenging to distinguish between regenerating and degenerating axons.

      While alternative methods to trace or label regenerating axons exist, CTB is a well-established and widely used tracer for central sensory projections, as shown in multiple studies. Regarding the concern of possible CTB labeling in degenerating axons, we believe this is unlikely to be the case in our study, as CTB-positive axons are nearly absent in spinal cord injury controls. Also, as regeneration was investigated six weeks after injury, axon degeneration has most likely already occurred, as shown previously (PMID: 15821747 and PMID: 25937174).

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Rühling et al analyzes the mode of entry of S. aureus into mammalian cells in culture. The authors propose a novel mechanism of rapid entry that involves the release of calcium from lysosomes via NAADP-stimulated activation of TPC1, which in turn causes lysosomal exocytosis; exocytic release of lysosomal acid sphingomyelinase (ASM) is then envisaged to convert exofacial sphingomyelin to ceramide. These events not only induce the rapid entry of the bacteria into the host cells but are also described to alter the fate of the intracellular S. aureus, facilitating escape from the endocytic vacuole to the cytosol.

      Strengths:

      The proposed mechanism is novel and could have important biological consequences.

      Weaknesses:

      Unfortunately, the evidence provided is unconvincing and insufficient to document the multiple, complex steps suggested. In fact, there appear to be numerous internal inconsistencies that detract from the validity of the conclusions, which were reached mostly based on the use of pharmacological agents of imperfect specificity.

      We thank the reviewer for the detailed evaluation of our manuscript. We will address the criticism below.

      We agree with the reviewer that many of the experiments presented in our study rely on the use of inhibitors. However, we want to emphasize that the main conclusion (invasion pathway affects the intracellular fate/phagosomal escape) was demonstrated without the use of inhibitors or genetic ablation in two key experiments (Figure 4g/h). These experiments were in line with the results we obtained with inhibitors (amitriptyline [Supp. Figure 4e], ARC39, PCK310 [Figure 4c] and Vacuolin-1 [Supp. Figure 4f]). Importantly, the hypothesis was also supported by another key experiment, in which we showed that the intracellular fate of bacteria is affected by removal of SM from the plasma membrane before invasion, but not by removal of SM from phagosomal membranes after bacterial internalization (Figure 4d-f). Taken together, we thus believe that the main hypothesis is strongly supported by our data.

      Moreover, we either used different inhibitors for the same molecule (ASM was inhibited by ARC39, amitriptyline and PCK310 with similar outcome) or supported our hypothesis with gene-ablated cell pools (TPC1, Syt7, SARM1), as we will point out in more detail below.

      Firstly, the release of calcium from lysosomes is not demonstrated. Localized changes in the immediate vicinity of lysosomes need to be measured to ascertain that these organelles are the source of cytosolic calcium changes. In fact, 9-phenanthrol, which the authors find to be the most potent inhibitor of invasion and hence of the putative calcium changes, is not a blocker of lysosomal calcium release but instead blocks plasmalemmal TRPM4 channels. On the other hand, invasion is seemingly independent of external calcium. These findings are inconsistent with each other and point to non-specific effects of 9-phenanthrol. The fact that ionomycin decreases invasion efficiency is taken as additional evidence of the importance of lysosomal calcium release. It is not clear how these observations support involvement of lysosomal calcium release and exocytosis; in fact treatment with the ionophore should itself have induced lysosomal exocytosis and stimulated, rather than inhibited invasion. Yet, manipulations that increase and others that decrease cytosolic calcium both inhibited invasion.

      With respect to lysosomal Ca2+ release, we agree with the reviewer that direct visual demonstration of lysosomal Ca2+ release upon infection will improve the manuscript. We will therefore perform additional experiments to show alterations of Ca2+ at the lysosomes during infection.

      As to the TRPM4 involvement in S. aureus host cell internalization, it has been reported that TRPM4 is activated by cytosolic Ca2+. However, the channel conducts monovalent cations such as K+ or Na+ but is impermeable to Ca2+ (refs. 1, 2). The following observations support this:

      i) S. aureus invasion is dependent on intracellular Ca2+, but is independent of extracellular Ca2+ (Figure 1c).

      ii) 9-phenanthrol treatment reduces S. aureus internalization by host cells, illustrating the dependence of this process on TRPM4 (Figure 1b). We therefore hypothesize that TRPM4 is activated by Ca2+ released from lysosomes (see above).

      TRPM4 is localized to focal adhesions and is connected to the actin cytoskeleton (refs. 3, 4), a prerequisite for host cell entry of S. aureus (refs. 5, 6). This argues for an important function of TRPM4 in the uptake of S. aureus in general; TRPM4 need not be involved exclusively in the rapid uptake pathway.

      TRPM4 itself is not permeable to Ca2+ but is activated by the cation. Thus, it is unlikely to cause lysosomal exocytosis. The stronger reduction of bacterial uptake by treatment with 9-phenanthrol when compared to Ned19 may thus be caused by the involvement of TRPM4 in additional pathways of S. aureus host cell entry involving the association of TRPM4 with focal adhesions or, as pointed out by the reviewer, by unspecific side effects of 9-phenanthrol that we currently cannot exclude. We will include this information in the revised manuscript.

      Regarding the reduced S. aureus invasion after ionomycin treatment, we agree with the reviewer that ionomycin is known to lead to lysosomal exocytosis, as was previously shown by others (ref. 7) as well as our laboratory (ref. 8).

      We hypothesized that pretreatment with ionomycin would trigger lysosomal exocytosis and thus would reduce the pool of lysosomes that can undergo exocytosis before host cells are contacted by S. aureus. As a result, we should observe a marked reduction of S. aureus internalization in such “lysosome-depleted cells”, if the lysosomal exocytosis is coupled to bacterial uptake. Our observation of reduced bacterial internalization after ionomycin treatment supports this hypothesis.

      However, ionomycin treatment and S. aureus infection of host cells are distinct processes.

      While ionomycin results in strong global and non-directional lysosomal exocytosis of all “releasable” lysosomes (~5-10% of all lysosomes according to previous observations; ref. 7), we hypothesize that lysosomal exocytosis upon contact with S. aureus only involves a very small proportion of lysosomes at host-bacteria contact sites.

      Since ionomycin disturbs the overall cellular Ca2+ homeostasis, we agree with the reviewer that this does not directly show lysosomal Ca2+ liberation. We will discuss this in more detail in the revised manuscript.

      The proposed role of NAADP is based on the effects of "knocking out" TPC1 and on the pharmacological effects of Ned-19. It is noteworthy that TPC2, rather than TPC1, is generally believed to be the primary TPC isoform of lysosomes. Moreover, the gene ablation accomplished in the TPC1 "knockouts" is only partial and rather unsatisfactory. Definitive conclusions about the role of TPC1 can only be reached with proper, full knockouts. Even the pharmacological approach is unconvincing because the high doses of Ned-19 used should have blocked both TPC isoforms and presumably precluded invasion. Instead, invasion is reduced by only ≈50%. A much greater inhibition was reported using 9-phenanthrol, the blocker of plasmalemmal calcium channels. How is the selective involvement of lysosomal TPC1 channels justified?

      As to partial gene ablation of TPC1: To avoid clonal variance, we usually perform pool sorting to obtain a cell population that predominantly contains cells deficient in the target gene (here, TPC1) but also a small proportion of wildtype cells, as seen by the residual TPC1 protein on the Western blot. We observe a significant reduction of bacterial uptake in this cell pool, suggesting that the uptake reduction in a pure K.O. population may be even larger.

      As to the inhibition by Ned19: We agree with the reviewer that Ned19 inhibits TPC1 and TPC2. Since ablation of TPC1 reduced invasion of S. aureus, we concluded that TPC1 is important for S. aureus host cell invasion. We nevertheless agree with the reviewer that a role for TPC2 cannot be excluded. We will clarify this in the revised manuscript. It needs to be noted, however, that deficiency in either TPC1 or TPC2 alone was sufficient to prevent Ebola virus infection (ref. 9), which is in line with our observations.

      The 50% reduction of invasion upon Ned19 treatment (Figure 1d) is comparable with the reduction caused by other compounds that influence the ASM-dependent pathway (such as amitriptyline, ARC39 [Figure 2c], BAPTA-AM [Figure 1c], Vacuolin-1 [Figure 2a], β-toxin [Figure 2e] and ionomycin [Figure 1a]). Further, the partial reduction of invasion is most likely due to the concurrent activity of multiple internalization pathways, not all of which are targeted by the compounds used.

      Invoking an elevation of NAADP as the mediator of calcium release requires measurements of the changes in NAADP concentration in response to the bacteria. This was not performed. Instead, the authors analyzed the possible contribution of putative NAADP-generating systems and reported that the most active of these, CD38, was without effect, while the elimination of SARM1, another potential source of NAADP, had a very modest (≈20%) inhibitory effect that may have been due to clonal variation, which was not ruled out. In view of these data, the conclusion that NAADP is involved in the invasion process seems unwarranted.

      Our results from two independent experimental set-ups (Ned19 [Figure 1d] and TPC1 K.O. [Figure 1e & Figure 2f]) indicate the involvement of NAADP in the process. However, the measurement of NAADP concentration is non-trivial. We can nonetheless rule out clonal variation in the SARM1 mutant, since experiments were conducted with a cell pool, as described above, precisely to avoid the clonal variation of single clones.

      The mechanism behind biosynthesis of NAADP is still debated. CD38 was the first enzyme discovered to possess the ability to produce NAADP. However, it requires acidic pH to produce NAADP (ref. 10), which does not match the characteristics of a cytosolic NAADP producer. HeLa cells do not express CD38 and hence, it is not surprising that inhibition of CD38 had no effect on S. aureus invasion in HeLa cells. However, NAADP production by HeLa cells was observed in the absence of CD38 (ref. 11). Thus, CD38-independent NAADP generation is likely. SARM1 can produce NAADP at neutral pH (ref. 12) and is expressed in HeLa cells, thus providing a more promising candidate.

      We agree with the reviewer that the reduction of S. aureus internalization after ablation of SARM1 is less pronounced than in our other experiments. This may be explained by NAADP originating from other enzymes, such as the recently discovered DUOX1, DUOX2, NOX1 and NOX2 (ref. 13), which, with the exception of DUOX2, show low expression even in HeLa cells. We will discuss this in the revised manuscript.

      The involvement of lysosomal secretion is, again, predicated largely on the basis of pharmacological evidence. No direct evidence is provided for the insertion of lysosomal components into the plasma membrane, or for the release of lysosomal contents to the medium. Instead, inhibition of lysosomal exocytosis by vacuolin-1 is the sole source of evidence. However, vacuolin-1 is by no means a specific inhibitor of lysosomal secretion: it is now known to act primarily as a PIKfyve inhibitor and to cause massive distortion of the endocytic compartment, including gross swelling of endolysosomes. The modest (20-25%) inhibition observed when using synaptotagmin 7 knockout cells is similarly not convincing proof of the requirement for lysosomal secretion.

      We agree that the manuscript will strongly benefit from a functional analysis of lysosomal exocytosis. We will therefore conduct assays to investigate exocytosis in the revision. However, we previously i) showed by addition of specific antisera that LAMP1 is transiently exposed on the plasma membrane during ionomycin and pore-forming toxin challenge and ii) demonstrated the release of ASM activity into the culture medium under these conditions (ref. 8). Neither measurement is compatible with S. aureus infection. LAMP1 antibodies are also bound non-specifically by protein A and another IgG-binding protein on the S. aureus surface, which would bias the results. Since protein A also serves as an adhesin, we cannot simply delete the ORF without changing other aspects of staphylococcal virulence. Further, FBS contains an ASM background activity that impedes activity measurements of cell culture medium. We previously removed this background activity by a specific heat-inactivation protocol (ref. 8). However, S. aureus invasion is strongly reduced in culture medium containing this heat-inactivated FBS.

      We agree with the reviewer that Vacuolin-1 has unspecific side effects. We will address this in the revised version of the manuscript.

      As to the involvement of synaptotagmin 7:

      Synaptotagmin 7 is not the only protein possibly involved in Ca2+-dependent exocytosis. For instance, SYT1 has been shown to possess an overlapping function (ref. 14). This may explain the discrepancy between our vacuolin-1 and SYT7 ablation experiments. We will add a corresponding section to the discussion.

      ASM is proposed to play a central role in the rapid invasion process. As above, most of the evidence offered in this regard is pharmacological and often inconsistent between inhibitors or among cell types. Some drugs affect some of the cells, but not others. It is difficult to reach general conclusions regarding the role of ASM. The argument is made even more complex by the authors' use of exogenous sphingomyelinase (beta-toxin). Pretreatment with the toxin decreased invasion efficiency, a seemingly paradoxical result. Incidentally, the effectiveness of the added toxin is never quantified/validated by directly measuring the generation of ceramide or the disappearance of SM.

      Although pharmacological inhibitors can have unspecific side effects, we want to emphasize that the inhibitors used in our study act on the enzyme ASM by completely different mechanisms. Amitriptyline is a so-called functional inhibitor of ASM (FIASMA), which induces the detachment of ASM from lysosomal membranes, resulting in degradation of the enzyme (ref. 15). By contrast, ARC39 is a competitive inhibitor (refs. 16, 17).

      We do not see inconsistencies in our data obtained with ASM inhibitors. Amitriptyline and ARC39 both reduce the invasion of S. aureus in HuLEC, HuVEC and HeLa cells (Figure 2c). ARC39 needs a longer pre-incubation, since its uptake by host cells is slower (data not shown). We observe a different outcome in 16HBE14o- and Ea.Hy 926 cells, with 16HBE14o- even demonstrating a slightly increased invasion of S. aureus upon ARC39 treatment. Amitriptyline had no effect (Figure 2c). Moreover, both inhibitors affected the invasion dynamics (Figure 3d), phagosomal escape (Figure 4c and Supp. Figure 4e) and Rab7 recruitment (Figure 4a and Supp. Figure 4b) in a similar fashion. Proper inhibition of ASM by both compounds in all cell lines used was validated by enzyme assays (Supp. Figure 2e), which suggests that the ASM-dependent pathway exists only in specific cell lines. This may also serve as an argument that we are not observing unspecific side effects of the compounds here. We will clarify this in the revised manuscript.

      ASM is a key player in SM degradation and recycling. In a clinical context, deficiency in ASM results in the so-called Niemann-Pick disease type A/B. The lipid profile of ASM-deficient cells is massively altered (ref. 18), which will result in severe side effects. Short-term inhibition by small molecules therefore offers a clear benefit when compared to the use of ASM K.O. cells.

      As to the treatment with a bacterial sphingomyelinase:

      Treatment with the bacterial SMase (bSMase, here: β-toxin) was performed in two different ways:

      i) Pretreatment of host cells with β-toxin to remove SM from the host cell surface before infection. This removes the substrate of ASM from the cell surface prior to addition of the bacteria (Figure 2e, Figure 4d-f). Since SM is not present on the extracellular plasma membrane leaflet after treatment, a release of ASM cannot cause localized ceramide formation at the sites of lysosomal exocytosis. Similar observations were made by others (ref. 19).

      ii) Addition of bSMase to host cells together with the bacteria to complement for the absence of ASM (Figure 2f).

      Removal of the ASM substrate before infection (i) prevents localized ASM-mediated conversion of SM to Cer during infection and resulted in decreased invasion, while addition of the SMase during infection (ii) resulted in increased invasion in TPC1- and SYT7-ablated cells. Thus, both experiments are consistent with each other and in line with our other observations.

      Removal of SM from the plasma membrane by β-toxin was indirectly demonstrated by the absence of Lysenin recruitment to phagosomes/escaped bacteria when host cells were pretreated with the toxin before infection (Figure 4f). In another publication, we recently quantified the effectiveness of β-toxin treatment, albeit with slightly longer treatment times (75 min vs. 3 h) (ref. 20). We will repeat the measurements for shorter treatment times as well.

      To clarify our experimental approaches for readers, we will add an explanatory section to the revised manuscript.

      As to the general conclusions regarding the role of ASM: ASM and lysosomal exocytosis have been shown to be involved in the uptake of a variety of pathogens (refs. 19, 21-25), supporting their role in the process.

      The use of fluorescent analogs of sphingomyelin and ceramide is not well justified and it is unclear what conclusions can be derived from these observations. Despite the low resolution of the images provided, it appears as if the labeled lipids are largely in endomembrane compartments, where they would presumably be inaccessible to the secreted ASM. Moreover, considering the location of the BODIPY probe, the authors would be unable to distinguish intact sphingomyelin from its breakdown product, ceramide. What can be concluded from these experiments? Incidentally, the authors report only 10% of BODIPY-positive events after 10 min. What are the implications of this finding? That 90% of the invasion events are unrelated to sphingomyelin, ASM, and ceramide?

      During the experiments with fluorescent SM analogues (Figure 3a,b), S. aureus was added to the samples immediately before the start of video recording. Hence, bacteria slowly trickle onto the host cells, and we can image the initial contact between host cells and bacteria. For instance, the bacteria depicted in Figure 3a contact the host cell about 9 min before becoming BODIPY-FL-positive (see Supp. Video 1, 55 min). We therefore think that in these cases we see the formation of phagosomes around bacteria rather than bacteria in endomembrane compartments. Since phagosomes are generated at the plasma membrane, SM is accessible to secreted ASM.

      The “trickling” approach to infection is an experimental difference from our invasion measurements, in which we synchronized the infection by a very slow centrifugation. This ensures that all bacteria have contact with host cells and are not just floating in the culture medium. However, live cell imaging of the initial bacteria-host contact cannot technically be combined with synchronization of infection.

      In our invasion measurements (with synchronization), we typically see internalization of ~20% of all added bacteria after 30 min. Hence, most bacteria that are visible in our videos are likely still extracellular, and only a small proportion has been internalized. This explains why only 10% of total bacteria are positive for BODIPY-FL-SM after 10 min. The proportion of internalized bacteria that are positive for BODIPY-FL-SM should be considerably higher but cannot be determined with this method.

      We agree with the reviewer that we cannot observe conversion of BODIPY-FL-SM by ASM. To do so, we attempted to visualize the conversion of a visible-range SM FRET probe (Supp. Figure 3), but the structure of the probe is not compatible with measuring conversion on the plasma membrane, since the FITC fluorophore released into the culture medium by ASM activity is lost for imaging. In general, visualizing SM conversion with subcellular resolution is challenging, and even with novel tools developed in our lab (ref. 26), visualization of SM on the plasma membrane is difficult.

      The conclusions we draw from these experiments are that i) S. aureus invasion is associated with SM and ii) SM-associated invasion can be very fast, since bacteria are rapidly engulfed by BODIPY-FL-SM-containing membranes.

      It is also unclear how the authors can distinguish lysenin entry into ruptured vacuoles from the entry of RFP-CWT, used as a criterion of bacterial escape. Surely the molecular weights of the probes are not sufficiently different to prevent the latter one from traversing the permeabilized membrane until such time that the bacteria escape from the vacuole.

      We here want to clarify that both the Lysenin and the CWT reporters have access to ruptured vacuoles (Figure 4b). We used the Lysenin reporter in these experiments to estimate the SM content of phagosomal membranes. If a vacuole is ruptured, both the bacteria and the luminal leaflet of the phagosomal membrane remnants come into contact with the cytosol and hence with the cytosolically expressed reporters YFP-Lysenin and RFP-CWT, resulting in “Lysenin-positive escape” when phagosomes contained SM (see Figure 4f). By contrast, either β-toxin expression by S. aureus or pretreatment with the bSMase resulted in absence of Lysenin recruitment, suggesting that the phagosomal SM levels were decreased or undetectable (Figure 4f, Supp. Figure 5f, g, i, j).

      This approach does not enable a quantitative measurement of phagosomal SM; rather, it gives a “yes or no” answer. However, we think this method is sufficient to show that β-toxin expression and pretreatment markedly decreased phagosomal SM levels in the host cells.

      The approach we used here to analyze “Lysenin-positive escape” can clearly be distinguished from Lysenin-based methods that were used by others (ref. 27). In that work, Lysenin was used to show trans-bilayer movement of SM before rupture of bacteria-containing phagosomes.

      To clarify the function of Lysenin in our approach we will add an additional figure to the revised manuscript.

      Both SMase inhibitors (Figure 4C) and SMase pretreatment increased bacterial escape from the vacuole. The former should prevent SM hydrolysis and formation of ceramide, while the latter treatment should have the exact opposite effects, yet the end result is the same. What can one conclude regarding the need and role of the SMase products in the escape process?

      As pointed out above, pretreatment of host cells with SMase removes SM from the plasma membrane, so ASM does not have access to its substrate. Thus, treatment with ASM inhibitors and pretreatment with bacterial SMase both prevent ASM from being active on the plasma membrane and thereby block the ASM-dependent uptake (Figure 2c, e). Although overall fewer bacteria were internalized by host cells under these conditions, the bacteria that did invade host cells did so in an ASM-independent manner.

      Since blockage of the ASM-dependent internalization pathway (with ASM inhibitor [Figure 4c], SMase pretreatment [Figure 4e] and Vacuolin-1 [Supp. Figure 4f]) always resulted in enhanced phagosomal escape, we conclude that bacteria that were internalized in an ASM-independent fashion show enhanced escape. Conversely, bacteria that enter host cells in an ASM-dependent manner demonstrate lower escape rates.

      This is supported by comparing the escape rates of “early” and “late” invaders (Figure 4g/h), which in our opinion is a key experiment that supports this hypothesis. The “early” invaders are predominantly ASM-dependent (see e.g. Figure 3e). Thus, bacteria that entered host cells in the first 10 min of infection should have been internalized predominantly in an ASM-dependent fashion, while slower entry pathways are active later during infection. The early, ASM-dependent invaders possessed lower escape rates, which is in line with the data obtained with inhibitors (e.g. Figure 4c and Supp. Figure 4f).

      We hypothesize that the activity of ASM on the plasma membrane during invasion mediates the recruitment of a specific subset of receptors, which then influence downstream phagosomal maturation and escape. This hypothesis is supported by the fact that the subset of receptors interacting with S. aureus is altered upon inhibition of the ASM-dependent uptake pathway. We describe this in another study that is currently under evaluation elsewhere.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Ruhling et al propose a rapid uptake pathway that is dependent on lysosomal exocytosis, lysosomal Ca2+ and acid sphingomyelinase, and further suggest that the intracellular trafficking and fate of the pathogen is dictated by the mode of entry.

      The evidence provided is solid, methods used are appropriate and results largely support their conclusions, but can be substantiated further as detailed below. The weakness is a reliance on chemical inhibitors that can be non-specific to delineate critical steps.

      Specific comments:

      A large number of experiments rely on treatment with chemical inhibitors. While this approach is reasonable, many of the inhibitors employed such as amitriptyline and vacuolin1 have other or non-defined cellular targets and pleiotropic effects cannot be ruled out. Given the centrality of ASM for the manuscript, it will be important to replicate some key results with ASM KO cells.

      We thank the reviewer for the critical evaluation of our manuscript and the many constructive comments.

      We agree with the reviewer that functional inhibitors of ASM (FIASMAs) such as amitriptyline, which we used in our study, have unspecific side effects given their mode of action. FIASMAs induce the detachment of ASM from lysosomal membranes, resulting in degradation of the enzyme.15 However, we want to emphasize that we also used the competitive inhibitor ARC39 in our study,16, 17 which acts on the enzyme by a completely different mechanism. All phenotypes (reduced invasion [Figure 2c, d], effect on invasion dynamics [Figure 3d], enhanced escape [Figure 4c and Supp Figure 4e] and differential recruitment of Rab7 [Supp. Figure 4b]) were observed with both inhibitors, thereby supporting the role of ASM in the process.

      We further agree that genetic evidence usually supports and strengthens scientific findings. However, ASM is a key cellular player in SM degradation and recycling. In a clinical context, ASM deficiency results in Niemann-Pick disease type A/B. The lipid profile of ASM-deficient cells is massively altered,18 which in itself will result in severe side effects. Thus, the use of inhibitors provides a clear benefit over ASM KO cells, since ASM activity can be targeted in a short-term fashion, thereby preventing larger alterations in cellular lipid composition.

      Most experiments are done in HeLa cells. Given the pathway is projected as generic, it will be important to further characterize cell type specificity for the process. Some evidence for a similar mechanism in other cell types S. aureus infects, perhaps phagocytic cell type, might be good.

      Whenever possible we performed the experiments not only in HeLa but also in HuLECs. For example, we refer to experiments concerning the role of Ca2+ (Figure 1c/Supp. Figure 1e), lysosomal Ca2+/Ned19 (Figure 1d/Supp. Figure 1g), lysosomal exocytosis/Vacuolin-1 (Figure 2a/Supp. Figure 2a), ASM/ARC39 and amitriptyline (Figure 2c), surface SM/β-toxin (Figure 2e/Supp. Figure 2g), analysis of invasion dynamics (complete Figure 3) and measurement of cell death during infection (Figure 5c-e, Supp. Figure 6a+b).

      HuLECs, however, are not readily amenable to genetic manipulation. We were therefore unable to generate gene deletions in these cells, and the cells do not grow readily after introduction of the fluorescence escape reporter.

      As to ASM involvement in phagocytic cells: a role for ASM during the uptake of S. aureus by macrophages was previously reported by others.23 However, in professional phagocytes S. aureus does not escape from the phagosome and replicates within the vacuole.28

      I'm a little confused about the role of ASM on the surface. Presumably, it converts SM to ceramide, as the final model suggests. Overexpression of b-toxin results in the near complete absence of SM on phagosomes (having representative images will help appreciate this), but why is phagosomal SM detected at high levels in untreated conditions? If bacteria are engulfed by SM-containing membrane compartments, what role does ASM play on the surface? If surface SM is necessary for phagosomal escape within the cell, do the authors imply that ASM is tuning the surface SM levels to a certain optimal range? Alternatively, can there be additional roles for ASM on the cell surface? Can surface SM levels be visualized (for example, in Figure 4 E, F)?

      We initially hypothesized that we would detect higher phagosomal SM levels upon inhibition of ASM, since our model suggests SM cleavage by ASM on the host cell surface during bacterial cell entry. However, we did not detect any changes in our experiments (Supp. Figure 4d). We currently favor the following explanation: SM is the most abundant sphingolipid in human cells.29 If peripheral lysosomes are exocytosed and thereby release ASM, only a localized and relatively small proportion of SM may get converted to Cer, which most likely is below our detection limit. In addition, the detection of cytosolically exposed phagosomal SM by YFP-Lysenin is not quantitative and provides a “Yes or No” measurement. Hence, we think that the rather limited SM to Cer conversion in combination with the high abundance of SM in cellular membranes does not visibly affect the recruitment of the Lysenin reporter.

      In our experiments that employ BODIPY-FL-SM (Figure 3a+b), we cannot distinguish between native SM and downstream metabolites such as Cer. Hence, again we cannot make any assumptions on the extent to which SM is converted on the surface during bacterial internalization. Although our laboratory recently used trifunctional sphingolipid analogs to analyze the SM to Cer conversion20, the visualization of this process on the plasma membrane is currently still challenging.

      Overall, we hypothesize that the localized generation of Cer on the surface by released ASM leads to generation of Cer-enriched platforms. Subsequently, a certain subset of receptors may be recruited to these platforms and influence the uptake process. These platforms are supposed to be very small, which also would explain that we did not detect changes in Lysenin recruitment.

      Related to that, why is ASM activity on the cell surface important? Its role in non-infectious or other contexts can be discussed.

      ASM release by lysosomal exocytosis is implicated in plasma membrane repair upon injury. We will discuss this in the revised version of the manuscript.

      If SM removal is so crucial for uptake, can exocytosis of lysosomes alone provide sufficient ASM for SM removal? How much or to what extent is lysosomal exocytosis enhanced by initial signaling events? Do the authors envisage the early events in their model happening in localized confines of the PM, this can be discussed.

      Ionomycin treatment led to a release of ~10 % of all lysosomes and also increased extracellular ASM activity.7, 8 However, to our knowledge, it is currently unclear to what extent the released ASM affects surface SM levels. It is also unknown what percentage of the lysosomes is released during infection with S. aureus. However, one has to speculate that this will be only a fraction of the “releasable lysosomes”, as we assume that the effects (lysosomal Ca2+ liberation, lysosomal exocytosis and ASM activity) are very localized and take place only at host-pathogen contact sites (see also above). In initial experimentation we attempted to visualize the local ASM activity on the cell surface by using a visible range FRET probe (Supp. Fig. 3). Cleavage of the probe by ASM on the surface leads to release of FITC into the cell culture medium, which does not contribute a measurable signal at the surface.

      How are inhibitor doses determined? How efficient is the removal of extracellular bacteria at 10 min? It will be good to substantiate the cfu experiments for infectivity with imaging-based methods. Are the roles of TPC1 and TPC2 redundant? If so, why does silencing TPC1 alone result in a decrease in infectivity? For these and other assays, it would be better to show raw values for infectivity. Please show alterations in lysosomal Ca2+ at the doses of inhibitors indicated. Is lysosomal Ca2+ released upon S. aureus binding to the cell surface? Will be good to directly visualize this.

      Concerning the inhibitor concentrations, we either used values established in published studies or recommendations of the suppliers (e.g. 2-APB, Ned19, Vacuolin-1). For ASM inhibitors, we determined proper inhibition of ASM by activity assays. Concentrations of ionomycin resulting in Ca2+ influx and lysosomal exocytosis were determined in earlier studies of our lab.8, 30

      As to the removal of bacteria at 10 min p.i.: Lysostaphin is very efficient for removal of extracellular S. aureus and sterilizes the tissue culture supernatant. It significantly lyses bacteria within a few minutes, as determined by turbidity assays.31

      As to imaging-based infectivity assays: We will add an analysis of imaging-based invasion assays in the revised manuscript.

      Regarding the roles of TPC1 and TPC2: from our data we cannot conclude whether the roles of TPC1 and TPC2 are redundant. One could speculate that since blockage of TPC1 alone is sufficient to reduce internalization of bacteria, that both channels may have distinct roles. On the other hand, there might be a Ca2+ threshold in order to initiate lysosomal exocytosis that can only be attained if TPC1 and TPC2 are activated in parallel. Thus, our observations are in line with another study that shows reduced Ebola virus infection in absence of either TPC1 or TPC2.32

      As to raw CFU counts: whereas the observed effects upon blocking the invasion of S. aureus are stable, the number of internalized bacteria varies between individual biological replicates, for instance because of differences in host cell fitness or growth differences in bacterial cultures, which are prepared freshly for each experiment.

      With respect to visualization of lysosomal Ca2+ release: we agree with the reviewer that direct visual demonstration of lysosomal Ca2+ release upon infection will improve the manuscript. We therefore will perform additional experimentation to show alterations of Ca2+ at the lysosomes during infection.

      The precise identification of cytosolic vs phagosomal bacteria is not very easy to appreciate. The methods section indicates how this distinction is made, but how do the authors deal with partial overlaps and ambiguities generally associated with such analyses? Please show respective images. The number of events (individual bacteria) for the live cell imaging data should be clearly mentioned.

      We apologize for not having sufficiently explained the technology to detect escaped S. aureus. The cytosolic location of S. aureus is indicated by recruitment of RFP-CWT.33 CWT is the cell wall targeting domain of lysostaphin, which efficiently binds to the pentaglycine cross bridge in the peptidoglycan of S. aureus. This reporter is exclusively and homogeneously expressed in the host cytosol. Only upon rupture of phagoendosomal membranes can the reporter be recruited to the cell wall of the now cytosolically located bacteria. S. aureus mutants, for instance in the agr quorum sensing system, cannot break down the phagosomal membrane in non-professional phagocytes and thus stay unlabeled by the CWT-reporter.33 We will include respective images/movies of escape events and the bacteria numbers for live cell experiments in the revised version of the manuscript.

      In the phagosome maturation experiments, what is the proportion of bacteria in Rab5 or Rab7 compartments at each time point? Will the decreased Rab7 association be accompanied by increased Rab5? Showing raw values and images will help appreciate such differences. Given the expertise and tools available in live cell imaging, can the authors trace Rab5 and Rab7 positive compartment times for the same bacteria?

      We will include the proportion of Rab7-associated bacteria in the revised manuscript. Usually, we observe that Rab5 is only transiently (for a few minutes) present on phagosomes and only afterwards the phagosomes become positive for Rab7. We do not think that a decrease in Rab7-positive phagosomes would increase the proportion of Rab5-positive phagosomes. However, we cannot exclude this hypothesis with our data.

      We can achieve tracing of individual bacteria for recruitment of Rab5/Rab7 only manually, which impedes a quantitative evaluation. However, we will include information that illustrates the consecutive recruitment of the GTPases.

      The results with longer-term infection are interesting. Live cell imaging suggests that ASM-inhibited cells show accelerated phagosomal escape that reduces by 6 hpi. Where are the bacteria at this time point? Presumably, they should have reached lysosomes. The relationship between cytosolic escape, replication, and host cell death is interesting, but the evidence, as presented, is correlative for the populations. Given the use of live cell imaging, can the authors show these events in the same cell?

      We think that most bacteria-containing phagoendosomes should have fused with lysosomes 6 h p.i. as we have previously shown by acidification to pH of 5 and LAMP1 decoration.34

      We will provide images/videos to show the correlation between escape and replication in the revised manuscript.

      Given the inherent heterogeneity in uptake processes and the use of inhibitors in most experiments, the distinction between ASM-dependent and independent pathways might not be as clear-cut as the authors suggest. Some caution here will be good. Can the authors estimate what fraction of intracellular bacteria are taken up ASM-dependent?

      We agree with the reviewer that an overlap between internalization pathways is likely. A clear distinction is therefore certainly non-trivial. As an alternative to distinct ASM-dependent and ASM-independent pathways, ASM activity may also accelerate one or several internalization pathways. We will address this limitation in the revised manuscript.

      Early in infection (~10 min after contact with the cells), the proportion of bacteria that enter host cells ASM-dependently is relatively high, amounting to roughly 75% in HuLEC. After 30 min, this proportion decreases to about 50%. We will include this information in the revised version of the manuscript.

      References

      (1) Launay, P. et al. TRPM4 Is a Ca2+-Activated Nonselective Cation Channel Mediating Cell Membrane Depolarization. Cell 109, 397-407 (2002).

      (2) Nilius, B. et al. The Ca2+-activated cation channel TRPM4 is regulated by phosphatidylinositol 4,5-biphosphate. The EMBO Journal 25, 467-478 (2006).

      (3) Cáceres, M. et al. TRPM4 Is a Novel Component of the Adhesome Required for Focal Adhesion Disassembly, Migration and Contractility. PLoS One 10, e0130540 (2015).

      (4) Silva, I., Brunett, M., Cáceres, M. & Cerda, O. TRPM4 modulates focal adhesion-associated calcium signals and dynamics. Biophysical Journal 123, 390a (2024).

      (5) Schlesier, T., Siegmund, A., Rescher, U. & Heilmann, C. Characterization of the Atl-mediated staphylococcal internalization mechanism. International Journal of Medical Microbiology 310, 151463 (2020).

      (6) Jevon, M. et al. Mechanisms of Internalization of Staphylococcus aureus by Cultured Human Osteoblasts. Infection and Immunity 67, 2677-2681 (1999).

      (7) Rodriguez, A., Webster, P., Ortego, J. & Andrews, N.W. Lysosomes behave as Ca2+-regulated exocytic vesicles in fibroblasts and epithelial cells. J Cell Biol 137, 93-104 (1997).

      (8) Krones & Rühling et al. Staphylococcus aureus alpha-Toxin Induces Acid Sphingomyelinase Release From a Human Endothelial Cell Line. Front Microbiol 12, 694489 (2021).

      (9) Sakurai, Y. et al. Two-pore channels control Ebola virus host cell entry and are drug targets for disease treatment. Science 347, 995-998 (2015).

      (10) Aarhus, R., Graeff, R.M., Dickey, D.M., Walseth, T.F. & Lee, H.C. ADP-ribosyl cyclase and CD38 catalyze the synthesis of a calcium-mobilizing metabolite from NADP. J Biol Chem 270, 30327-30333 (1995).

      (11) Schmid, F., Fliegert, R., Westphal, T., Bauche, A. & Guse, A.H. Nicotinic acid adenine dinucleotide phosphate (NAADP) degradation by alkaline phosphatase. J Biol Chem 287, 32525-32534 (2012).

      (12) Angeletti, C. et al. SARM1 is a multi-functional NAD(P)ase with prominent base exchange activity, all regulated by multiple physiologically relevant NAD metabolites. iScience 25, 103812 (2022).

      (13) Gu, F. et al. Dual NADPH oxidases DUOX1 and DUOX2 synthesize NAADP and are necessary for Ca(2+) signaling during T cell activation. Sci Signal 14, eabe3800 (2021).

      (14) Schonn, J.-S., Maximov, A., Lao, Y., Südhof, T.C. & Sørensen, J.B. Synaptotagmin-1 and -7 are functionally overlapping Ca2+ sensors for exocytosis in adrenal chromaffin cells. Proceedings of the National Academy of Sciences 105, 3998-4003 (2008).

      (15) Kornhuber, J. et al. Functional Inhibitors of Acid Sphingomyelinase (FIASMAs): a novel pharmacological group of drugs with broad clinical applications. Cell Physiol Biochem 26, 9-20 (2010).

      (16) Naser, E. et al. Characterization of the small molecule ARC39, a direct and specific inhibitor of acid sphingomyelinase in vitro. J Lipid Res 61, 896-910 (2020).

      (17) Roth, A.G. et al. Potent and selective inhibition of acid sphingomyelinase by bisphosphonates. Angew Chem Int Ed Engl 48, 7560-7563 (2009).

      (18) Schuchman, E.H. & Desnick, R.J. Types A and B Niemann-Pick disease. Mol Genet Metab 120, 27-33 (2017).

      (19) Miller, M.E., Adhikary, S., Kolokoltsov, A.A. & Davey, R.A. Ebolavirus Requires Acid Sphingomyelinase Activity and Plasma Membrane Sphingomyelin for Infection. Journal of Virology 86, 7473-7483 (2012).

      (20) M. Rühling, L.K., F. Wagner, F. Schumacher, D. Wigger, D. A. Helmerich, T. Pfeuffer, R. Elflein, C. Kappe, M. Sauer, C. Arenz, B. Kleuser, T. Rudel, M. Fraunholz, J. Seibel Trifunctional sphingomyelin derivatives enable nanoscale resolution of sphingomyelin turnover in physiological and infection processes via expansion microscopy. Nat Commun accepted in principle (2024).

      (21) Peters, S. et al. Neisseria meningitidis Type IV Pili Trigger Ca(2+)-Dependent Lysosomal Trafficking of the Acid Sphingomyelinase To Enhance Surface Ceramide Levels. Infect Immun 87 (2019).

      (22) Grassmé, H. et al. Acidic sphingomyelinase mediates entry of N. gonorrhoeae into nonphagocytic cells. Cell 91, 605-615 (1997).

      (23) Li, C. et al. Regulation of Staphylococcus aureus Infection of Macrophages by CD44, Reactive Oxygen Species, and Acid Sphingomyelinase. Antioxid Redox Signal 28, 916-934 (2018).

      (24) Fernandes, M.C. et al. Trypanosoma cruzi subverts the sphingomyelinase-mediated plasma membrane repair pathway for cell invasion. J Exp Med 208, 909-921 (2011).

      (25) Luisoni, S. et al. Co-option of Membrane Wounding Enables Virus Penetration into Cells. Cell Host & Microbe 18, 75-85 (2015).

      (26) Rühling, M. et al. Trifunctional sphingomyelin derivatives enable nanoscale resolution of sphingomyelin turnover in physiological and infection processes via expansion microscopy. Nature Communications 15, 7456 (2024).

      (27) Ellison, C.J., Kukulski, W., Boyle, K.B., Munro, S. & Randow, F. Transbilayer Movement of Sphingomyelin Precedes Catastrophic Breakage of Enterobacteria-Containing Vacuoles. Curr Biol 30, 2974-2983 e2976 (2020).

      (28) Moldovan, A. & Fraunholz, M.J. In or out: Phagosomal escape of Staphylococcus aureus. Cell Microbiol 21, e12997 (2019).

      (29) Slotte, J.P. Biological functions of sphingomyelins. Progress in Lipid Research 52, 424-437 (2013).

      (30) Stelzner, K. et al. Intracellular Staphylococcus aureus Perturbs the Host Cell Ca(2+) Homeostasis To Promote Cell Death. mBio 11 (2020).

      (31) Kunz, T.C. et al. The Expandables: Cracking the Staphylococcal Cell Wall for Expansion Microscopy. Front Cell Infect Microbiol 11, 644750 (2021).

      (32) Sakurai, Y. et al. Ebola virus. Two-pore channels control Ebola virus host cell entry and are drug targets for disease treatment. Science 347, 995-998 (2015).

      (33) Grosz, M. et al. Cytoplasmic replication of Staphylococcus aureus upon phagosomal escape triggered by phenol-soluble modulin alpha. Cell Microbiol 16, 451-465 (2014).

      (34) Giese, B. et al. Staphylococcal alpha-toxin is not sufficient to mediate escape from phagolysosomes in upper-airway epithelial cells. Infect Immun 77, 3611-3625 (2009).

    1. Author response:

      Reviewer 1:

      (1) Free energy barriers appear to be very high for a substrate transport process. In Figure 3, the transitions from IF (Inward facing) to OF (Outward facing) state appear to have a barrier of 12 kcal/mol. Other systems with mutant or sodium unbound have even higher barriers. This does not seem consistent with previous studies where transport mechanisms of transporters have been explored using molecular dynamics. 

      First, in Figure 3, the transition from IF to OF state doesn’t have a barrier of 12 kcal/mol. The IFF to OFB transition is almost barrierless, and from OFB to OFF is ~5 kcal/mol, which is also evident in Figure 2.

      If the reviewer was referring to the transition from OFB to IFB states, the barrier is 6.8 kcal/mol (Na+ bound state), and the rate-limiting barrier in the entire sugar transport process (Na+ bound state) is 8.4 kcal/mol, as indicated in Figure 2 and Table 1, which is much lower than the 12 kcal/mol barrier the reviewer mentioned. When the Na+ is unbound, the barrier can be as high as 12 kcal/mol, but it is this high barrier that leads to our conclusion that the Na+ binding is essential for sugar transport, and the 12 kcal/mol barrier indicates an energetically unfavorable sugar translocation process when the Na+ is unbound, which is unlikely to be the major translocation process in nature. 

      Even the 12 kcal/mol barrier reported for the Na+ unbound state is not too high considering the experimentally measured MelB sugar active transport rate, which is estimated to be on the order of 10 to 100 s⁻¹. This range of transport rates is typical for similar MFS transporters such as the lactose permease (LacY), which has an active transport rate of 20 s⁻¹. The free energy barrier associated with the active transport is thus on the order of ~15-16 kcal/mol based on transition state theory, assuming kBT/h as the prefactor. This experimentally estimated barrier is higher than all of our calculated barriers. Our calculated barrier for the sugar translocation with Na+ bound is 8.4 kcal/mol, which means an additional ~7-8 kcal/mol barrier is contributed by the Na+ release process after sugar release in the IFF state. This is a reasonable estimation of the Na+ unbinding barrier.
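
      The conversion from a measured rate to a barrier described above can be sketched as follows. This is a minimal illustration and not the authors' script: it assumes the transition state theory expression k = (kBT/h) exp(-ΔG/RT) with a transmission coefficient of 1, and a temperature of 298.15 K, which the response does not state. The function name barrier_from_rate is hypothetical.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J*s
R_KCAL = 1.987204259e-3  # gas constant, kcal/(mol*K)


def barrier_from_rate(rate_per_s, temperature_k=298.15):
    """Free energy barrier (kcal/mol) from k = (kB*T/h) * exp(-dG/(R*T)).

    The temperature default is an assumption made for this sketch.
    """
    prefactor = K_B * temperature_k / H  # roughly 6.2e12 s^-1 at 298.15 K
    return R_KCAL * temperature_k * math.log(prefactor / rate_per_s)


# Rates quoted in the response: 10 to 100 s^-1 for MelB, 20 s^-1 for LacY.
for rate in (10.0, 20.0, 100.0):
    print(f"k = {rate:6.1f} s^-1 -> dG = {barrier_from_rate(rate):.1f} kcal/mol")
```

      Under these assumptions the three rates map to barriers between roughly 14.7 and 16.1 kcal/mol, consistent with the ~15-16 kcal/mol estimate. Because the barrier depends only logarithmically on the rate, a tenfold change in the rate shifts it by about 1.4 kcal/mol.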

      Therefore, whether the calculated barrier is too high depends on the experimental kinetics measurements, which are often challenging to perform. Based on the existing experimental data, the MFS transporters are usually relatively slow in their active transport cycle. The calculated barrier thus falls within the reasonable range considering the experimentally measured active transport rates.

      (2) Figure 2b: The PMF between images 20-30 shows the conformation change from OF to IF, where the occluded (OC) state is the highest barrier for transition. However, OC state is usually a stable conformation and should be in a local minimum. There should be free energy barriers between OF and OC and in between OC and IF.  

      First, the occluded state (OCB) is not between images 20-30; it is between images 10 and 20. Second, there is no solid evidence that the OCB state is a stable conformation and a local minimum. Existing experimental structures of MFS transporters seldom have the fully occluded state resolved.

      (3) String method pathway is usually not the only transport pathway and alternate lower energy pathways should be explored. The free energy surface looks like it has not deviated from the string pathway. Longer simulations can help in the exploration of lower free energy pathways. 

      We agree with the reviewer that the string method pathway is usually not the only transport pathway and alternate lower energy pathways could exist. However, we also note that even if the fully occluded state is a local minimum and our free energy pathway does visit this missing local minimum after improved sampling, the overall free energy barrier will not be lowered from our current calculated value. This is because the current rate-limiting barrier arises from the transition from the OFB state to the IFF state, and the barrier top corresponds to the sugar molecule passing through the most constricted region in the cytoplasmic region, i.e., the IFC intermediate state visited after the IFB state is reached. Therefore, the free energy difference between the OFB state and the IFC state will not be changed by another hypothetical local minimum between the OFB and IFB states, i.e., the occluded OCB state. In other words, a hypothetical local minimum corresponding to the occluded state, even if it exists, will not decrease the overall rate-limiting barrier and may even increase it further, depending on the depth of the local minimum and the additional barriers of entering and escaping from this new minimum. 
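
      The argument above can be illustrated with a toy calculation. The sketch below assumes that the overall barrier along a one-dimensional profile is the largest uphill climb from any earlier point to any later point; this definition, the function name rate_limiting_barrier, and all profile values are illustrative assumptions and are not taken from the manuscript (only the 8.4 kcal/mol top is set to match the rate-limiting barrier quoted in this response).

```python
def rate_limiting_barrier(pmf):
    """Largest uphill climb F[j] - F[i] with i <= j along a 1D profile."""
    lowest_so_far = pmf[0]
    largest_climb = 0.0
    for value in pmf:
        lowest_so_far = min(lowest_so_far, value)
        largest_climb = max(largest_climb, value - lowest_so_far)
    return largest_climb


# Hypothetical profiles in kcal/mol; index 0 plays the role of the OFB state
# and index 4 the barrier top (IFC). Values are invented for illustration.
baseline = [0.0, 2.0, 4.0, 6.0, 8.4, 2.0]
# Same end points, with an extra shallow minimum inserted on the way up.
with_shallow_minimum = [0.0, 2.0, 1.0, 6.0, 8.4, 2.0]
# Same end points, with an extra minimum deeper than the starting state.
with_deep_minimum = [0.0, 2.0, -1.5, 6.0, 8.4, 2.0]

for name, profile in [
    ("baseline", baseline),
    ("shallow minimum", with_shallow_minimum),
    ("deep minimum", with_deep_minimum),
]:
    print(f"{name:16s} barrier = {rate_limiting_barrier(profile):.1f} kcal/mol")
```

      In this toy model, an added minimum that stays above the starting state leaves the overall barrier unchanged, while one that drops below it raises the barrier. The barrier never decreases as long as the free energies of the starting state and the barrier top are unchanged.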

      (4) The conformational change in transporters from OF to IF state is a complicated multi-step process. First, only 10 images in the string pathway are used to capture the transition from OF to IF state. I am not sure if this number is enough to capture the process. Second, the authors have used a geodesic interpolation algorithm to generate the intermediate images. However, looking at Figure 3B, it looks like the transition pathway has not captured the occluded (OC) conformation, where the transport tunnel is closed at both ends. Transporters typically follow a stepwise conformational change mechanism where OF state transitions to OC and then to IF state. It appears that the interpolation algorithm has created an hourglass-like state, where IF gates are opening and OF gates are closing simultaneously, thereby creating a state where the transport tunnel is open on both sides of the membrane. These states are usually associated with high energy. References 30-42 cited in the manuscript reveal a distinct OC state for different transporters.

      In our simulations, even with 10 initial images representing the OF to IF conformational transition, the occluded state is sampled in the final string pathway. There is an ensemble of snapshots where the extracellular and intracellular gates are both relatively narrower than the OF and IF states, preventing the sugar from leaking into either side of the bulk solution. In contrast to the reviewer’s guess, we never observed an hourglass-like state in our simulation where both gates are open. Figure 3B is a visual representation of the backbone structure of the OCB state without explicitly showing the actual radius of the gating region, which also depends on the side chain conformations. Thus, Figure 3B alone cannot be used to conclude that we are dominantly sampling an hourglass-like intermediate conformation instead of the occluded state, as mentioned by the reviewer. 

      Moreover, not all references in 30-42 have sampled the occluded state since many of them did not even simulate the substrate translocation process at all. For the ones that did sample substrate translocation processes, only two of them were studying the cation-coupled MFS family symporter (ref 38, 40) and they didn’t provide the PMF for the entire translocation process. There is no strong evidence for a stable minimum corresponding to a fully occluded state in these two studies.  In fact, different types of transporters with different coupling cations may exhibit different stability of the fully occluded state. For example, the fully occluded state has been experimentally observed for some MFS transporters, such as multidrug transporter EmrD, but not for others, such as lactose permease LacY. Thus, it is not generally true that a stable, fully-occluded state exists in all transporters, and it highly depends on the specific type of transporter and the coupling ion under study. 

      Reviewer 2:

      The manuscript by Liang and Guan provides an impressive attempt to characterize the conformational free energy landscape of a melibiose permease (MelB), a symporter member of major facilitator superfamily (MFS) of transporters. Although similar studies have been conducted previously for other members of MFS, each member or subfamily has its own unique features that make the employment of such methods quite challenging. While the methodology is indeed impressive, characterizing the coupling between large-scale conformational changes and substrate binding in membrane transporters is quite challenging and requires a sophisticated methodology. The conclusions obtained from the three sets of path-optimization and free energy calculations done by the authors are generally supported by the provided data and certainly add to our understanding of how sodium binding facilitates the transport of melibiose in MelB. However, the data is not generated reliably which questions the relevance of the conclusions as well. I particularly have some concerns regarding the implementation of the methodology that I will discuss below. 

      (1) In enhanced sampling techniques, often much attention is given to the sampling algorithm. Although the sampling algorithm is quite important and this manuscript has chosen an excellent pair: string method with swarms of trajectories (SMwST) and replica-exchange umbrella sampling (REUS) for this task, there are other important factors that must be taken into account. More specifically, the collective variables used and the preparation of initial conformations for sampling. I have objections to both of these (particularly the latter) that I detail below. Overall, I am not confident that the free energy profiles generated (summarized in Figure 5) are reliable, and unfortunately, much of the data presented in this manuscript heavily relies on these free energy profiles.

      Since comments (1) and (2) from this review are related, please see our response to (2) below. 

      (2) The authors state that they have had an advantage over other similar studies in that they had two endpoints of the string to work from experimental data. I agree that this is an advantage. However, this could lead to some dangerous flaws in the methodology if not appropriately taken into account. Proteins such as membrane transporters have many slow degrees of freedom that cannot be fully captured within tens of nanoseconds (90 ns was the simulation time used here for the REUS). Biased sampling allows us to overcome this challenge to some extent, but it is virtually impossible to take into account all slow degrees of freedom in the enhanced sampling protocol (e.g., the collective variables used here do not represent anything related to sidechain dynamics). Therefore, if one mixes initial conformations that come from different initial structures (e.g., an OF state and an IF state from two different PDB files), it is very likely that despite all equilibration and relaxation during SMwST and REUS simulations, the conformations that come from different sources never truly mix. This is dangerous in that it is quite difficult to detect such inconsistencies and from a theoretical point of view it makes the free energy calculations impossible. Methods such as WHAM and its various offshoots all rely on overlap between neighboring windows to calculate the free energy difference between two windows and the overlap should be in all dimensions and not just the ones that we use for biasing. This is related to well-known issues such as hidden barriers and metastability. If one uses two different structures to generate the initial conformations, then the authors need to show their sampling has been long enough to allow the two sets of conformations to mix and overlap in all dimensions, which is a difficult task to do.

      We partly agree with the reviewer in that it is challenging to investigate whether the structures generated from the two different initial structures are sufficiently mixed in terms of orthogonal degrees of freedom outside the CV space during our string method and REUS simulations. We acknowledge that our simulations are within 100 ns for each REUS window, and there could be some slow degrees of freedom that are not fully sampled within this timescale. However, the conjectures and concerns raised by the reviewer are somewhat subjective in that they are almost impossible to disprove completely. In a sense, these concerns are essentially the same as the general suspicion that biomolecular simulation results are not completely converged, which cannot be fully ruled out for relatively complex biomolecular systems in any computational study involving MD simulations. We also note that comparison among the PMFs of different cation bound/unbound states will have some error cancellation effects because of the consistent use of the same sampling methods for all three systems. Our main conclusions regarding the cooperative binding and transport of the two substrates rest on such comparison of the PMFs and additionally on the unbiased MD simulations. Thus, although there could be insufficient sampling, our key conclusions based on the relative comparison between the PMFs are more robust and less likely to suffer from insufficient sampling.

      (3) I also have concerns regarding the choice of collective variables. The authors have split the residues in each transmembrane helix into the cyto- and periplasmic sides. Then they have calculated the mass center distance between the cytoplasmic sides of certain pairs of helices and have also done the same for the periplasmic side. Given the shape of a helix, this does not seem to be an ideal choice since rather than the rotational motion of the helix, this captures more the translational motion of the helix. However, the transmembrane helices are more likely to undergo rotational motion than the translational one. 

      Our choice of CVs captures not only the translational motion but also the rotational motion of the helix. Consider a pair of helices. If there is a relative rotation in the angle between the two helices, causing the extracellular halves of the two helices to get closer and the intracellular halves to be more separated, this rotational motion can be captured as the decrease of one CV describing the extracellular distance and increase in the other CV describing the intracellular distance between the two helices. Conversely, if one of the two CVs is forced to increase and the other one forced to decrease, it can, in principle, bias the relative rotation of the two helices with respect to each other. Indeed, comparing Figure 3 with Figure S4, the reorientation of the helices with respect to the membrane normal (Fig. S4) is accompanied by the simultaneous decrease and increase in the pairwise distances between different segments of the helices. Therefore, our choice of CVs in the string method and REUS is not biased against the rotation of the helices, as the reviewer assumed.
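
      The geometric argument above can be illustrated with a toy calculation. The sketch below is not the authors' CV definition: it assumes two idealized helices drawn as straight lines of equally weighted points in a plane, and the helper names (`helix_points`, `com`, `half_distances`) are hypothetical. Tilting one helix about its midpoint moves the two half-helix centre-of-mass distances in opposite directions, which is the signature of rotation described in the response.

```python
import math

def helix_points(center_x, tilt_deg, n=20, length=30.0):
    """Points along an idealized straight helix axis in the x-z plane.

    The axis passes through (center_x, 0) and is tilted by tilt_deg from
    the membrane normal (the z axis). Equal masses are assumed.
    """
    t = math.radians(tilt_deg)
    pts = []
    for i in range(n):
        s = -length / 2 + length * i / (n - 1)  # position along the axis
        pts.append((center_x + s * math.sin(t), s * math.cos(t)))
    return pts

def com(points):
    """Centre of mass of equally weighted 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def half_distances(helix_a, helix_b):
    """Return (lower-half COM distance, upper-half COM distance)."""
    def split(helix):
        lower = [p for p in helix if p[1] < 0]
        upper = [p for p in helix if p[1] >= 0]
        return lower, upper
    a_lo, a_up = split(helix_a)
    b_lo, b_up = split(helix_b)
    return math.dist(com(a_lo), com(b_lo)), math.dist(com(a_up), com(b_up))

# Reference helix at x = 0; partner helix at x = 10, first parallel, then tilted.
ref = helix_points(0.0, 0.0)
straight = helix_points(10.0, 0.0)
tilted = helix_points(10.0, 15.0)
d_lo0, d_up0 = half_distances(ref, straight)
d_lo1, d_up1 = half_distances(ref, tilted)
print("parallel:", d_lo0, d_up0)
print("tilted:  ", d_lo1, d_up1)
```

      Which half is labelled cytoplasmic or periplasmic is arbitrary in this toy geometry; the point is only that a pair of half-helix distances responds to a tilt with opposite signs, whereas a pure translation changes both in the same direction.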

      (4) Convergence: String method convergence data does not show strong evidence for convergence (Figure S2) in my opinion. REUS convergence is also not discussed. No information is provided on the exchange rate or overlap between the windows.

      The convergence of the string method and REUS, as well as the exchange rate and overlap between windows, will be discussed in the revised manuscript.
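
      As an illustration of what an overlap check between neighboring windows might look like, the sketch below computes a histogram overlap coefficient along a single coordinate. It is a minimal example under stated assumptions, not the authors' analysis: the function name `histogram_overlap` and the binning choices are hypothetical, and overlap along the biased coordinate says nothing about overlap in the orthogonal degrees of freedom raised in comment (2).

```python
def histogram_overlap(samples_a, samples_b, lo, hi, bins=50):
    """Overlap coefficient (0 to 1) of two 1D sample sets on [lo, hi).

    Each sample set is binned and normalized to unit total; the overlap is
    the sum over bins of the smaller of the two normalized counts.
    """
    width = (hi - lo) / bins

    def normalized_hist(samples):
        counts = [0] * bins
        kept = 0
        for x in samples:
            if lo <= x < hi:
                # min() guards against rounding at the upper edge
                counts[min(int((x - lo) / width), bins - 1)] += 1
                kept += 1
        return [c / kept for c in counts] if kept else [0.0] * bins

    pa = normalized_hist(samples_a)
    pb = normalized_hist(samples_b)
    return sum(min(a, b) for a, b in zip(pa, pb))

# Toy data: identical windows overlap fully, well-separated ones not at all.
window_1 = [0.10, 0.20, 0.30]
window_2 = [0.85, 0.90, 0.95]
print(histogram_overlap(window_1, window_1, 0.0, 1.0))
print(histogram_overlap(window_1, window_2, 0.0, 1.0))
```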

      Reviewer 3:

      The paper from Liang and Guan details the calculation of the potential of mean force for the transition between two key states of the melibiose (Mel) transporter MelB. The authors used the string method along with replica-exchange umbrella sampling to model the transition between the outward- and inward-facing Mel-free states, including the binding and subsequent release of Mel. They find a barrier of ~6.8 kcal/mol and an overall free-energy difference of ~6.4 kcal/mol. They also investigate the same process without the co-transported Na+, finding a higher barrier, while in the D59C mutant, the barrier is nearly eliminated.

      For the Na+-bound state, the rate-limiting barrier is 8.4 kcal/mol instead of 6.8 kcal/mol. The overall free energy difference is 3.7 kcal/mol instead of 6.4 kcal/mol. These numbers need to be corrected in the public review.

      I found this to be an interesting and technically competent paper. I was disappointed actually to see that the authors didn't try to complete the cycle. I realize this is beyond the scope of the study as presented.

      We agree with the reviewer that characterizing the complete cycle is our eventual goal. However, in order to characterize the complete cycle of the transporter, the free energy landscapes of the Na+ binding and unbinding process in the sugar-bound and unbound states, as well as of the OF to IF conformational transition in the apo state, would also need to be calculated. These additional calculations are expensive, and the amount of work devoted to these new calculations is estimated to be at least the same as the current study. Therefore, we prefer to carry out and analyze these new simulations in a future study.

      The results are in qualitative agreement with expectations from experiments. Could the authors try to make this comparison more quantitative? For example, by determining the diffusivity along the path, the authors could estimate transition rates.

      In our revised manuscript, we will determine the diffusivity along the path and estimate transition rates.
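
      One common route from a free energy profile and a diffusivity to a rate is the mean first passage time (MFPT) of one-dimensional diffusion on the profile. The sketch below is illustrative only and is not necessarily the method the authors will use: it assumes a constant diffusivity, a reflecting boundary at the start of the path and an absorbing boundary at the end, and the function name `mean_first_passage_time` and the toy barrier are hypothetical.

```python
import math

KT_300K = 0.0019872 * 300.0  # kcal/mol at 300 K

def mean_first_passage_time(x, g, diffusivity, kT=KT_300K):
    """MFPT from x[0] (reflecting) to x[-1] (absorbing) on a 1D profile.

    x: increasing grid of path coordinates; g: free energy in kcal/mol on
    that grid; diffusivity: constant D in (units of x)^2 per unit time.
    Evaluates (1/D) * int dx exp(G(x)/kT) * int_{x0}^{x} dy exp(-G(y)/kT)
    with the trapezoid rule.
    """
    inner = 0.0           # running integral of exp(-G/kT) from x[0]
    outer = 0.0           # running integral of exp(+G/kT) * inner
    prev_integrand = 0.0  # integrand is zero at x[0] because inner is zero
    for i in range(1, len(x)):
        dx = x[i] - x[i - 1]
        inner += 0.5 * dx * (math.exp(-g[i - 1] / kT) + math.exp(-g[i] / kT))
        integrand = math.exp(g[i] / kT) * inner
        outer += 0.5 * dx * (prev_integrand + integrand)
        prev_integrand = integrand
    return outer / diffusivity

# Toy profiles on a unit-length path: flat, and with a 5 kcal/mol barrier.
grid = [i * 0.01 for i in range(101)]
flat = [0.0] * len(grid)
bump = [5.0 * math.sin(math.pi * xi) ** 2 for xi in grid]
tau_flat = mean_first_passage_time(grid, flat, diffusivity=2.0)
tau_bump = mean_first_passage_time(grid, bump, diffusivity=2.0)
print(tau_flat, tau_bump)
```

      For the flat profile the result reduces to L^2/(2D), which gives a quick sanity check; a rate estimate would then be the inverse of the MFPT computed on the actual PMF.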

      Relatedly, could the authors comment on how typical concentration gradients of Mel and Na+ would affect these numbers?

      The concentration gradient of Mel and Na+ can be varied in different experimental setups. In a typical active transport assay, Na+ has a higher concentration outside the cell, and melibiose has a higher concentration inside the cell. In the steady state, depending on the experimental setup, the extracellular Na+ concentration is in the range of 10-20 mM, and the intracellular concentration is self-balanced in the range of 3-4 mM due to the presence of other ion channels and pumps. In addition to the Na+ concentration gradient, there is also a transmembrane voltage of -200 mV (the intracellular side being more negative than the extracellular side), which facilitates Na+ release into the intracellular side. In the steady state, the extracellular concentration of melibiose is ~0.4 mM, and the intracellular concentration is at least 1000 times the extracellular concentration, i.e., greater than 0.4 M. In this scenario, the free energy change of intracellular melibiose translocation will be increased by ~5 kcal/mol at 300 K, leading to a total ∆𝐺 of ~8 kcal/mol. The total barrier for the melibiose translocation is expected to be increased by less than 5 kcal/mol. However, the increase in ∆𝐺 for intracellular melibiose translocation will be compensated by a decrease in ∆𝐺 of similar magnitude (~5 kcal/mol) for intracellular Na+ translocation. In a typical sugar self-exchange assay, there is no net gradient in melibiose or Na+ across the membrane, and the overall free energy changes we calculated apply to this situation.
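
      The magnitudes quoted above can be checked with the standard expression for moving a solute across a concentration gradient and a membrane potential, ∆G = RT ln(c_to/c_from) + zFV. The sketch below is a back-of-the-envelope check, not the authors' calculation; the function name `gradient_free_energy` is hypothetical, and the Na+ concentrations of 15 mM and 3.5 mM are assumed midpoints of the ranges given in the response.

```python
import math

R_KCAL = 0.0019872      # gas constant, kcal/(mol K)
FARADAY_KCAL = 23.061   # Faraday constant, kcal/(mol V)

def gradient_free_energy(c_from, c_to, temperature=300.0, charge=0, voltage=0.0):
    """Free energy (kcal/mol) to move a solute from c_from to c_to.

    Concentrations must share the same units. voltage is the electrical
    potential of the destination side minus that of the origin side, in
    volts; charge is the valence of the solute.
    """
    chemical = R_KCAL * temperature * math.log(c_to / c_from)
    electrical = charge * FARADAY_KCAL * voltage
    return chemical + electrical

# Melibiose moved inward against a 1000-fold gradient (0.4 mM to 0.4 M).
mel = gradient_free_energy(0.4e-3, 0.4)
# Na+ moved inward, down its gradient and down a -200 mV membrane potential.
na = gradient_free_energy(15e-3, 3.5e-3, charge=1, voltage=-0.2)
print(round(mel, 2), round(na, 2))
```

      With these assumed inputs, an exactly 1000-fold melibiose gradient costs about 4.1 kcal/mol and inward Na+ movement releases about 5.5 kcal/mol, of which roughly 4.6 kcal/mol comes from the membrane potential. Both are of the same order as the ~5 kcal/mol figures in the response, which refer to a gradient of at least 1000-fold.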

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In their manuscript "PDGFRa signaling regulates Srsf3 transcript binding to affect PI3K signaling and endosomal trafficking" Forman and colleagues use iMEPM cells to characterize the effects of PDGF signaling on alternative splicing. They first perform RNA-seq using a one-hour stimulation with Pdgf-AA in control and Srsf3 knockdown cells. While Srsf3 manipulation results in a sizeable number of DE genes, PDGF does not. They then turn to examine alternative splicing, due to findings from this lab. They find that both PDGF and Srsf3 contribute much more to splicing than transcription. They find that the vast majority of PDGF-mediated alternative splicing depends upon Srsf3 activity and that skipped exons are the most common events with PDGF stimulation typically promoting exon skipping in the presence of Srsf3. They used eCLIP to identify RNA regions bound to Srsf3. Under both PDGF conditions, the majority of peaks were in exons with +PDGF having a substantially greater number of these peaks. Interestingly, they find differential enrichment of sequence motifs and GC content in stimulated versus unstimulated cells. They examine 2 transcripts encoding PI3K pathway (enriched in their GO analysis) members: Becn1 and Wdr81. They then go on to examine PDGFRa and Rab5, an endosomal marker, colocalization. They propose a model in which Srsf3 functions downstream of PDGFRa signaling to, in part, regulate PDGFRa trafficking to the endosome. The findings are novel and shed light on the mechanisms of PDGF signaling and will be broadly of interest. This lab previously identified the importance of PDGF signaling on alternative splicing. The combination of RNA-seq and eCLIP is an exceptional way to comprehensively analyze this effect. The results will be of great utility to those studying PDGF signaling or neural crest biology. There are some concerns that should be considered, however.

      We thank the Reviewer for these supportive comments.

      (1) It took some time to make sense of the number of DE genes across the results section and Figure 1. The authors give the total number of DE genes across Srsf3 control and loss conditions as 1,629 with 1,042 of them overlapping across Pdgf treatment. If the authors would add verbiage to the point that this leaves 1,108 unique genes in the dataset, then the numbers in Figure 1D would instantly make sense. The same applies to PDGF in Figure 1F and the Venn diagrams in Figure 2. 

      We have edited the relevant sentence for Figure 1D as follows: “There was extensive overlap (521 out of 1,108; 47.0%) of Srsf3-dependent DE genes across ligand treatment conditions, resulting in a total of 1,108 unique genes within both datasets (Fig. 1C,D; Fig. S1A).” Similarly, we edited the relevant sentence for Figure 1F as follows: “There was limited overlap (4 out of 47; 8.51%) of PDGF-AA-dependent DE genes across Srsf3 conditions, resulting in a total of 47 unique genes within both datasets (Fig. 1E,F; Fig. S1B).” We edited the relevant sentence for Figure 2B as follows: “There was limited overlap (203 out of 1,705; 11.9%) of Srsf3-dependent alternatively-spliced transcripts across ligand treatment conditions, resulting in a total of 1,705 unique events within both datasets (Fig. 2A,B).” Finally, we edited the relevant sentence for Figure 2D as follows: “There was negligible overlap (9 out of 622; 1.45%) of PDGF-AA-dependent alternatively-spliced transcripts across Srsf3 conditions, resulting in a total of 622 unique events within both datasets (Fig. 2C,D).”
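
      The relationship between the totals in the Reviewer's comment and the unique counts in the edited sentences is a simple inclusion-exclusion step. The sketch below reproduces it for Figure 1D, taking the 1,629 total from the Reviewer's comment and the 521 shared genes from the edited sentence; the function name `overlap_summary` is hypothetical.

```python
def overlap_summary(listed_total, shared):
    """Unique count and percent overlap for two lists sharing `shared` items.

    listed_total is the sum of the two list lengths, so shared items are
    counted twice in it and must be subtracted once.
    """
    unique = listed_total - shared
    return unique, 100.0 * shared / unique

unique_genes, percent_shared = overlap_summary(1629, 521)
print(unique_genes, round(percent_shared, 1))
```

      The 1,042 "overlapping" genes in the Reviewer's comment correspond to the 521 shared genes counted once in each of the two lists.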

      (2) The percentage of skipped exons in the +DPSI on the righthand side of Figure 2F is not readable.  

      We have moved the label for the percentage of skipped exon events with a +DPSI for the -PDGF-AA vs +PDGF-AA (scramble) alternatively-spliced transcripts in Figure 2E so that it is legible.

      (3) It would be useful to have more information regarding the motif enrichment in Figure 3. What is the extent of enrichment? The authors should also provide a more complete list of enriched motifs, perhaps as a supplement. 

      We have added P values beneath the motifs in Figure 3F and 3G. Further, we have added a new Supplementary Figure, Figure S5, that lists the occurrence of the top 10 most enriched motifs in the unstimulated and, separately, stimulated samples in the eCLIP dataset and in a control dataset, as well as their P values.

      (4) It is unclear what subset of transcripts represent the "overlapping datasets" on lines 280-315. The authors state that there are 149 unique overlapping transcripts, but the Venn diagram shows 270. Also, it seems that the most interesting transcripts are the 233 that show alternative splicing and are bound by Srsf3. Would the results shown in Figure 5 change if the authors focused on these transcripts? 

      The Reviewer is correct that 233 of the alternatively-spliced transcripts had an Srsf3 eCLIP peak, as indicated in Figure 5A. However, several of these eCLIP peaks were a large distance from an alternatively-spliced element in the rMATS datasets, indicating that Srsf3 binding may not be contributing to the splicing outcomes in these cases. Instead, we correlated the eCLIP peaks with AS events by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns. We have added additional text clarifying this point in the Results: “We next sought to identify high-confidence transcripts for which Srsf3 binding had an increased likelihood of contributing to AS. Previous studies revealed enrichment of functional RBP motifs near alternatively-spliced exons (Yee et al., 2019). As such, we correlated the eCLIP peaks with AS events across all four treatment comparisons by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns (Tables S12-S15).” Further, we have relabeled Figure 5B as “High-confidence, overlapping datasets biological process GO terms”.
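
      The filtering criterion described above amounts to an interval-overlap test. The sketch below shows one way such a test could be written; it is an assumption-laden illustration rather than the authors' pipeline. The function name `peak_near_exon`, the half-open coordinate convention, and the reading of "within 250 bp" as any overlap with the exon extended by 250 bp on each side are all hypothetical choices.

```python
def peak_near_exon(peak, exon, flank=250):
    """True if a peak overlaps the exon or `flank` bp of intron on either side.

    peak and exon are (start, end) half-open intervals on the same
    chromosome and strand; checking those is left to the caller.
    """
    window_start = exon[0] - flank
    window_end = exon[1] + flank
    return peak[0] < window_end and peak[1] > window_start

exon = (1000, 1200)
print(peak_near_exon((1050, 1100), exon))  # peak inside the exon
print(peak_near_exon((1300, 1350), exon))  # peak in the downstream flank
print(peak_near_exon((1500, 1550), exon))  # peak beyond the flank
```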

      (5) In general, there is little validation of the sequencing results, performing qPCR on Arhgap12 and Cep55. The authors should additionally validate the PI3K pathway members that they analyze. Related, is Becn1 expression downregulated in the absence of Srsf3, as would be predicted if it is undergoing NMD? 

      We have added two new figure panels, Figure 5F-5G, assessing Wdr81 AS and Wdr81 protein sizes, as this gene has previously been implicated in craniofacial development. We have added the following text to the Results section: “Finally, as Wdr81 protein levels are predicted to regulate RTK trafficking between early and late endosomes, we confirmed the differential AS of Wdr81 transcripts between unstimulated scramble cells and scramble cells treated with PDGF-AA ligand for 1 hour by qPCR using primers within constitutively-expressed exons flanking alternatively-spliced exon 9. This analysis revealed a decreased PSI for Wdr81 in each of three biological replicates upon PDGF-AA ligand treatment (Fig. 5F). Relatedly, we assessed the ratio of larger isoforms of Wdr81 protein (containing the WD3 domain) to smaller isoforms (missing the WD3 domain) via western blotting. Consistent with our RNA-seq and qPCR results, PDGF-AA stimulation for 24 hours in the presence of Srsf3 led to an increase in smaller Wdr81 protein isoforms (Fig. 5G).”

      (6) What is the alternative splicing event for Acap3?  

      We have added the following text to the Results section and updated Figure 5E with Acap3 eCLIP peak visualization and the predicted alternative splicing outcome: “Finally, Acap3 is a GTPase-activating protein (GAP) for the small GTPase Arf6, converting Arf6 to an inactive, GDP-bound state (Miura et al., 2016). Arf6 localizes to the plasma membrane and endosomes, and has been shown to regulate endocytic membrane trafficking by increasing PI(4,5)P2 levels at the cell periphery (D’Souza-Schorey and Chavrier, 2006). Further, constitutive activation of Arf6 leads to upregulation of the gene encoding the p85 regulatory subunit of PI3K and increased activity of both PI3K and AKT (Yoo et al., 2019)… Srsf3 binding was additionally increased in Acap3 exon 19 upon PDGF-AA stimulation, at an enriched motif within the high-confidence, overlapping datasets, and we observed a corresponding increase in excision of adjacent intron 19 (Fig. 5D,E). As Acap3 intron 19 contains a PTC, this event is predicted to result in more transcripts encoding full-length protein (Fig. 5E).”

      (7) The insets in Figure 6 C"-H" are useful but difficult to see due to their small size. Perhaps these could be made as their own figure panels. 

      We have increased the size of the previous insets in new Figure 6 panels C’’’-H’’’.

      (8) In Figure 6A, it is not clear which groups have statistically significant differences. A clearer visualization system should be used. 

      We have added bracket shapes to Figure 6A indicating the statistically significant differences between scramble 0 minutes and scramble 60 minutes, and between scramble 60 minutes and shSrsf3 60 minutes.

      (9) Similarly in Figure 6B, is 15 vs 60 minutes in the shSrsf3 group the only significant difference? Is there a difference between scramble and shSrsf3 at 15 minutes? Is there a difference between 0 and 15 minutes for either group? 

      We have added a bracket shape to Figure 6B indicating the statistically significant difference between shSrsf3 at 15 minutes and shSrsf3 at 60 minutes. No other pairwise comparisons between treatments or timepoints were statistically significantly different.

      Reviewer #2 (Public Review): 

      Summary: 

      This manuscript builds upon the work of a previous study published by the group (Dennison, 2021) to further elucidate the coregulatory axis of Srsf3 and PDGFRa on craniofacial development. The authors in this study investigated the molecular mechanisms by which PDGFRa signaling activates the RNA-binding protein Srsf3 to regulate alternative splicing (AS) and gene expression (GE) necessary for craniofacial development. PDGFRa signaling-mediated Srsf3 phosphorylation drives its translocation into the nucleus and affects binding affinity to different proteins and RNA, but the exact molecular mechanisms were not known. The authors performed RNA sequencing on immortalized mouse embryonic mesenchyme (MEPM) cells treated with shRNA targeting 3' UTR of Srsf3 or scramble shRNA (to probe AS and DE events that are Srsf3 dependent) and with and without PDGF-AA ligand treatment (to probe AS and DE events that are PDGFRa signaling dependent). They found that PDGFRa signaling has more effect on AS than on DE. A matching eCLIP-seq experiment was performed to investigate how Srsf3 binding sites change with and without PDGFRa signaling. 

      Strengths: 

      (1) The work builds well upon the previous data and the authors employ a variety of appropriate techniques to answer their research questions. 

      (2) The authors show that Srsf3 binding pattern within the transcript as well as binding motifs change significantly upon PDGFRa signaling, providing a mechanistic explanation for the significant changes in AS. 

      (3) By combining RNA-seq and eCLIP datasets together, the authors identified a list of genes that are directly bound by Srsf3 and undergo changes in GE and/or AS. Two examples are Becn1 and Wdr81, which are involved in early endosomal trafficking.

      We thank the Reviewer for these supportive comments.

      Weaknesses: 

      (1) The authors identify two genes whose AS are directly regulated by Srsf3 and involved in endosomal trafficking; however, they do not validate the differential AS results and whether changes in these genes can affect endosomal trafficking. In Figure 6, they show that PDGFRa signaling is involved in endosome size and Rab5 colocalization, but do not show how Srsf3 and the two genes are involved. 

      We have added two new figure panels, Figure 5F-5G, assessing Wdr81 AS and Wdr81 protein sizes, as this gene has previously been implicated in craniofacial development. We have added the following text to the Results section: “Finally, as Wdr81 protein levels are predicted to regulate RTK trafficking between early and late endosomes, we confirmed the differential AS of Wdr81 transcripts between unstimulated scramble cells and scramble cells treated with PDGF-AA ligand for 1 hour by qPCR using primers within constitutively-expressed exons flanking alternatively-spliced exon 9. This analysis revealed a decreased PSI for Wdr81 in each of three biological replicates upon PDGF-AA ligand treatment (Fig. 5F). Relatedly, we assessed the ratio of larger isoforms of Wdr81 protein (containing the WD3 domain) to smaller isoforms (missing the WD3 domain) via western blotting. Consistent with our RNA-seq and qPCR results, PDGF-AA stimulation for 24 hours in the presence of Srsf3 led to an increase in smaller Wdr81 protein isoforms (Fig. 5G).” The experiments in Figure 6 compare early endosome size, PDGFRa localization in early endosomes and phospho-Akt levels in response to PDGF-AA stimulation in scramble versus shSrsf3 cells, demonstrating that Srsf3-mediated PDGFRa signaling leads to enlarged early endosomes, retention of PDGFRa in early endosomes and increased downstream phospho-Akt signaling. Though we agree with the Reviewer that functionally linking the AS events to the endosomal phenotype would strengthen our conclusions, these are technically challenging experiments for several reasons. First, this approach has typically relied on tiling oligos against a region of interest to find the optimal sequence. We identified several transcripts that are bound by Srsf3 and undergo alternative splicing upon PDGFRa signaling to potentially contribute to the regulation of PI3K signaling and early endosomal trafficking. We do not expect that these effects are mediated by a single transcript; they may instead be mediated by a combination of alternative splicing changes. As such, these experiments would require us to identify and validate multiple splice-switching antisense oligonucleotides (ASOs). Second, ASOs designed against a specific target may not lead to alternative splicing of that target, even in cases of high predicted binding affinities (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). Third, ASOs have been shown to result in off-target mis-splicing effects, which are hard to predict (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). The design of functional ASOs is thus a long-standing challenge in the field, and likely beyond the scope of this manuscript. We have added the following text to the Discussion to highlight this potential future direction: “In the future, it will be worthwhile to attempt to functionally link the AS of transcripts such as Becn1, Wdr81 and/or Acap3 to the endosomal trafficking changes observed above using splice-switching antisense oligonucleotides (ASOs).”

      (2) The proposed model does not account for other proteins mediating the activation of Srsf3 after Akt phosphorylation. How do we know this is a direct effect (and not a secondary or tertiary effect)? 

      This point is introduced in the Discussion: “Whether phosphorylation of Srsf3 directly influences its binding to target RNAs or acts to modulate Srsf3 protein-protein interactions which then contribute to differential RNA binding remains to be determined, though findings from Schmok et al., 2024 may argue for the latter mechanism. Studies identifying proteins that differentially interact with Srsf3 in response to PDGF-AA ligand stimulation are ongoing and will shed light on these mechanisms…. Again, this shift could be due to loss of RNA binding owing to electrostatic repulsion and/or changes in ribonucleoprotein composition and will be the subject of future studies.” We have added a potential change in Srsf3 protein-protein interactions upon Akt phosphorylation in the model in Figure 6J.

      Reviewer #2 (Recommendations For The Authors): 

      Suggestions: 

      (1) It would strengthen the paper and improve the connection with the other sections of the paper if the authors show: 

      a)  validation of PDGFRa signaling leading to AS of Becn1 and Wdr81 and corresponding changes in protein, and  

      We have added two new figure panels, Figure 5F-5G, assessing Wdr81 AS and Wdr81 protein sizes, as this gene has previously been implicated in craniofacial development. We have added the following text to the Results section: “Finally, as Wdr81 protein levels are predicted to regulate RTK trafficking between early and late endosomes, we confirmed the differential AS of Wdr81 transcripts between unstimulated scramble cells and scramble cells treated with PDGF-AA ligand for 1 hour by qPCR using primers within constitutively-expressed exons flanking alternatively-spliced exon 9. This analysis revealed a decreased PSI for Wdr81 in each of three biological replicates upon PDGF-AA ligand treatment (Fig. 5F). Relatedly, we assessed the ratio of larger isoforms of Wdr81 protein (containing the WD3 domain) to smaller isoforms (missing the WD3 domain) via western blotting. Consistent with our RNA-seq and qPCR results, PDGF-AA stimulation for 24 hours in the presence of Srsf3 led to an increase in smaller Wdr81 protein isoforms (Fig. 5G).”

      b)  functionally link the AS event(s) to endosomal phenotype using ASOs, etc. 

      Though we agree with the Reviewer that such results would strengthen our conclusions, these are technically challenging experiments for several reasons. First, this approach has typically relied on tiling oligos against a region of interest to find the optimal sequence. We identified several transcripts that are bound by Srsf3 and undergo alternative splicing upon PDGFRa signaling to potentially contribute to the regulation of PI3K signaling and early endosomal trafficking. We do not expect that these effects are mediated by a single transcript; they may instead be mediated by a combination of alternative splicing changes. As such, these experiments would require us to identify and validate multiple splice-switching antisense oligonucleotides (ASOs). Second, ASOs designed against a specific target may not lead to alternative splicing of that target, even in cases of high predicted binding affinities (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). Third, ASOs have been shown to result in off-target mis-splicing effects, which are hard to predict (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). The design of functional ASOs is thus a long-standing challenge in the field, and likely beyond the scope of this manuscript. We have added the following text to the Discussion to highlight this potential future direction: “In the future, it will be worthwhile to attempt to functionally link the AS of transcripts such as Becn1, Wdr81 and/or Acap3 to the endosomal trafficking changes observed above using splice-switching antisense oligonucleotides (ASOs).”

      (2) The Venn diagram in Figure 5A and the description of the analysis the authors did to combine the RNA-seq and eCLIP-seq data are a little confusing. The authors say that they correlated eCLIP peaks with GE or AS events across all four treatment comparisons. The purpose of looking at both datasets was to find genes that are directly bound by Srsf3 and also have significantly affected GE and/or AS. Therefore, the data with and without PDGF-AA should be considered separately. For example, eCLIP peaks in the -PDGF-AA condition can be correlated to Srsf3-dependent AS differences (comparing shSrsf3 and scramble) in the -PDGF-AA condition, and eCLIP peaks in the +PDGF-AA condition can be correlated to Srsf3-dependent AS differences in the +PDGF-AA condition. In the Venn diagram and the description, it seems like all comparisons were combined and it is not clear how the data were analyzed.

      As indicated in Figure 5A, 233 of the alternatively-spliced transcripts uniquely found in one of the four treatment comparisons had an Srsf3 eCLIP peak. However, several of these eCLIP peaks were a large distance from an alternatively-spliced element in the rMATS datasets, indicating that Srsf3 binding may not be contributing to the splicing outcomes in these cases. Instead, we correlated the eCLIP peaks with AS events by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns. We have added additional text clarifying this point in the Results: “We next sought to identify high-confidence transcripts for which Srsf3 binding had an increased likelihood of contributing to AS. Previous studies revealed enrichment of functional RBP motifs near alternatively-spliced exons (Yee et al., 2019). As such, we correlated the eCLIP peaks with AS events across all four treatment comparisons by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns (Tables S12-S15).” Further, we have relabeled Figure 5B as “High-confidence, overlapping datasets biological process GO terms”. We respectfully disagree with the Reviewer’s suggested comparisons. A comparison of the -PDGF-AA eCLIP data with the scramble vs shSrsf3 (-PDGF-AA) data from the list of high-confidence transcripts resulted in only 7 transcripts. Similarly, a comparison of the +PDGF-AA eCLIP data with the scramble vs shSrsf3 (+PDGF-AA) data from the list of high-confidence transcripts resulted in only 14 transcripts. Separate gene ontology analyses of these lists of 7 and 14 transcripts revealed 21 and 40 significant terms for biological process, respectively, the majority of which encompassed one, and never more than two, transcripts. Had we separately examined the -PDGF-AA and +PDGF-AA data, we would not have detected the changes in Becn1, Wdr81 and Acap3 in Figure 5E.

    1. Author response:

      We appreciate the reviewer’s recognition of the strengths of our work as well as their constructive critiques and insightful suggestions for improvement. In this provisional response, we outline how we plan to address the reviewer’s comments in the revised manuscript. 

      (1) Viscosity and surface tension are not accurately measured. 

      We thank the reviewers for bringing up this important point. We are aware that FRAP is not the best method to accurately measure condensate viscoelasticity due to the problems the reviewers and others in the field have pointed out. More accurate methods of measuring fluorescent protein mobility, such as single-molecule tracking or fluorescence correlation spectroscopy, can be used; however, they cannot accurately reflect the time scale dependence of viscoelasticity in the condensate either. Other methods such as rheology and micropipette aspiration that have been used to measure condensate viscoelasticity in vitro are not accessible in living cells yet. Similarly, there is no readily available method to directly measure the surface tension of condensates in live cells. Therefore, we used FRAP and fusion assays to estimate the ratio of surface tension between the two condensates. This ratio was then used to determine the surface tension of the coiled coil condensates in the model after estimating the surface tension for disordered condensate from in vitro measurements (https://doi.org/10.1016/j.bpr.2021.100011). In the revision, we will adjust our FRAP fitting and use condensates with similar sizes to make our FRAP data more accurate. However, based on the large difference we observed for these two condensates, we do not believe these FRAP improvements would change the conclusions. 

      We are also aware that the Stokes-Einstein relation strictly applies to purely viscous systems. One can apply the generalized Stokes-Einstein relation, which links the diffusion coefficient to the complex viscoelastic modulus of the medium. However, the complex modulus is difficult to determine in cells through live imaging. We thus used the Stokes-Einstein relation to estimate the ratio of effective viscosities, assuming elastic deformations relax faster. In the revision, we will add these assumptions to our discussion.
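
      The ratio estimate described above can be written out explicitly. This is a sketch of the standard Stokes-Einstein relation, assuming a purely viscous medium and the same probe of hydrodynamic radius r at the same temperature in both condensates; the subscripts 1 and 2 are illustrative labels for the two condensate types, not notation taken from the manuscript.

```latex
D = \frac{k_B T}{6 \pi \eta r}
\quad\Longrightarrow\quad
\frac{\eta_1}{\eta_2} = \frac{D_2}{D_1}
\qquad (\text{same } T \text{ and } r).
```

      Only the ratio of diffusion coefficients enters, so any multiplicative factor common to both measurements cancels in the estimated viscosity ratio.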

      (2) Justification of a Neo-Hookean elasticity model for chromatin. 

      We thank the reviewer for highlighting this important aspect of our work. The observation that the strains R/ξ in our initial model are of the order of 100 is valid and raises questions about the applicability of the Neo-Hookean model. While it is true that at such high strains, the pressure becomes nearly constant (5E/6), our model remains applicable within the range of strains relevant to chromatin, particularly for small droplets where R/ξ values are more moderate. This is explicitly considered in the section “Effect of mechanical heterogeneity on condensate nucleation and growth,” where we also account for heterogeneous mesh sizes correlated with local stiffness. While these points are discussed in the supplementary material, we acknowledge that these details are not clearly presented in the main text, and we will revise the manuscript to explicitly discuss the strain regime and model applicability.
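
      The saturation of the pressure at large strain can be made concrete with the standard expression for a spherical cavity expanding in an incompressible neo-Hookean solid, P = E(5/6 - 2/(3λ) - 1/(6λ^4)), where λ is the stretch. The sketch below assumes this textbook form and takes λ = R/ξ; whether the authors' model uses exactly this expression is not stated in this response, and the function name `neo_hookean_pressure` is hypothetical.

```python
def neo_hookean_pressure(stretch, youngs_modulus=1.0):
    """Pressure on a spherical cavity in an incompressible neo-Hookean solid.

    stretch is the ratio of the current to the initial cavity radius
    (taken here as R / xi); the result has the units of youngs_modulus.
    """
    lam = stretch
    return youngs_modulus * (5.0 / 6.0 - 2.0 / (3.0 * lam) - 1.0 / (6.0 * lam ** 4))

# Pressure in units of E for an undeformed, a moderately stretched,
# and a strongly stretched cavity.
for lam in (1.0, 2.0, 10.0, 100.0):
    print(lam, neo_hookean_pressure(lam))
```

      In this form the pressure vanishes at λ = 1 and is within about 1% of the plateau 5E/6 at λ = 100, so size-dependent effects are confined to the moderate-stretch regime discussed above.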

      We agree that varying both the stiffness E and mesh size ξ would provide a more comprehensive understanding of the system, as both parameters are likely affected by experimental perturbations. We will revisit our analysis to incorporate variations in ξ alongside E and discuss the potential effects on our results.

      Furthermore, the stabilization of condensate size by chromatin elasticity arises from the size-dependent pressure exerted by the elastic network, which is a feature of strain-stiffening elastic media rather than a specific property of the Neo-Hookean model. However, we agree that exploring the robustness of our results under alternative elasticity models would strengthen the manuscript. In the revised version, we will analyze additional elasticity models, including strain stiffening and thinning, to evaluate how these might influence our conclusions and to provide a broader context for the predicted growth phases.

      The connection between the nucleation barrier and the cavitation barrier is particularly intriguing. The referenced study (https://doi.org/10.1073/pnas.2102014118) highlights non-linear elastic effects, including breakage and cavitation, which may be relevant in our system. We will explore whether cavitation effects due to elastic confinement play a role in the nucleation dynamics observed here and include a discussion of these mechanisms in the revised manuscript.

      (3) Unclear description of nucleation in the model. 

      We thank the reviewer for pointing out the lack of clarity in our description of nucleation. R_0 represents the critical radius for nucleation, beyond which droplets grow spontaneously. The nucleation probability p_nuc is evaluated at R_0, which depends on the free energy barrier ΔG, supersaturation S, and the elastic properties of the surrounding medium. We will include a clearer explanation of R_0, its dependence on parameters, and its role in nucleation in the revised manuscript.

      We ensure that the stiffness is sampled from a truncated normal distribution, preventing negative stiffness values. Sampling is performed at fixed intervals, and we will clarify the protocol to avoid bias and ensure consistency in the simulations.
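      One simple way to draw stiffness values from a normal distribution truncated at zero is rejection sampling, sketched below. This is an illustration only: the mean, standard deviation, and function name are hypothetical, and the manuscript's actual sampling routine may differ.

```python
import random


def sample_truncated_normal(mean, std, n, lower=0.0, rng=None):
    """Draw n values from N(mean, std) conditioned on value > lower.

    Rejection sampling: draw from the untruncated normal and discard
    any value at or below the lower bound.
    """
    rng = rng or random.Random()
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, std)
        if value > lower:
            samples.append(value)
    return samples


# Hypothetical stiffness parameters in arbitrary units, seeded for reproducibility.
stiffness = sample_truncated_normal(mean=1.0, std=1.0, n=1000, rng=random.Random(0))
print(f"min stiffness: {min(stiffness):.4f}, n = {len(stiffness)}")
```

      Rejection sampling is inefficient when most of the probability mass lies below the cutoff; in that case an inverse-CDF method is a better choice.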

      Supersaturation S will be defined in terms of solute and solvent concentrations, and we will discuss its influence on ΔG and R_0.

      The dependence of the elastic pressure P_E on R_0, with stiffer surroundings leading to smaller nucleated droplets, will be explicitly clarified. We also agree that Figure S4A may be misleading, as it suggests spatial correlations in stiffness. We will revise the figure and caption to better represent the model assumptions.

      (4) Limited data for the elastic ripening claim.

      We acknowledge the reviewer’s concern regarding the limitation of support for the claim in the current manuscript. We believe our data do indicate elastic ripening. Particularly, the data points very close to zero are not necessarily artifacts of the fitting, as the elastic ripening can be very slow due to small differences in the local stiffness values around the droplets. We have mentioned this at the end of the section “Condensate material properties and chromatin heterogeneity determine the modes of ripening”. We shall revisit these results and remedy this concern with more data and analysis in the revised manuscript. 

      (5) Confusion for dynamic regimes such as "fusion", "ripening", and "diffusion-based" and the problem with using “ripening time” to compare ripening speed.

      We will clarify our definitions of the dynamic regimes and ensure consistent use of language. The ripening time was defined as the time droplets take to shrink per unit length. Normalizing in this way decouples the size dependence of the absolute ripening time, so the measure can be used to compare the speed of ripening between two condensates. This is not well explained in our current version. In the revision, we will redefine the normalized ripening time to avoid this confusion.

      (6) Chromatin should be excluded from the condensates 

      We have data to support that chromatin is excluded from the condensates. We will add the data in the revision. 

      (7) Effect of protein production on the diffusive growth process.

      From the experiment, we do not believe that protein production is a significant source of the diffusive growth because, for coiled-coil condensates nucleated with Hotag3, there was little diffusive growth. In the model as well, condensates can grow for hours in the absence of protein production, depending on chromatin stiffness and surface tension. We aim to address the effect of protein production on growth in the revised manuscript.

    1. Author response:

      We thank the anonymous reviewers very much for dedicating their time to thoroughly review our manuscript. We sincerely appreciate their thoughtful consideration and detailed assessment. Regarding the concerns raised, we acknowledge the importance of exploring the full scope of class IIb microcins. However, we believe that in-depth characterization, purification, and in vivo application of the 12 novel compounds go beyond the scope of this short report and discovery article.

      At the same time, the reviewers acknowledge that the analysis, experimental design, the expression system as well as the performed assays are “sound”, “convincing”, and “corroborated by suitable controls”. In the present manuscript we sought to identify novel antimicrobials and to comprehensively verify their antimicrobial activity in E. coli irrespective of the siderophore-dependent delivery mechanism. Notably, none of the reviewers questioned that we describe new antimicrobials, the characteristics we used to find them, that they are class IIb microcins, or that they do exhibit antimicrobial activity against Gram-negative ESKAPE and plant pathogens.

      We believe that our discovery study can serve as a stepping stone towards the application of bacterially produced antimicrobial compounds to target Gram-negative pathogens in numerous plant and animal species, including humans.

    1. Author response:

      Our response to Reviewer #1:

      We appreciate the reviewer’s comments clarifying the strengths and weaknesses of our work. Whether the effect of GM-CSF/IL-3 on the bowel is pro-inflammatory or anti-inflammatory has been controversial. In the present study, we have shown that CD131 mediated a pro-inflammatory effect of GM-CSF on the intestine, which may have worked in synergy with tissue-infiltrating macrophages. Although its downstream signaling has been investigated extensively, we did not examine it here. Using macrophage-specific CD131-deficient animals is important to clarify the effects of macrophage-specific CD131 on bowel inflammation. Our present work is indeed incomplete, and we anticipate working on it further in future research. Concerning the results on human subjects, it is true that the results from animal experiments were not completely reproduced. We believe that CD131 does have an effect on ulcerative colitis; however, due to the use of biological agents (e.g., anti-TNFs), the need for surgery in the treatment of ulcerative colitis has dramatically decreased, and we could not obtain enough samples for a more convincing statistical analysis. The twenty-nine patients shown in the present study were all those who received surgical intervention at our center during the past decade, and more human subjects will be needed in future research, possibly from a multi-center study.

      Our response to Reviewer #2:

      We greatly appreciate the reviewer’s valuable comments and suggestions. We realized that the number of animals per group was not indicated in each figure; to clarify the experimental rigor, we have deposited the data used to generate the results of the present study in Dryad. Concerning the heterozygous CD131 knock-out animals, we understand that others have used homozygous mice in their studies; however, we observed premature deaths in those animals and could not obtain a single homozygous mouse. We could not determine the exact reason, but we did observe robust phenotypes in the heterozygous mice. We do realize that our present work is incomplete, and more experiments need to be done to establish a causal relationship between CD131 and downstream effects. We anticipate using macrophage-specific homozygous CD131-deficient mice in our future research, which we believe will produce more meaningful and convincing results.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the Authors:

      Reviewer #2:

      (1) In my previous review, I noted that using three different movies to conclude that different genres evoke different thought patterns is an overinterpretation with only one instance per genre. In the rebuttal letter, the authors state that they provide "evidence that is necessary but not sufficient to conclude that we can distinguish different genres of films" (page 15). Accordingly, I suggest refraining from statements such as "There was a significant main effect of movie genre on memory" (page 13) in the manuscript.

      Thank you for this point. We have removed any reference to genre.

      Page 18 (referring to page 13) [354-355] “First, there was a significant main effect of movie on memory, F(2, 254.12) = 49.33, p <.001, η2 = .28.”
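      As a quick consistency check on the statistic quoted above, the effect size can be approximately recovered from the F value and its degrees of freedom. This sketch assumes that the reported η2 is a partial eta squared and that the usual fixed-effects conversion is an acceptable approximation for a model with fractional denominator degrees of freedom; neither assumption is stated in the response.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the quoted result: F(2, 254.12) = 49.33.
eta_p2 = partial_eta_squared(49.33, 2, 254.12)
print(f"partial eta squared = {eta_p2:.2f}")
```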

      Reviewer #3:

      The revised manuscript is easier to read and better contextualized.

      Thank you for this comment and for your feedback, which allowed us to make the manuscript clearer.

      Public Reviews:

      Reviewer #1:

      The lack of direct interrogation of individual differences/reliability of the mDES scores warrants some pause.

      Our study's goal was to understand how group-level patterns of thought in one group of participants relate to brain activity in a different group of participants. To this end, we decomposed trial-level mDES data to show dimensions that are common across individuals, which demonstrated excellent split-half reliability. Then we used these data in two complementary ways. First, we established that these ratings reliably distinguished between the different films (showing that our approach is sensitive to manipulations of semantic and affective features in a film) and that these group-level patterns were also able to predict patterns of brain activity in a different group of participants (suggesting that mDES dimensions are also sensitive to the way brain activity emerges during movie watching). Second, we established that variation across individuals in their mDES scores predicted their comprehension of information from films. Thus, our study establishes that when applied to movie-watching, mDES is sensitive to individual differences in the movie-watching experience (as determined by an individual's comprehension). Given the success of this study and the relative ease with which mDES can be performed, it will be possible in the future to conduct mDES studies that home in on both the general features of the movie-watching experience and the aspects that are more unique to an individual.

      Reviewer #2:

      (1) The distinction between thinking and stimulus processing (in the sense of detecting and assigning meaning to features, modulated by factors such as attention) remains unclear. Is "thinking" a form of conscious access or a reportable read-out from sensory and higher-level stimulus processing? Or does it simply refer to the method used here to identify different processing states?

      Thank you for highlighting this first point, which is an important consideration when attempting to map cognitive states. We have added some additional comments to our discussion section to expand on this point.

      Page 35-36 [698-711] “It is possible, therefore, that the identification of regions of visual and auditory cortex by our study reflects the participants’ attention to sensory input, rather than the complex analysis of these inputs that may be required for certain features of the movie watching experience. On the other hand, it is possible that the movie-watching state is a qualitatively different type of mental state to those that emerge in typical task situations. For example, unlike tasks, the movie-watching state is characterized by multi-modal sensory input and semantically rich themes that evolve together to reveal a continuous narrative to the viewer. It is possible, therefore, that movies engender an absorbed state which depends more on processing in sensory cortex than would occur in traditional task paradigms such as a working memory task (when systems in association cortex may be needed to maintain information related to task rules). Important headway into addressing this uncertainty can be achieved by using mDES to compare the types of states that occur in different contexts (including both movies and tasks) and comparing the topography of brain activity associated with different experiential states.”

      (2) The dimensions of thought appear to be directly linked to brain areas traditionally associated with core faculties of perception and cognition. For example, superior temporal cortex codes for speech information, which is also where thought reports on verbal detail localize in this study. This raises the question of whether the present study truly captures mechanisms specific to thinking and distinct from processing, especially given that individual variations in reports were not considered and movie-specific features were not controlled for.

      Thank you for this point; we have added an additional paragraph to the discussion to expand on this.

      Page 35 [692-698] “Finally, it is worth considering whether the patterns of brain activity identified by our analysis reflect the stimuli that are processed during movie watching, or the cognitive and affective processing of this information. On the one hand, the regions we found were often within regions of sensory cortex, areas of the brain which are often ascribed basic stimulus processing functions [1]. Moreover, according to perspectives on cognition derived from more traditional task paradigms, complex features of cognition, such as the regulation of thought, are often attributed to regions of association cortex, such as the dorsolateral prefrontal cortex [2].”

      Reviewer #3:

      This paper is framed as presenting a new paradigm but it does little to discuss what this paradigm serves, what are its limitations and how it should have been tested. The novelty appears to be in using experience sampling from 1 sample to model the responses of a second sample.

      Thank you for this comment. As you correctly identified, the novelty lies in the methodology. We have now made this clear by expanding this point beyond the methods section to orient the reader to the applications and limitations of our methodological approach and paradigm.

      Page 7-8 [149-174] “One challenge that arises when attempting to map the dynamics of thought onto brain activity during movie-watching is accounting for the inherently disruptive nature of experience sampling: to measure experience with sufficient frequency to map experiential reports during movies would inherently disrupt the natural processes of the brain and alter the viewer’s experience (for example, by pausing the film at a moment of suspense). Therefore, if we periodically interrupt viewers to acquire a description of their thoughts while recording brain activity, this could impact on the ability to capture important dynamic features of the brain. On the other hand, if we measured fMRI activity continuously over movie-watching (as is usually the case), we would lack the capacity to directly relate brain signals to the corresponding experiential states. Thus, to overcome these obstacles, we developed a novel methodological approach using two independent samples of participants. In the current study, one set of 120 participants was probed with mDES five times across the three ten-minute movie clips (11 minutes total, no sampling in the first minute). We used a jittered sampling technique where probes were delivered at different intervals across the film for different people depending on the condition they were assigned. Probe orders were also counterbalanced to minimize the systematic impact of prior and later probes at any given sampling moment. We used these data to construct a precise description of the dynamics of experience for every 15 seconds of three ten-minute movie clips. These data were then combined with fMRI data from a different sample of 44 participants who had already watched these clips without experience sampling [3]. 
      By combining data from two different groups of participants, our method allows us to describe the time series of different experiential states (as defined by mDES) and relate these to the time series of brain activity in another set of participants who watched the same films with no interruptions. In this way, our study set out to explicitly understand how the patterns of thoughts that dominate different moments in a film in one group of participants relate to the brain activity at these time points in a second set of participants and, therefore, better understand the contribution of different neural systems to the movie-watching experience.”
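      The two-sample logic described in this passage can be sketched as a toy example: experience-sampling reports pooled across one group are averaged in 15-second bins, a signal from a second group is averaged in the same bins, and the two group-level time series are then compared. The data, variable names, and the use of a simple Pearson correlation are all illustrative assumptions; the study's actual analysis of fMRI data is more involved.

```python
from collections import defaultdict
from math import sqrt

BIN_SECONDS = 15  # bin width taken from the quoted passage


def bin_average(samples, bin_seconds=BIN_SECONDS):
    """Average (time_s, value) samples within consecutive time bins."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for time_s, value in samples:
        index = int(time_s // bin_seconds)
        sums[index] += value
        counts[index] += 1
    return {index: sums[index] / counts[index] for index in sums}


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: (time in seconds, value). Sample A gives probe-based
# component scores; sample B gives a brain signal from other participants.
mdes_probes = [(65, 0.2), (70, 0.4), (80, 1.0), (85, 1.2), (95, -0.5), (100, -0.3)]
brain_signal = [(60, 0.1), (67, 0.3), (75, 0.9), (82, 1.1), (90, -0.4), (97, -0.2)]

mdes_bins = bin_average(mdes_probes)
brain_bins = bin_average(brain_signal)
shared = sorted(set(mdes_bins) & set(brain_bins))  # bins covered by both samples
r = pearson([mdes_bins[i] for i in shared], [brain_bins[i] for i in shared])
print(f"shared bins: {shared}, correlation: {r:.3f}")
```

      Because jittered probes from different participants fall at different times, pooling them by bin is what allows a dense group-level time series to be built without interrupting any single viewer often.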

      Page 33-35 [658-691] “Importantly, our study provides a novel method for answering these questions and others regarding the brain basis of experiences during films that can be applied simply and cost-effectively. As we have shown, mDES can be combined with existing brain activity, allowing information about both brain activity and experience to be determined at a relatively low cost. For example, the cost-effective nature of our paradigm makes it an ideal way to explore the relationship between cognition and neural activity during movie-watching across different genres of film. In neuroimaging, conclusions are often made using one film in naturalistic paradigm studies [4]. Although the current study only used three movie clips, limiting our ability to form strong conclusions regarding how different patterns of thought relate to specific genres of film, in the future, it will be possible to map cognition across a more extensive set of movies and discern whether there are specific types of experience that different genres of films engage. One of the major strengths of our approach, therefore, is the ability to map thoughts across groups of participants across a wide range of movies at a relatively low cost.

      Nonetheless, this paradigm is not without limitations. This is the first study, as far as we know, that attempts to compare experiential reports in one sample of participants with brain activity in a second set of participants, and while the utility of this method enables us to understand the relationship between thought and brain activity during movies, it will be important to extend our analysis to mDES data during movie-watching while brain activity is recorded. In addition, our study is correlational in nature, and in the future, it could be useful to generate a more mechanistic understanding of how brain activity maps onto the participants’ experience. Our analysis shows that mDES is able to discriminate between films, highlighting its broad sensitivity to variation in semantic or affective content. Armed with this knowledge, we propose that in the future, researchers could derive mechanistic insights into how the semantic features may influence the mDES data. For example, it may be possible to ask participants to watch movies in a scrambled order to understand how the structure of semantic or information influences the mapping between brains and ongoing experience as measured by mDES. Finally, our study focused on mapping group-level patterns of experience onto group-level descriptions of brain activity. In the future it may be possible to adopt a “precision-mapping” approach by measuring longer periods of experience using mDES and determining how the neural correlates of experience vary across individuals who watched the same movies while brain activity was collected [5]. In the future, we anticipate that the ease with which our method can be applied to different groups of individuals and different types of media will make it possible to build a more comprehensive and culturally inclusive understanding of the links between brain activity and movie-watching experience.”

      What are the considerations for treating high-order thought patterns that occur during film viewing as stable enough to use across participants? What would be the limitations of this method? (Do all people reading this paper think comparable thoughts reading through the sections?) This is briefly discussed in the revised manuscript and generally treated as an opportunity rather than as a limitation.

      It is likely, based on our study, that films can evoke both stereotyped thought patterns (i.e., thoughts that many people will share) and others that are individualistic. It is clear that, in principle, mDES is capable of capturing empirical information on both stereotypical thoughts and idiosyncratic thoughts. For example, clear differences in experiences across films and, in particular, during specific periods within a film, show that movie-watching can evoke broadly similar thought patterns in different groups of participants (see Figure 3 right-hand panel). On the other hand, the association between comprehension and the different mDES components indicates that certain individuals respond to the same film clip in different ways and that these differences are rooted in objective information (i.e., their memory of an event in a film clip). A clear example of these more idiosyncratic features of movie watching experience can be seen in the association between “Episodic Knowledge” and comprehension. We found that “Episodic Knowledge” was generally high in the romance clip from 500 Days of Summer but was especially high for individuals who performed the best, indicating they remembered the most information. Thus, good comprehenders responded to the 500 Days of Summer clip with responses that had more evidence of “Episodic Knowledge”. In the future, since the mDES approach can account for both stereotyped and idiosyncratic features of experience, it will be an important tool in understanding the common and distinct features that movie watching experiences can have, especially given the cost-effective manner with which these studies can be run.

      In conclusion, this study tackles a highly interesting subject and does it creatively and expertly. It fails to discuss and establish the utility and appropriateness of its proposed method.

      Thank you very much for your feedback and critique. In our revision and our responses to these questions, we provided more information about the method's robustness, utility, and application to understanding cognition. Thank you for bringing these points to our attention.

      References

      (1) Kaas, J.H. and C.E. Collins, The organization of sensory cortex. Current Opinion in Neurobiology, 2001. 11(4): p. 498-504.

      (2) Turnbull, A., et al., Left dorsolateral prefrontal cortex supports context-dependent prioritisation of off-task thought. Nature Communications, 2019. 10.

      (3) Aliko, S., et al., A naturalistic neuroimaging database for understanding the brain using ecological stimuli. Scientific Data, 2020. 7(1).

      (4) Yang, E., et al., The default network dominates neural responses to evolving movie stories. Nature Communications, 2023. 14(1): p. 4197.

      (5) Gordon, E.M., et al., Precision Functional Mapping of Individual Human Brains. Neuron, 2017. 95(4): p. 791-807.e7.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper represents a huge amount of work on a condition whose patients' health and well-being have not always been prioritized, and only relatively recently has the immune dysregulation seen in patients with Down Syndrome (DS) been garnering major research interest.

      This paper provides an unparalleled examination of immune disorders in patients with DS. The authors also report the results from a clinical trial with the JAK inhibitor tofacitinib in DS patients.

      Strengths:

      This manuscript reports a herculean effort and provides an unparalleled examination of immune disorders in a large number of patients with DS.

      Weaknesses:

      Not a major weakness but, apart from finding an elevation of CD4 T central memory cells and more differentiated plasmablasts, several of the alterations reported in this manuscript had already been suggested by a few case reports and a very small series. On the other hand, the number of patients (and controls) utilized for this study is remarkable and allows for drawing much firmer conclusions.

      We are grateful for the Reviewer’s very positive assessment of the work and results presented in this manuscript. We agree that many of the changes in the peripheral immune system reported here had been previously documented by our team and others using smaller sample sizes. However, as the Reviewer appreciated, this study involves an order of magnitude more research participants than previous studies (i.e., ~400 total participants, ~300 of them with trisomy 21 versus ~100 controls), which enabled us to investigate associations between immune changes and clinical variables, while also helping us draw much firmer conclusions.

      Reviewer #2 (Public Review):

      In this manuscript, Rachubinski and colleagues provide a comprehensive clinical, immunological, and autoantibody assessment of autoimmune/inflammatory manifestations of patients with Down syndrome (DS) in a large number of patients with this disorder. These analyses confirm prior results of excess interferon and cytokine signals in DS patients and extend these observations to highlight early-onset immunological aberrancies, far before symptoms occur, as well as characterizing novel autoantibody reactivities in this patient population. Then, the authors report the interim analysis of an open-label, Phase II, clinical trial of the JAK1/3 inhibitor, tofacitinib, that aims to define the safety, clinical efficacy, and immunological outcomes of DS patients who suffer from inflammatory conditions of the skin. The clinical trial analysis indicates that the treatment is tolerated without serious adverse effects and that the majority of patients have experienced clinical improvement or remission in their corresponding clinical cutaneous manifestations as well as improvement or normalization of aberrant immunological signals such as cytokines.

      The major strength of the study is the recruitment and uniform, systematic evaluation of an impressive number of DS patients. Moreover, the promising early results from the tofacitinib clinical trial pave the way for analysis of a larger number of patients within the Phase II trial and otherwise, which may lead to improved clinical outcomes for affected patients. An inherent weakness of such studies is the descriptive nature of several parameters and the relatively small size of tofacitinib-treated DS patients. However, the descriptive nature of some of the correlative research analyses is of scientific interest and is useful to generate hypotheses for future additional (including mechanistic) work, and treatment of 10 DS patients in a formal clinical trial at interim analysis is not a trivial task for a disease like this. The manuscript achieves the aims of the authors and the results support their conclusions. The authors appropriately acknowledge areas that require more research and areas that are not well understood. The results are represented in a useful manner and statistical methods and analyses appear sound.

      We appreciate the very positive evaluation by this Reviewer. We agree with the Reviewer on the descriptive nature of many of the analyses completed and on the value of a larger cohort of individuals with Down syndrome treated with a JAK inhibitor. The clinical trial will involve a total of 40 participants, and we look forward to reporting the results from the full cohort in the near future.

      Reviewer #3 (Public Review):

      Summary:

      Individuals with Down syndrome (DS) have high rates of autoimmunity and can have exaggerated immune responses to infection that can unfortunately cause significant medical complications. Prior studies from these authors and others have convincingly demonstrated that individuals with DS have immune dysregulation including increased Type I IFN activity, elevated production of inflammatory cytokines (hypercytokinemia), increased autoantibodies, and populations of dysregulated adaptive immune cells that pre-dispose to autoimmunity. Prior studies have demonstrated that using JAK inhibitors to treat patient samples in vitro, in small case series of patients, and in mouse models of DS leads to improvement of immune phenotype and/or clinical disease. This manuscript provides two major advances in our understanding of immune dysregulation and therapy for patients. First, they perform deep immune phenotyping on several hundred individuals with DS and demonstrate that immune dysregulation is present from infancy. Second, they report a promising interim analysis of a Phase II clinical trial of a JAK inhibitor in 10 people with DS and moderate to severe skin autoimmunity.

      Strengths and weaknesses:

      The relatively large cohort and careful clinical annotation here provide new insights into the immune phenotype of patients with DS. For example, it is interesting that regardless of autoimmune disease or autoantibody status, individuals with DS have elevated cytokines and CRP. Analysis of the cohorts by age demonstrated that some cytokines are significantly elevated in people with DS starting in infancy (e.g., IL-9 and IL-17C). Nearly all adults with DS in this study had autoantibodies (98%) and most had six or more autoantibodies (63%), which differed significantly from euploid study participants. This implies that all patients with DS might benefit from early intervention with therapy to reduce inflammation. However, it is also worth considering an alternative interpretation: since hypercytokinemia does not vary based on disease state in individuals with DS, it may not be a key factor driving autoimmunity (although it may be relevant for other clinical symptoms such as neuroinflammation).

      Small case series have suggested the benefit of JAK inhibitors to treat autoimmunity in DS. This is the first report of a prospective clinical trial to test a JAK inhibitor in this setting. The clinical trial entry criteria included moderate to severe autoimmune skin disease in patients aged 12-50 years with DS, and treatment was with the JAK1/3 inhibitor tofacitinib. This clinical trial is a critically important step for the field. The early results support that treatment is well tolerated with an improvement of interferon scores in patients and reduction of autoantibodies. Most patients experienced clinical improvement, with alopecia areata having the greatest response. Treatment may not affect all skin diseases equally; for example, of the 5 patients with hidradenitis suppurativa, only 1 showed clinical improvement based on skin score. While very promising, the clinical trial results reported here are preliminary and based on an interim analysis of 10 patients at 16 weeks. Individuals with DS have a lifelong risk of immune dysregulation and thus it is unclear how long therapy, if of benefit, would need to be continued. The results of longer-term therapy will be informative when considering the risks/benefits of this therapy.

      We thank the Reviewer for the very positive evaluation. We agree with the Reviewer that the hypercytokinemia of Down syndrome may contribute to other pathophysiological processes beyond autoimmune conditions. Although many cytokines elevated in Down syndrome have well demonstrated pathogenic roles in the etiology of autoimmune diseases in the general population (e.g., TNF-α, IL-6), their consistent upregulation in DS regardless of clinical evidence of autoimmune pathology indicates the existence of a prolonged pre-clinical period, where the hypercytokinemia likely precedes evident tissue damage and symptomology. Alternatively, it is possible that these elevated cytokines are contributing to the overall pathophysiology of DS (e.g., neuroinflammation, cognitive impairments, complications from viral infections) without formal diagnosis of an autoimmune disease. We also agree with the Reviewer that not all immune skin conditions would respond equally to JAK inhibition. Based on recent approvals for JAK inhibitors in the immunodermatology field, it is expected that JAK inhibition would show the greatest benefits for alopecia areata, atopic dermatitis, and psoriasis, with less clear results for hidradenitis suppurativa. We hope to contribute to this field through the analysis of the full clinical trial cohort in the near future. Lastly, we strongly agree with the need to assess the value of long-term therapy with JAK inhibitors or other immune therapies in people with Down syndrome for various clinical endpoints.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This paper represents a huge amount of work on a condition whose patients' health and well-being have not always been prioritized, and only relatively recently has the immune dysregulation seen in patients with Down Syndrome (DS) been garnering major research interest.

      This paper provides an unparalleled examination of immune disorder in patients with DS. In a truly herculean effort, the authors provided the cumulative examination of over 440 patients with DS, confirmed the alterations in immune cell subsets (n=292, 96 controls) and multi-organ autoimmunity seen in these patients as they age, and identified autoantibody production that could contribute to conditions co-occurring in patients with DS. They also sought to look at whether the early immunosenescence seen in DS was due to the inflammatory profile by comparing age-associated markers in DS patients and euploid controls separately, finding that several markers are regulated with age regardless of group, while comparing the effect of age versus DS status on cytokine status identified inflammatory markers elevated in DS patients across the lifespan that do not increase with age or that increase with age only in the DS cohort. This is very interesting in the context of DS in particular, and immunity during aging in general.

      The second part of the manuscript presents the results from a clinical trial with the JAK inhibitor tofacitinib in DS patients. While the number of DS patients treated with tofacitinib was small, the results were often quite striking. Treatment was well-tolerated and the improvement of dermatological conditions was clear. The less responsive patients AA4 and AA2 provide a very clear illustration that these patients are sensitive to immune triggers during treatment. Additionally, the demonstration that patients' IFN scores and cytokine levels decreased without clear immunosuppression with tofacitinib treatment is encouraging, since treatment with this drug would need to be continuous. I would be curious to see if the patients added past the cutoff for interim analysis follow a similar trajectory. I would not ask the authors to add any data; the paper is well-written and logically constructed.

      I only have a small comment: I really did not like how Figure 2 a, d, and g tethered the coloring to the magnitude of fold change to show the effect of DS particularly for 2a and 2g. Given that these fold changes are quite modest, the coloring is very light and hard to distinguish. The clear takeaway is that the effect on T cells is greatest, but there must be a better way to illustrate this. Perhaps displaying this graph on a non-white background could help with contrast.

      We are grateful for the Reviewer’s very positive assessment of the manuscript and constructive feedback. We want to assure the Reviewer that similar analyses will be completed in the future for the entire cohort recruited into the trial to determine if similar trajectories and results are observed with the larger sample size. Additionally, following the Reviewer’s guidance, we have modified the color scales in Figures 2a, d and g so that each panel is on its own dynamic range, thus emphasizing the differences within each immune cell lineage.

      Reviewer #2 (Recommendations For The Authors):

      • Although the focus of the patients in the first part of the paper is on autoimmune/inflammatory conditions, it will be useful to also list the non-autoimmune infectious manifestations for reference with prevalence data. For example, otitis media, or lung infections (mentioned within the paper), or mucosal candidiasis. Same for other manifestations such as cardiac or malignant conditions. Given the impressive number of patients, it will be useful to the readers to have prevalence data for these as well, even in brief statements within the results.

      We appreciate this inquiry by the Reviewer. Following the Reviewer’s guidance, we have included information on recurrent otitis media, frequent/recurrent pneumonia, congenital heart defects requiring repair, and various forms of leukemia. These additional data are presented in a revised Supplementary file 1 and briefly discussed in the results.

      • Have the authors looked at DN T cells and whether they may be enriched in DS patients, given their enrichment in some autoimmune conditions?

      Thanks for this inquiry. We did examine DN T cells (double negative T cells), which we referred to in our Figure 2 and Figure 2 – figure supplement 1 as non-CD4+ CD8+ T cells. Although this T cell subset is mildly elevated (in terms of frequency among T cells) in individuals with Down syndrome, the result did not reach statistical significance after multiple hypothesis correction. This negative result is shown in the heatmap in Figure 2 – figure supplement 1d.

      • It would be useful to move the segment of the discussion that discusses the interim predefined analysis of the phase 2 trial to the corresponding segment of the results. As this reviewer was reading the paper, it was unclear why the interim analysis was done, whether it was predefined and it was not until the discussion that it became apparent. I believe it will help the readers to have a brief mention that this interim analysis was predefined and set to occur at the first 10 DS enrollees. Also, it would be helpful to state what is the total number of DS patients planned for enrollment in the Phase 2 trial which is continuing recruitment.

      We appreciate this comment. Following the Reviewer’s guidance, we have revised the text to explain in the Results section that the interim analysis was predefined and triggered once the first 10 participants completed the 16 weeks of treatment. We also explain that the trial will be considered complete once a total of 40 participants undergo 16 weeks of treatment.

      • Although the authors present data on TPO autoantibodies before and after tofacitinib, it remains unclear whether the other non-TPO autoantibodies were altered during treatment or whether this was a TPO autoantibody-specific phenomenon. Was there an alteration in mature B cells or plasmablast populations after tofacitinib? If these data are available, they would further enhance the manuscript. If they are not available, it would be useful for the authors to discuss those in the discussion of the manuscript.

      We are grateful for this comment, which strongly aligns with our future research interests and plans for the analysis of the full cohort once the trial is completed. In the interim analysis, we analyzed only autoantibodies related to autoimmune thyroid disease and celiac disease, as shown in the manuscript. However, we plan to complete a more comprehensive analysis of the effects of JAK inhibition on autoantibody production once the full sample set is available at the end of the trial. Likewise, the clinical trial protocol contemplates collection and processing of blood samples for immune mapping using mass cytometry, which will enable us to answer the question from the Reviewer about potential changes in B cells or plasmablast populations. Following the Reviewer’s guidance, we discuss these planned analyses in the Discussion of the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) Cellular immune phenotyping data in Figure 2 presents a large number of patients with DS versus euploid controls (292 and 96 respectively). Given the relatively large cohort there would seem to be an opportunity to determine whether age or sex alters the immune phenotype shown, for example, TEMRAs, etc. Was the data analyzed in this way?

      We welcome this comment, which clearly aligns with our research interests and planned additional analyses of these datasets generated by the Human Trisome Project. We can share with the Reviewer that although sex as a biological variable has minimal impacts on the strong immune dysregulation observed in Down syndrome, there are clear age-dependent effects, with some immune changes occurring early during childhood versus others taking place later in adult life. A manuscript describing a complete analysis of age-dependent effects on the multi-omics datasets in the Human Trisome Project is currently under preparation.

      (2) The authors should strongly consider incorporating/discussing the findings from Gansa et al, Journal of Clinical Immunology May 2024 - where they reviewed the immune phenotype of 1299 patients with Down syndrome.

      Thanks for bringing this publication to our attention, which is now cited in the revised manuscript.

      (3) It is difficult to differentiate patients Hs2 and Ps1 in Figure 5d.

      Thanks for this observation. We have modified the labels for greater clarity in the revised manuscript.

      (4) Given their finding of no correlation between cytokine levels/immune phenotype and autoimmunity, some additional discussion of the relevance of hypercytokinemia in the pathogenesis of autoimmunity would seem relevant (given that this was the basis for the clinical trial). The authors mention that cytokine levels may not be appropriate measures of disease in the patients.

      We welcome this suggestion and have revised the Discussion along these lines.

      (5) Data availability statement: appropriate.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We greatly appreciate the opportunity to submit a revision of our manuscript entitled: "The Autophagy Protein, ATG14 Safeguards Against Unscheduled Pyroptosis Activation to Enable Embryo Transport During Early Pregnancy" by Popli et al. We thank all three Referees for underscoring the importance of our findings as well as the constructive critiques that we used to improve our paper. Most notably, we added the following new data:

      · To provide more insight into whether pyroptosis activation occurs distinctly in the oviduct, we also examined expression of GSDMD (the primary executioner of the pyroptosis pathway) in the uterus and ovary. We observed no signs of pyroptosis activation in response to ATG14 loss in either the uterus or ovary of Atg14 cKO mice compared to controls, suggesting that ATG14 plays a distinct role in regulating pyroptosis specifically in the oviduct (Revised Figure 5F).

      · To better understand the molecular mechanisms of pyroptosis activation in the oviducts, we examined various key markers of mitochondrial integrity, architecture, and function in control and Atg14 cKO oviducts. Our findings indicate a significant loss of mitochondrial structural and functional integrity, possibly contributing to the embryo retention phenotype by activating the pyroptosis pathway in the oviduct (Revised Figure 5B & C).

      · To address the spatiotemporal and region-specific expression of ATG14 in the oviduct, we performed immunofluorescence analysis and observed the consistent expression of ATG14 in all the cellular compartments of oviducts including ciliary epithelial cells, secretory epithelial cells, and smooth muscle cells. Moreover, the region-specific expression analysis revealed that distinct expression of ATG14 in the ampullary region of cKO mice oviduct helps to preserve its structural integrity. Conversely, its loss in the isthmus region of the oviduct in concordance with active PR-cre activity causes completely distorted epithelial structures with luminal obliteration or narrowing resulting in an unorganized and obstructed lumen leading to embryo retention, suggesting that ATG14 is essential for maintaining the structural integrity of the oviduct (Revised Figure 3F & S2A).

      · Considering the expression of PR-cre in the pituitary, which could potentially influence hormonal secretion and ovulation, we evaluated the levels of E2 and P4 during pregnancy. Our findings show that these hormone levels remained unchanged in Atg14 cKO mice, indicating that the absence of ATG14 does not negatively affect the HPG axis or pituitary function (Revised Figure 2F).

      · ATG14 is an essential factor for the initiation of autophagy, and its loss can lead to reduced or inhibited autophagic activity. Consistently, we observed elevated levels of LC3b and p62 proteins, two well-known markers of autophagic flux, in the oviducts of Atg14-deficient mice, implying that loss of ATG14 leads to defective autophagy, potentially disturbing the structural integrity of oviductal epithelial cells and impairing embryo transport (New Supplementary Figure S2B).

      Reviewer #1 (Public Review):

      This study by Popli et al. evaluated the function of Atg14, an autophagy protein, in reproductive function using a conditional knockout mouse model. The authors showed that female mice lacking Atg14 were infertile partly due to defective embryo transport function of the oviduct and faulty uterine receptivity and decidualization using PgrCre/+; Atg14f/f mice. The findings from this work are exciting and novel. The authors demonstrated that a loss of Atg14 led to an excessive pyroptosis in the oviductal epithelial cells that compromises cellular integrity and structure, impeding the transport function of the oviduct. In addition, the authors use both genetic and pharmacological approaches to test the hypothesis. Therefore, the findings from this study are high-impact and likely reproducible. However, there are multiple major concerns that need to be addressed to improve the quality of the work.

      Major comments:

      (1) It is interesting that deletion of Atg14 using PgrCre results in pyroptosis only in the oviduct; the authors should speculate/evaluate why the oviduct, but not the uterus or follicles. Is there any cellular specificity that is sensitive to autophagy/pyroptosis in the oviduct but not in other cell types? This has not been evaluated or discussed in the manuscript. Is it possible to include GSDMD IHC for the uterine section to ensure that there was no pyroptosis event in the cKO uteri?

      We performed GSDMD IHC and found that, unlike in the oviduct, the cKO uteri and ovaries do not exhibit detectable pyroptosis (Revised Figure 5F). Additionally, we have added text to the discussion section addressing possible reasons for the differential impact of Atg14 loss on pyroptosis along the reproductive tract continuum (Line number: 532-538).

      (2) Please include an explanation of how a loss of Atg14, important for the initiation process of autophagy (as indicated in line 88), can lead to pyroptosis. There was some discussion about inflammation. But the connection is still missing.

      We thank the reviewer for noting this. We have now included a possible explanation of how autophagy could impact pyroptosis in the discussion section (Line number: 532-538).

      (3) No expression data of ATG14 using IHC/IF analysis were included in the manuscript - this is missing. This is needed and important as the authors found that Foxj1Cre/+; Atg14f/f cKO mice had no fertility defect. Is it possible that ATG14 is not present in the ciliated epithelial cells of the oviduct? In addition, the data in Figure 5B also points to this speculation. This is because the GSDMD (the pyroptosis marker) is only observed in the isthmus region but not the ampulla.

      We thank the reviewer for this nice suggestion. We performed immunofluorescence analysis for ATG14 expression in control and Atg14 cKO oviducts and observed consistent expression of ATG14 in all the cellular compartments of the oviduct, including ciliary epithelial cells, secretory epithelial cells, and smooth muscle cells (New Supplementary Figure S2A). We also examined α-tubulin expression in the oviducts of Foxj1Cre/+; Atg14 f/f and control mice and observed that ciliated epithelial cells positive for acetylated α-tubulin staining did not appear different in Foxj1Cre/+; Atg14 f/f oviducts compared to controls (Revised Figure 4C). However, due to the unavailability of reliable fluorescent-labeled antibodies for both Foxj1 and Atg14, we were unable to conduct the co-localization study as intended. This limitation hindered our ability to precisely determine the spatial overlap of these proteins within the tissue.

      (4) In line with the previous comment, is ATG14 present in the human Fallopian tube? If so, which cell type? This needs to be addressed.

      Author’s Response: We appreciate the reviewer's valuable suggestion. While we currently lack access to human fallopian tube biopsies, the Human Protein Atlas (https://www.proteinatlas.org/ENSG00000126775-ATG14) demonstrates distinct ATG14 expression in various fallopian tube cell types, with localization in the cytoplasm, membrane, and nucleus.

      (5) As PgrCre is also expressed in the pituitary, is it possible that the deletion of Atg14 using PgrCre would affect pituitary function – hence a change in the FSH/LH secretion that subsequently affects ovulation? Although the uterine and ovarian histology in the Atg14 cKO looks similar to the controls, is it possible that cyclicity is also affected? The authors should evaluate whether the estrous cycle takes place regularly.

      Author’s Response: Thank you for the insightful comment. However, evaluating the estrous cycle requires significant time and effort and is beyond the scope of the current manuscript. Nonetheless, we have now shown that both P4 and E2 levels were not altered in Atg14 cKO mice, indicating that the loss of Atg14 did not adversely impact the HPG axis, and by extension, pituitary function (Revised Figure 2F).

      (6) The number of total embryos/oocytes in the cKO compared to the control has not been evaluated - this data must be included. Do the changes in autophagy in Atg14 cKO affect preimplantation embryo development? Please categorize the embryos found in the oviduct/uterus in both genotypes. i.e., % blastocyst, % morula, % developmentally delayed, % non-viable etc. It would be interesting to evaluate if the oviduct with heavy pyroptosis can support preimplantation embryo development.

      Author’s Response: We thank the reviewer for this nice suggestion. We categorized the embryos into different categories as suggested and included the data (Revised Figure 3C and Figure 6D).

      (7) It is unclear why the superovulation+mating experiment (Figure 3C) was performed. Please provide justification. Why was the data from natural mating (Figure 3A) insufficient?

      Author’s Response: In Figure 3C, superovulation was employed to complement the natural mating studies and to provide stronger evidence for the embryo retention phenotype observed in the oviduct.

      (8) In lines 297-298, the conclusion that "ATG14 is required for P4-mediated but not for E2-mediated actions during uterine receptivity" is not entirely correct. This is because the authors also observed that the downregulation of MUC1 (E2-target protein) is absent in the PgrCre/+;Atg14f/f cKO female uteri.

      We thank the reviewer for noting this. We detected more E2-induced targets in D-4 pregnant uterine samples and found no change in their expression in response to Atg14 depletion in cKO females (Revised Figure 2E).

      (9) Figure 3D: Please include an image that also represents the ampulla region. All images are from the isthmus region. It would be informative to see if the loss of cell boundaries also takes place at the ampulla region in the cKO oviduct.

      We thank the reviewer for this nice suggestion. We included the ampulla section from the cKO and control female oviducts (Revised Figure 3F). As PR-cre activity is limited to the isthmus [1, 2], we did not see any structural abnormality in ampulla sections of cKO oviducts.

      (10) Figure 3E: Please indicate which region the TEM was performed. Isthmus? Ampulla? Were the changes in mitochondrial phenotype observed across all oviductal regions?

      The TEM imaging was performed by the WashU Core services. Although we clearly instructed the core staff to image only the isthmus region, we are not sure whether they accurately followed the instructions.

      (11) Figure 4B; the evaluation of FOXJ1 IHC. The authors need to include sections that also have an ampulla region-especially in the cKO. In addition, it is misleading to state that there were fewer FOXJ1+ cells (line 361) in the cKO if the region being evaluated is the isthmus (which has a lot fewer ciliated epithelial cells in general) while the control image showed an ampulla where the abundancy of ciliated epithelial cells (FOXJ1+) is higher than that of the isthmus. The authors also need to include a higher resolution image (a zoom-in at the ciliated epithelial cells with FOXJ1+ signal) as well as the quantification of FOXJ1+ cells.

      We appreciate the reviewer for the suggestion. In Figure 4A, we have already shown the ampulla region from both control and cKO oviducts, wherein alpha-tubulin staining was evident in both oviducts.  

      We agree with the reviewer that the isthmus usually has fewer ciliary epithelial cells than the ampulla. However, as illustrated in Figures 4A and 4B, Atg14 depletion causes a marked disruption of structural integrity with loss of cell boundaries specifically in the isthmus, which is far more pronounced than in the ampulla. One reason for this is the reported Pgr Cre activity, which is much more robust in the isthmus than in the ampulla [1, 2]. This disruption leads to a substantial loss of both ciliated and secretory cells, compromising the epithelial architecture to such an extent that it is impossible to accurately quantify the Foxj1 signal, as can be seen in the higher-resolution images in New Supplementary Figure S3.

      For more clarity, we modified the statement in the revised file (Line Number: 393-396).

      (12) All IHC/IF and embryo images need to include the scale bars.

      We thank the reviewer for this suggestion. We have now included scale bars in all the images.

      (13) Figure 5H: although IL1B is being discussed, there was no data in this study to support the figure.

      In Figure 5H, IL1B is presented as part of the pyroptosis signaling pathway. As we have already shown other key executioners of this pathway (Caspase 1 and GSDMD), we believe that additional IL1B data would not provide new insights beyond what has already been shown.

      Minor comments:

      (1) Please include n (sample size) for all data, including the histology image in the figure legends for all studies.

      We have now included the sample size in the figure legends for all data shown in the manuscript.

      (2) Line 32, did the authors mean to say, "Self-digestion of..." instead of "Self-digestion for..."?

      In Line 32, we meant, “Cellular self-digestion for female reproductive tract functions”. We have now corrected the statement.

      Fig. 1A - please include negative control.

      We have included the negative control (Revised Figure 1).

      (3) Figure 1E left panel and Figure 4C - please label "Average no. of pups/female/litter" as each female has more than one litter over her reproductive lifespan. If the authors represent pups/females, then the number should be accumulative in the range of 35-40pups/females in the control group.

      We thank the reviewer for noting this. We have now corrected the label in both Revised Figure 1E and Revised Figure 4E.

      (4) Line 273: please remove "& F" as there is no Figure F in the image.

      We removed “&F” from Line 273.

      (5) The presence of CL is not always indicative of normal hormonal levels; therefore, the authors should include the measurement of progesterone levels at 3.5 dpc in the cKO compared to the control group. Hormonal regulation is also crucial for embryo transport.

      We thank the reviewer for this suggestion. We measured not only P4 but also E2 levels in D4 pregnant females and found no significant difference in their levels compared to corresponding controls (Revised Figure 2F).

      (6) Figure 2A shows that KRT expression is not present in the control uteri. Although the KRT8 levels may have decreased at 4 dpc, they should be present (see Figure S2A).

      We observed no decrease in KRT expression in control uteri on 5 dpc. We included better-resolution images for KRT expression (Revised Figure 2A).

      (7) The dotted white lines in Figure 2A are too thick. It's difficult to see the Ki67 positive signal in the luminal epithelial cells. Please also add a quantitative analysis of Ki67+ cells in the luminal epithelium vs. stromal cells.

      We have now corrected the dotted lines in Revised Figure 2B. However, as Ki-67-positive proliferation is evident in the representative images, we believe a quantitative analysis would not add anything new to the existing conclusion.

      (8) Figure 2D - the y-axis mentions the weight ratio. However, the figure legend describes the transcript levels of Atg14 - please correct this.

      We corrected the label in the revised manuscript.

      (9) Line 294 - Please correct Figure 2C to Figure 2B.

      We corrected it.

      (10) Line 308 - Please correct Figure 2E to Figure 2F.

      We corrected it.

      (11) Line 310 - Please correct Figure 2F to Figure 2G.

      We corrected it.

      (12) Line 311 - Please correct Figure 2F to Figure 2G.

      We corrected it.

      (13) Information in Figure S2A and S2B should be included in the main figure.

      We thank the reviewer for this nice suggestion. We have now included Figures S2A and S2B in the main figure (Revised Figure 2C & D).

      (14) Figure 3C - due to a lot of cellular debris after flushing, it's difficult to see. But it seems like there are secondary follicles in the flushing of control oviducts - this is highly unlikely. This could be due to an artifact of an accidental poking of the ovaries during collection.

      We agree with the reviewer. It might be due to the unintentional poking of the ovaries. We will take extra care in future experiments to avoid this and ensure clean flushing to prevent any confusion from debris or artifacts.

      (15) Figure 2B and Figure 3D signals from DAPI are missing - it's black with no blue signal. This could be the data loss during file compression for manuscript submission.

      We included better-resolution pictures for the DAPI signal in Revised Figure 2B & Figure 3F.

      (16) Explain why some embryos in the cKO make it to the uterus when the females are superovulated.

      This might be due to the heightened hormonal stimulation provided by superovulation, which could facilitate the movement of some embryos through the oviduct despite any defects or abnormalities caused by the loss of ATG14 in the oviduct.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Popli et al investigated the roles of the autophagy-related gene, Atg14, in the female reproductive tract (FRT) using conditional knockout mouse models. By ablation of Atg14 in both oviduct and uterus with PR-Cre (Atg14 cKO), the authors discovered that such females are completely infertile. They went on to show that Atg14 cKO females have impaired embryo implantation and uterus receptivity due to impaired response to P4 stimulation and stromal decidualization. In addition to the uterus defect, the authors also discovered that early embryos are trapped inside the oviduct and cannot be efficiently transported to the uterus in these females. They went on to show that oviduct epithelium in Atg14 cKO females showed increased pyroptosis, which disrupts oviduct epithelial integrity and leads to obstructive oviduct lumen and impaired embryo transport. Therefore, the authors concluded that autophagy is critical for maintaining the oviduct homeostasis and keeping the inflammation under check to enable proper embryo transport.

      Strengths:

      This study revealed an important and unexpected role of the autophagy-related gene Atg14 in preventing pyroptosis and maintaining oviduct epithelial integrity, which is poorly studied in the field of reproductive biology. The study is well designed to test the roles of ATG14 in mouse oviduct and uterus. The experimental data in general support the conclusion and the interpretations are mostly accurate. This work should be of interest to reproductive biologists and scientists in the field of autophagy and pyroptosis.

      Weaknesses:

      Despite the strengths, there are several major weaknesses raising concerns. In addition, the mismatched figure panels, the undefined acronyms, and the poor description/presentation of some of the data significantly hinder the readability of the manuscript.

      (1) In the abstract, the authors stated that "autophagy is critical for maintaining the oviduct homeostasis and keeping the inflammation under check to enable embryo transport". This statement is not substantiated. Although Atg14 is an autophagy-related gene and plays a critical role in oviduct homeostasis, the authors did not show a direct link between autophagy and pyroptosis/oviduct integrity. In addition, the authors pointed out in the last paragraph of the introduction that none of the other autophagy-related genes (ATG16L, FIP200, BECN1) exhibited any discernable impact on oviduct function. Therefore, the oviduct defect is caused by Atg14 specifically, not necessarily by autophagy.

      We thank the reviewer for noting this. We corrected the statement in the revised manuscript (Line number: 53-54).

      (2) In lines 412-414, the authors stated that "Atg14 ablation in the oviduct causes activation of pyroptosis", which is also not supported by the experimental data. The authors did not show that Atg14 is expressed in oviduct cells. PR-Cre is also not specific to oviduct cells. It is possible that Atg14 knockout in other PR-expressing tissues (such as the uterus) indirectly activates pyroptosis in the oviduct. More experiments will be required to support this claim. Given that no defect was seen when Atg14 was knocked out in oviduct ciliary cells, it would be good to use a secretory-cell Cre, such as Pax8-Cre, to demonstrate that Atg14 functions in the secretory cells of the oviduct, thus supporting this conclusion.

      We have now included the ATG14 expression data in the oviduct (New Supplementary Figure S2A). Consistent with previous studies reporting PR-cre activity in the isthmus [1, 2], we observed that Atg14 depletion was more pronounced in the isthmus compared to the ampulla. However, generating a secretory-cell Pax8-Cre mouse model will require a substantial amount of time and effort, and we respectfully note that this is beyond the scope of the current manuscript.

      (3) With FOXJ1-Cre, the authors attempted to specifically knockout Atg14 in ciliary cells, but there are no clear fertility and embryo implantation defects in Foxj1/Atg14 cKO mice. The author should provide verification data to show that Atg14 had been effectively depleted in ciliary cells if Atg14 is normally expressed.

      We understand the reviewer’s concern. We have included new data on ATG14 expression in control and Atg14 cKO mouse oviducts (New Supplementary Figure S2A). However, because reliable fluorescent-labeled antibodies for both Foxj1 and Atg14 were unavailable, we could not conduct the co-localization studies as intended, which limited our ability to precisely determine the spatial overlap of these proteins within the oviduct. Nonetheless, Foxj1-cre is a widely used mouse model with reported cre activity in ciliary epithelial cells, including those of the oviduct [3]. Given the widespread expression of ATG14 in all ciliary and secretory cells (New Supplementary Figure S2A) and the distinct FOXJ1 expression in the oviduct (New Supplementary Figure S3), we are confident that Atg14 is deleted in the ciliary epithelial cells of Foxj1/Atg14 cKO mouse oviducts.

      (4) In lines 307-313, the author tested whether ATG14 is required for the decidualization of HESCs. The author stated that "Control siRNA transfected cells when treated with EPC seemed to change their morphological transformation from fibroblastic to epithelioid (Fig. 2E) and had increased expression of the decidualization markers IGFBP1 and PRL by day three only (Fig. 2F)". First, the labels in Figure 2 are not corresponding to the description in the text. Second, the morphology of the HESCs in the control and Atg14 siRNA group showed no obvious difference even at day 3 and day 6. The author should point out the difference in each panel and explain in the text or figure legend.

      Decidualization is a post-implantation event, whereas our study primarily focuses on pre-implantation events in the oviduct. Therefore, we have removed all data related to human and mouse decidualization to enhance the clarity and precision of our study.

      (5) In lines 332-336, the authors pointed out that the cKO mice oviduct lining shows marked eosinophilic cytoplasmic change, but there's no data to support the claim. In addition, the authors further described that "some of the cells showed degenerative changes with cytoplasmic vacuolization and nuclear pyknosis, loss of nuclear polarity, and loss of distinct cell borders giving an appearance of fusion of cells (Fig. 3D)". First, Figure 3D did not show all these phenotypes, and it is likely a mismatch to Figure 3E. Even in Figure 3E, it is not obvious to notice all the phenotypes described here. The figure legend is overly simple, and there's no explanation of the arrowheads in the panel. More data/images are required to support the claim here and provide a clear indication and explanation in the figure legend.

      Dr. Ramya Masand, Chief pathologist in the Pathology Department at the Baylor College of Medicine, and a contributing author, assessed the H&E-stained oviduct sections from control and cKO mice. We have now included a new Supplementary Figure S3 with previous representative H&E images that depict the cellular alterations described in lines 332–336.

      (6) In lines 317-325, it is rather confusing about the description of the portion of embryos from the oviduct and uterus. In addition, the total number of embryos was not provided. I would recommend presenting the numerical data to show the average embryos from the oviduct and uterus instead of using the percentage data in Figures 3A and 5G.

      We thank the reviewer for this nice suggestion. We calculated the average number of embryos and found no difference in the number of embryos recovered from cKO or polyphyllin-treated pregnant mice at 4 dpc compared to their controls. (New Supplementary Figure S4A & B).

      (7) In lines 389-391, authors tested whether Polyphyllin VI treatment led to activated pyroptosis and blocked embryo transport. Although Figures 5F-G showed the expected embryo transport defect, the authors did not show the pyroptosis and oviduct morphology. It will be important to show that the Polyphyllin VI treatment indeed led to oviduct pyroptosis and lumen disruption.

      We performed GSDMD IHC staining on oviducts from Polyphyllin VI- or vehicle-treated mice and observed elevated GSDMD expression with Polyphyllin VI (New Figure 6E). However, no significant lumen disruption was detected, which may be attributed to the short-term exposure of the oviducts to pyroptosis induction, in contrast to the more pleiotropic effects observed in genetically induced models. Nonetheless, this observation indicates that unscheduled or unwarranted activation of pyroptosis impedes embryo transport.

      (8) In line 378, it would be better to include a description of pyroptosis and its molecular mechanisms to help readers better understand your experiments. Alternatively, you can add it in the introduction.

      We thank the reviewer for this nice suggestion. We included literature on the pyroptosis pathway in the introduction section (Line Number: 105-118).

      (9) Please make sure to provide definitions for the acronyms such as FRT, HESCs, GSDMD, etc.

      We added definitions for the acronyms such as FRT, HESCs, and GSDMD used in the study.

      (10) It is rather confusing to use oviducal cell plasticity in this manuscript. The work illustrated the oviducal epithelial integrity, not the plasticity.

      We thank the reviewer for the suggestion. We have revised the manuscript accordingly to ensure clarity and precision in describing the oviductal epithelial structural changes observed in the absence of ATG14.

      A few of the additional comments for authors to consider improving the manuscript are listed below.

      (1) Some of the figures are missing scale bars, while others have inconsistent scale bars. It would be better to be consistent.

      We have now included scale bars in all images.

      (2) On a couple of occasions, the DAPI signal cannot be seen, such as in Figure 2B and Figure 3D.

      We have now included higher-resolution images showing the DAPI signal for all fluorescent images in the revised manuscript.

      (3) Overall, the figure legends can be improved to provide more detailed information to help the reader to interpret the data.

      We included additional details in all the figure legends in the revised manuscript.

      (4) In Figure 2D, the Y-axis showed the stimulated/unstimulated uterine weight ratio, why did the author put "Atg14" at the top of the graph? At the same time, the X-axis title is missing in Figure 2D.

      We apologize for the typographical error. We removed “Atg14” from the top of the graph and included the X-axis title in the revised manuscript.

      (5) In the left panel of Figure 2G, "ATG14" at the top should be "Atg14" to be consistent.

      In Figure 2G, we are representing “ATG14” according to human gene annotation.

      (6) In line 559, "(A)" is missing in front of Immunofluorescence analysis of GSDMD.

      We thank the reviewer for noting this. We corrected it in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Pooja Popli and co-authors tested the importance of Atg14 in the female reproductive tract by conditionally deleting Atg14 using Pr Cre and also Foxj1cre. The authors showed that loss of Atg14 leads to infertility due to the retention of embryos within the oviduct. The authors further concluded that the retention of embryos within the oviduct is due to pyroptosis in oviduct cells leading to defective cellular integrity. The manuscript has some interesting findings, however there are also areas that could be improved.

      Strengths:

      The importance of Atg14 and autophagy in the female reproductive tract is incompletely understood. The manuscript also provides spatial evidence about a new mechanism linking Atg14 to pyroptosis.

      We thank the reviewer for the positive statements and constructive comments on our manuscript.

      Weaknesses:

      (1) It is not clear why the loss of Atg14 selectively induces Pyroptosis within oviduct cells but not in other cellular compartments. The authors should demonstrate that these events are not happening in uterine cells.

      We thank the reviewer for this nice suggestion. We performed GSDMD IHC and found that, unlike the oviduct, the cKO uteri and ovaries do not exhibit detectable pyroptosis (Revised Figure 5F). Additionally, we have added text to the discussion section addressing possible reasons for the differential impact of Atg14 loss on pyroptosis along the reproductive tract continuum (Line number: 532-538).

      (2) The manuscript never showed any effect on the autophagy upon loss of Atg14. Is there any effect on autophagy upon Atg14 loss? If so, does that contribute to the observation?

      We thank the reviewer for the nice suggestion. We found that the protein levels of LC3b and p62, two well-known markers of autophagic flux, are elevated upon Atg14 loss in the oviduct (New Supplementary Figure S2B). Since p62 accumulation is indicative of reduced autophagic flux [4], we posit that loss of Atg14 results in defective autophagy in the oviduct. Importantly, this defective autophagy adversely impacted the structural integrity of oviductal epithelial cells, impairing embryo transport.

      (3) It is not clear what the authors meant by cellular plasticity and integrity. There is no evidence provided in that aspect that the plasticity of oviduct cells is lost. Similarly, more experimental evidence is necessary for the conclusion about cellular integrity.

      We thank the reviewer for the suggestion. We have revised the text for clarity and precision in describing the oviductal epithelial structural changes observed in the absence of ATG14. To avoid ambiguity, we have removed the term "cellular plasticity." We have already provided extensive evidence, including multiple H&E stains and immunofluorescence analyses for KRT8 and smooth muscle actin to illustrate cellular integrity in both control and cKO oviducts. However, we respectfully believe that performing additional experiments on cellular integrity would not contribute further to the conclusions already drawn.

      (4) The mitochondrial phenotype shown in Figure 3 didn't appear as severe as it is described in the results section. The analyses should be more thorough. They should include multiple frames (in supplemental information) showing mitochondrial morphology in multiple cells. The authors should also test that aspect in uterine cells. The authors should measure Feret's diameter, difference in membrane potential, etc. for a definitive conclusion.

      We appreciate the reviewer’s suggestion. We carried out a TOM20 (mitochondrial structural marker) and cytochrome c (mitochondrial damage and cell death marker) immuno-colocalization study and found loss of TOM20 signal with concomitant cytochrome c leakage into the perinuclear space (Revised Figure 5B). Additionally, qPCR analysis showed reduced expression of mitochondrial structural and functional markers (Revised Figure 5C). However, we respectfully argue that conducting membrane potential studies on murine oviducts is extremely complex and is beyond the scope of this study.

      (5) The comment that the loss of Atg14 and pyroptosis leads to the narrowing of the lumen in the oviduct should be experimentally shown.

      We have now included a New Supplementary Figure S3 with representative previous immunofluorescence images that clearly show the narrowing of the lumen with Atg14 loss in the oviduct.

      (6) The manuscript never showed the proper mechanism through which Atg14 loss induces pyroptosis. The authors should link the mechanism.

      We respectfully disagree with the reviewer on this point. We have provided substantial evidence regarding the cellular mechanisms through which the loss of Atg14 may lead to the activation of pyroptosis as outlined below:

      (1) Cellular Changes: Loss of ATG14 in the oviduct results in cellular swelling and the formation of fused membranous structures, which are characteristic features of pyroptosis activation.

      (2) Expression of Key Pyroptosis Proteins: We observed an induced expression of GSDMD and Caspase-1, primary executioners of the pyroptotic pathway, in response to Atg14 loss.

      (3) Inflammatory Markers: Elevated levels of inflammatory markers such as TNF-α and CXCR3 were detected, both of which are known to promote pyroptosis [5, 6].

      (4) Mitochondrial Damage: We have added new data demonstrating disrupted colocalization of TOM20 (a mitochondrial structural marker) and Cytochrome c (a cell death marker), resulting in Cytochrome c leakage into the perinuclear space (Revised Figure 5B). Additionally, qPCR analysis revealed reduced expression of mitochondrial structural and functional markers in cKO oviduct tissues (Revised Figure 5C).

      Based on this evidence, we conclude that Atg14 has a direct or indirect link to inflammasome activation. However, understanding the complex rheostat between Atg14-mediated autophagy and the inflammation regulatory axis will necessitate future studies employing sophisticated models, such as combined knockout mice in which ATG14 is deleted alongside key inflammatory regulators (e.g., NLRP3, GSDMD, or CASPASE-1). These dual knockout models could provide crucial insights into how ATG14 modulates inflammatory pathways.

      References:

      (1) Herrera, G.G.B., et al., Oviductal Retention of Embryos in Female Mice Lacking Estrogen Receptor alpha in the Isthmus and the Uterus. Endocrinology, 2020. 161(2).

      (2) Soyal, S.M., et al., Cre-mediated recombination in cell lineages that express the progesterone receptor. Genesis, 2005. 41(2): p. 58-66.

      (3) Zhang, Y., et al., A transgenic FOXJ1-Cre system for gene inactivation in ciliated epithelial cells. Am J Respir Cell Mol Biol, 2007. 36(5): p. 515-9.

      (4) Mizushima, N., T. Yoshimori, and B. Levine, Methods in mammalian autophagy research. Cell, 2010. 140(3): p. 313-26.

      (5) Vaher, H., Expanding the knowledge of tumour necrosis factor-alpha-induced gasdermin E-mediated pyroptosis in psoriasis. Br J Dermatol, 2024. 191(3): p. 319-320.

      (6) Liu, C., et al., CXCR4-BTK axis mediate pyroptosis and lipid peroxidation in early brain injury after subarachnoid hemorrhage via NLRP3 inflammasome and NF-kappaB pathway. Redox Biol, 2023. 68: p. 102960.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The anatomical connectivity of the claustrum and the role of its output projections has, thus far, not been studied in detail. The aim of this study was to map the outputs of the endopiriform (EN) region of the claustrum complex, and understand their functional role. Here the authors have combined sophisticated intersectional viral tracing techniques, and ex vivo electrophysiology to map the neural circuitry of EN outputs to vCA1, and shown that optogenetic inhibition of the EN→vCA1 projection impairs both social and object recognition memory. Interestingly the authors find that the EN neurons target inhibitory interneurons providing a mechanism for feedforward inhibition of vCA1.

      Strengths:

      The strength of this study was the application of a multilevel analysis approach combining a number of state-of-the-art techniques to dissect the contribution of the EN→vCA1 to memory function.

      Weaknesses:

      Some authors would disagree that the vCA1 represents a 'node for recognition of familiarity' especially for object recognition although that is not to say that it might play some role in discrimination, as shown by the authors. I note however that the references provided in the Introduction, concerning the role of vCA1 in memory refer to anxiety, social memory, temporal order memory, and not novel object recognition memory. Given the additional projections to the piriform cortex shown in the results, I wonder to what extent the observations may be explained by odour recognition effects.

      We have added references demonstrating that the ventral hippocampus contributes to object recognition memory in rodents (Broadbent NJ et al., Learn Mem 2010; Titulaer J et al., Front Behav Neurosci 2021).

      The odor recognition effect is an interesting perspective that we have also considered. However, in our object recognition test, the same odor (70% EtOH) was used for both objects, yet the mice were able to discriminate between the familiar and novel objects. This suggests that odor cues are unlikely to have contributed to their performance in the object discrimination test.

      In addition, I wondered whether the impairments in discrimination following chemogenetic inhibition of the EN→vCA1 were due to the subject treating the novel and familiar stimuli as either both novel, which might be observed as an increase in exploration, or both familiar, with a decrease in overall exploration.

      We thank the reviewer for raising this interesting point. We analyzed the total exploration time (i.e., the combined time spent in the familiar and novel interaction zones) during the social discrimination test. These data have been added to Fig. S9. Total exploration time was not affected by CNO treatment. This indicates that inhibition of ENvCA1-proj. neurons reduced interaction time with the novel conspecific and increased interaction time with the familiar conspecific. The subject mice thus appear to give equal weight to familiar and novel stimuli.
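
      The distinction drawn in this response, an unchanged total exploration time alongside a shift in how that time is split between stimuli, can be illustrated with a minimal sketch. The index formula used below, (novel - familiar) / (novel + familiar), is a common convention and is assumed here for illustration only; the response does not restate the manuscript's exact definition. The interaction times are hypothetical and are not data from the study.

```python
def discrimination_index(novel_s: float, familiar_s: float) -> float:
    """Assumed index: (novel - familiar) / (novel + familiar), ranging from -1 to 1."""
    total = novel_s + familiar_s
    if total <= 0:
        raise ValueError("total exploration time must be positive")
    return (novel_s - familiar_s) / total


def total_exploration(novel_s: float, familiar_s: float) -> float:
    """Total time spent in both interaction zones, in seconds."""
    return novel_s + familiar_s


# Hypothetical interaction times in seconds (illustrative, not study data).
control = {"novel_s": 60.0, "familiar_s": 30.0}
inhibited = {"novel_s": 45.0, "familiar_s": 45.0}

for label, times in (("control", control), ("inhibited", inhibited)):
    print(label, total_exploration(**times), round(discrimination_index(**times), 3))
```

      In this toy case the total is identical in both conditions while the index falls to zero, which is the pattern expected if both stimuli receive equal weight rather than both being treated as novel or both as familiar.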

      Reviewer #2 (Public Review):

      Summary:

      Yamawaki et al. conducted a series of neuroanatomical tracing and whole-cell recording experiments to elucidate and characterise a relatively unknown pathway between the endopiriform (EN) and CA1 of the ventral hippocampus (vCA1) and to assess its functional role in social and object recognition using fibre photometry and dual vector chemogenetics. The main findings were that the EN sends robust projections to the vCA1 that collateralise to the prefrontal cortex, lateral entorhinal cortex, and piriform cortex, and these EN projection neurons terminate in the stratum lacunosum-moleculare (SLM) layer of distal vCA1, synapsing onto GABAergic neurons that span across the Pyramidal-Stratum Radiatum (SR) and SR-SLM borders. It was also demonstrated that EN input disynaptically inhibits vCA1 pyramidal neurons. vCA1 projecting EN neurons receive afferent input from the piriform cortex, and from within EN. Finally, fibre photometry experiments revealed that vCA1 projecting EN neurons are most active when mice explore novel objects or conspecifics, and pathway-specific chemogenetic inhibition led to an impairment in the ability to discriminate between novel vs. familiar objects and conspecifics.

      This is an interesting mechanistic study that provides valuable insights into the function and connectivity patterns of afferent input from the endopiriform to the CA1 subfield of the ventral hippocampus. The authors propose that the EN input to the vCA1 interneurons provides a feedforward inhibition mechanism by which novelty detection could be promoted. The experiments appear to be carefully conducted, and the methodological approaches used are sound. The conclusions of the paper are supported by the data presented on the whole.

      We thank the reviewer for their positive comments on our work.

      The authors used dual retrograde tracing and observed that the highest percentage (~30%) of vCA1 projecting EN cells also projected to the PFC. They then employed an intersectional approach to show the presence of collaterals in other cortical areas such as the entorhinal cortex and piriform cortex in addition to the PFC. However, they state that 'Projection to prefrontal cortex was sparse relative to other areas, as expected based on the retrograde labeling data' (referring to Figure 2K) and subsequently appear to dismiss the initial data set indicating strong axonal projections to the PFC.

      Our interpretation is that 70% of the ENvCA1-proj. population does not send collaterals to the PFC, suggesting that the PFC is not a major target for this population (unlike vCA1, to which 100% of the population projects). This hypothesis is supported by our axon branching study, which showed lower axon density in the PFC compared to vCA1 (and other regions). We revised the text to 'much sparser relative to that of vCA1' (line 101) to facilitate a direct comparison between the retrograde and anterograde labeling studies.

      Since this is a relatively unknown connection, it would be helpful if some evidence/discussion is provided for whether the EN projects to other subfields (CA3, DG) of the ventral hippocampus. This is important, as the retrograde tracer injections depicted in Figure 1B clearly show a spread of the tracer to vCA3 and potentially vDG and it is not possible to ascertain the regional specificity of the pathway.

      We addressed the potential caveat associated with the retrograde tracer injection, as mentioned by the reviewer, by performing intersectional axon branching analysis. This analysis demonstrated that EN axons are primarily located in the SLM of the distal CA1 subfield (Figs. 2, 3, S2). However, we occasionally observed very weak labeling in the CA3 or dentate gyrus. We modified our text (lines 106-108) and figure (Fig. S2D) to account for this.

      The vCA1 projecting EN cells appear to originate from an extensive range along the AP axis. Is there a topographical organization of these neurons within the vCA1? A detailed mapping of this kind would be valuable.

      This is an interesting question for future research. Our data show a non-uniform distribution of this cell type, suggesting the potential for topographic organization.

      Given this extensive range in the location of vCA1 EN originating cells, how were the targets (along the AP axis) in EP selected for the calcium imaging?

      Using our injection coordinates, ENvCA1-proj. neurons were consistently labeled at high density just posterior to the bregma (Fig. 1J). Therefore, we targeted this region for our imaging.

      The vCA1 has extensive reciprocal connections with the piriform cortex as well, which is in close proximity to the EN. How certain are the authors that the chemogenetic targeting was specific to the EN-vCA1 connection?

      We performed histology on every animal used in the behavioral study to examine the specificity of hM4D expression, and only included those with specific labeling in the EN.

      Raw data for the sociability and discrimination indices should be provided so that the readers can gain further insight into the nature of the impairment.

      The raw data for total interaction time during the social discrimination test has been added (Fig. S9F).

      Line 222: It is unclear how locomotor activity informs anxiety in the behavioral tests.

      The degree of exploratory behavior in a novel context is generally considered to reflect anxiety levels in rodents. We have added a review paper (Ref 44, Prut, 2003) that discusses this point.

      Figure 7 title; It is stated that activity of EN neurons 'predict' social/object discrimination performance. However, caution must be exercised with this interpretation as the correlational data are underpowered (n=5-8). Furthermore, the results show a significant correlation between calcium event ratios and the discrimination index in the social discrimination test but not the object discrimination test.

      We increased the sample size for EN calcium imaging during the object recognition memory test (Fig. 7G). The updated data indicate a significant correlation between EN activity and the object recognition index (N = 9, Pearson R = 0.8, p = 0.01).

      We have changed the title of Figure 7 to 'Activity of ENvCA1-proj. neurons correlates with social/object discrimination performance'.
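
      As a rough consistency check on the reported statistics (N = 9, Pearson R = 0.8, p = 0.01), the two-sided p-value can be recovered from R and N alone using the standard t transform, t = R * sqrt((N - 2) / (1 - R^2)) with N - 2 degrees of freedom. The sketch below is illustrative and is not the analysis code used in the study; it assumes a two-sided test, treats R = 0.8 as exact although it is presumably rounded, and integrates the Student t density numerically so that only the Python standard library is needed. The function names are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def pearson_p_value(r: float, n: int, steps: int = 2000) -> float:
    """Two-sided p-value for a Pearson correlation r observed in n samples."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the probability mass between 0 and |t|.
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # Both tails together: 1 minus the mass in [-|t|, |t|].
    return max(0.0, 1 - 2 * area)


p = pearson_p_value(0.8, 9)
print(f"two-sided p = {p:.4f}")
```

      Under these assumptions the value lies just below 0.01, in line with the reported p = 0.01; with only nine animals the estimate remains sensitive to individual data points.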

      While both male and female mice were included in the anatomical tracing and recording experiments, only male mice were used for behavioral tests.

      The female behavior was highly inconsistent in the control condition of our social recognition memory paradigm; therefore, we decided to conduct the study with males. We will design a new behavioral paradigm for future studies to address this challenge.

      Reviewer #1 (Recommendations For The Authors):

      (1) It is not clear how the relative number of vCA1 projecting neurons in Figure 1H was acquired, not enough detail is presented in the methods section. To what extent could these data have been affected by differences in the size or anatomical position of the injection site in vCA1, which judging from the example fluorescent image in Figure 1B also appears to include CA3.

      We used AMaSiNe (Song et al. 2020) to semi-automatically quantify fluorescently labeled presynaptic neurons. This open-source software identifies the number and location of these cells across different regions based on the Allen Mouse Brain Common Coordinate Framework. To control for transfection variability (e.g., due to slight differences in injection volume or site), we normalized the presynaptic cell count in each region by the total number of cells across the regions of interest. We performed this analysis for N = 5 brains and found a consistent trend, as seen in Fig. 1H (grey lines).

      We have added the detailed method of quantification in the Materials and Methods section (line 393).
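
      The normalization described in this response can be sketched as follows. This is a minimal illustration rather than the AMaSiNe pipeline or the authors' code: the region labels and cell counts are hypothetical, and the only step shown is dividing each region's count by the summed count across the regions of interest within one brain.

```python
def normalize_counts(counts: dict[str, int]) -> dict[str, float]:
    """Express each region's labeled-cell count as a fraction of the brain's total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labeled cells in the regions of interest")
    return {region: n / total for region, n in counts.items()}


# Hypothetical counts for two brains with different overall labeling efficiency.
brain_a = {"region_1": 120, "region_2": 60, "region_3": 20}
brain_b = {"region_1": 60, "region_2": 30, "region_3": 10}

for name, counts in (("brain_a", brain_a), ("brain_b", brain_b)):
    fractions = normalize_counts(counts)
    print(name, {region: round(f, 2) for region, f in fractions.items()})
```

      Because each brain is scaled by its own total, two injections that differ twofold in overall labeling yield the same regional fractions, which is what makes per-brain traces such as those in Fig. 1H comparable.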

      (2) For a number of the results, the full statistical values are not presented in the Results section or figure legend.

      We have included the full statistical values in the figure legends of the revised manuscript.

      (3) It is not clear how much virus was injected in the different experiments (tract tracing, electrophysiology, behaviour, etc.). The methods state 50-100ul, but there is no further detail in the results or figure legends.

      We have included the injected volumes of the virus in the revised manuscript.

      (4) Figure 2 mentions the CLA complex (line 702) but this is not defined in the text. Although the introduction does refer to the claustrum complex, there is no acronym.

      We have corrected the manuscript accordingly.

      (5) Line 131- 'we recorded from 3-4 GABAergic neurons' - presumably this is in each animal?

      We recorded 3 to 4 GABAergic neurons sequentially from the same slice to compare input strength. We have edited the text to clarify this (line 134).

      Reviewer #2 (Recommendations For The Authors):

      Figure 3C: It is not clear what the dashed lines labelled proximal and distal represent.

      It is the proximal and distal vCA1 regions where GFP signals were measured for Fig. 3D. We have modified the figure legend to clarify this (line 736).

      Figure 5D: what do the different colors represent? Different colors for one brain?

      I assume that the reviewer meant to refer to Fig. 4D instead of Fig. 5D. In Fig. 4D, one color indicates starter cells in one brain. To clarify this, we have edited the figure legend (line 748).

      Figure S6E: The images are low resolution and it is hard to decipher the exact locations of labeled neurons. Please provide more guidance (e.g., labeling areas of interest).

      We have added reference lines and labels in Figure S6E.

      Some details are missing: what was the volume of AAV injected for each site/experiment; how was CNO made, and where was it purchased from?

      We have added this information (lines 330-331; 431-434).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This work presents a replicable difference in predictive processing between subjects with and without tinnitus. In two independent MEG studies and using a passive listening paradigm, the authors identify an enhanced prediction score in tinnitus subjects compared to control subjects. In the second study, individuals with and without tinnitus were carefully matched for hearing levels (next to age and sex), increasing the probability that the identified differences could truly be attributed to the presence of tinnitus. Results from the first study could successfully be replicated in the second, although the effect size was notably smaller.

      Throughout the manuscript, the authors provide a thoughtful interpretation of their key findings and offer several interesting directions for future studies. Their conclusions are fully supported by their findings. Moreover, the authors are sufficiently aware of the inherent limitations of cross-sectional studies.

      Strengths:

      The robustness of the identified differences in prediction scores between individuals with and without tinnitus is remarkable, especially as successful replication studies are rare in the tinnitus field. Moreover, the authors provide several plausible explanations for the decline of the effect size observed in the second study.

      The rigorous matching for hearing loss, in addition to age and sex, in the second study is an important strength. This ensures that the identified differences cannot be attributed to differences in hearing levels between the groups.

      The used methodology is explained clearly and in detail, ensuring that the used paradigms may be employed by other researchers in future studies. Moreover, the registering of the data collection and analysis methods for Study 2 as a Registered Report should be commended, as the authors have clearly adhered to the methods as registered.

      Weaknesses:

      Although the authors have been careful to match their experimental groups for age, sex, and hearing loss, there are other factors that may confound the current results. For example, subjects with tinnitus might present with psychological comorbidities such as anxiety and depression. The authors' exclusion of distress as a candidate for explaining the found effects is based solely on an assessment of tinnitus-related distress, while it is currently not possible to exclude the effects of elevated anxiety or depression levels on the results. Additionally, as the authors address in the discussion, the presence of hyperacusis may also play a role in predictive processing in this population.

      The authors write that sound intensity was individually determined by presenting a short audio sequence to the participants and adjusting the loudness according to an individual pleasant volume. Neural measurements made during listening paradigms might be influenced by sound intensity levels. The intensity levels chosen by the participants might therefore also have an effect on the outcomes. The authors currently do not provide information on the sound intensity levels in the experimental groups, making it impossible to assess whether sound intensity levels might have played a role.

      Thank you very much for your favorable and constructive evaluation of our manuscript. We agree with you on various additional confounds that we did not consider and included a section in our discussion. It is also correct that we did not include the sound intensity levels in our analysis, which is also a potential confound. Unfortunately, we do not have the data on the individual sound intensity levels but we included a section regarding this issue in our discussion as well.

      Line 937-949:

      “In both studies, tinnitus distress was not correlated with the reported prediction effects. Nevertheless, tinnitus can also be characterized by other features such as its loudness, pitch or duration which were not included in the experimental assessment. Additionally, we solely used a short version of the Mini-TQ (Goebel and Hiller, 1992) in Study 2, which did not allow us to relate prediction scores to subscales like sleep disturbances which potentially influence cognitive functioning and thus predictive processing. Next to sleeping disorders and distress, tinnitus is often also accompanied by psychological comorbidities such as depression or anxiety (Langguth, 2011) which are potential confounds of the results. For the work described in this manuscript the replicability of the core finding was of main importance. More studies are needed to relate the prediction patterns in more detail to aspects of tinnitus sensation and distress.”

      Reviewer #2 (Public Review):  

      Summary:  

      This study aimed to test experimentally a theoretical framework that aims to explain the perception of tinnitus, i.e., the perception of a phantom sound in the absence of external stimuli, through differences in auditory predictive coding patterns. To this aim, the researchers compared the neural activity preceding and following the perception of a sound using MEG in two different studies. The sounds could be highly predictable or random, depending on the experimental condition. They revealed that individuals with tinnitus and controls had different anticipatory predictions. This finding is a major step in characterizing the top-down mechanisms underlying sound perception in individuals with tinnitus.

      Strengths:  

      This article uses an elegant, well-constructed paradigm to assess the neural dynamics underlying auditory prediction. The findings presented in the first experiment were partially replicated in the second experiment, which included 80 participants. This large number of participants for an MEG study ensures very good statistical power and a strong level of evidence. The authors used advanced analysis techniques - Multivariate Pattern Analysis (MVPA) and classifier weights projection - to determine the neural patterns underlying the anticipation and perception of a sound for individuals with or without tinnitus. The authors evidenced different auditory prediction patterns associated with tinnitus. Overall, the conclusions of this paper are well supported, and the limitations of the study are clearly addressed and discussed.  

      Weaknesses:  

      Even though the authors took care of matching the participants in age and sex, the control could be more precise. Tinnitus is associated with various comorbidities, such as hearing loss, anxiety, depression, or sleep disorders. The authors assessed individuals' hearing thresholds with a pure tone audiogram, but they did not take into account the high frequencies (6 kHz to 16 kHz) in the patient/control matching. Moreover, other hearing dysfunctions, such as speech-in-noise deficits or hyperacusis, could have been taken into account to reinforce their claim that the observed predictive pattern was not linked to hearing deficits. Mental health and sleep disorders could also have been considered more precisely, as they were accounted for only indirectly with the score of the 10-item mini-TQ questionnaire evaluating tinnitus distress. Lastly, testing the links between the individuals' scores in auditory prediction and tinnitus characteristics, such as pitch, loudness, duration, and occurrence (how often it is perceived during the day), would have been highly informative.

      Thank you very much for your careful and constructive evaluation. We agree with the weaknesses stated in our manuscript and aimed to highlight these aspects more in our analyses and discussion, so future studies can take them into account (see e.g., line 937-949).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      I would strongly recommend the inclusion of data on the used sound intensity levels. It would be very useful to assess whether there are any group differences regarding sound intensity of the stimuli, to exclude any effects of sound intensity on the results.

      We agree with you that - next to experimental aspects like the stimulus frequencies and the number of trials - the sound intensity levels potentially influence the effects as well. Unfortunately, this data was not saved during the experimental procedure and we are not able to include this as a variable in our analyses. As we, however, acknowledge this issue and want to provide guidelines for future research, we added a section to our discussion targeting sound intensity levels. 

      Line 902-913:

      “Thirdly, both studies used individual sound intensity levels to ensure a comfortable listening situation for the participants. These differences in sound intensity levels are, however, a potential confound in the experimental design as well since sound intensity can have an impact on neural responses (Thaerig et al., 2008). Although in this design, we expect the intensity levels balanced equally to the hearing loss of the participants (which did not differ between groups), and basic decoding of sound frequency did not differ in both studies, we are not able to ultimately exclude the sound intensity level as a driver of our effects. Future studies should include a perceived loudness matching for each frequency and should compare the adapted sound intensity values between each group or integrate them into the analysis (e.g., using the logistic regression approach in Fig. 8).”

      Reviewer #2 (Recommendations For The Authors):

      Major comments

      Introduction

      • The authors wrote: "Overall, this situation calls for the pursuit of alternative or complementary models that place less emphasis on the hearing status of the individual." They clearly demonstrated that the altered-gain model focuses on hearing loss and does not overcome the three described limitations. However, they mentioned other models focusing on brain activity outside of the auditory pathway (noise cancellation, map reorganization, specific neural networks). The authors should better explain the novelty of their approach compared to the existing ones.

      Thank you for your input. The inconclusive results and open questions about the altered-gain framework led us to search for a different theoretical foundation for this work. We agree with you that there are other models such as the map reorganization theory or neural network models next to the altered gain model and recent literature showed results supporting these frameworks (see e.g., a review from our group discussing tinnitus research in MEG over the last 10 years, Reisinger et al. (2023)). Nevertheless, as we focus on prediction processes, the Bayesian inference framework in tinnitus (Sedley et al., 2016) fits best for our approach. As we stated in line 113-116 “The Bayesian inference framework could, therefore, explain the experience of tinnitus in lieu of any increase in neural activity in the auditory system, or indicate an additional alteration, on top of hearing loss, for tinnitus to be perceived”, this framework differs from the other models and represents a novel approach in tinnitus research. The novelty in this work is our methodological approach, which allows for explicit analyses of predictive patterns, irrespective of the exact location in the brain. This is a first step towards our actual underlying question whether aberrant auditory prediction patterns act as a neural correlate of tinnitus or rather as a risk factor or disposition. In our opinion, this question is of crucial relevance for understanding tinnitus processes on a neural level and our robust effects highlight the necessity to investigate these predictive processes in a longitudinal manner. We included a paragraph in our manuscript to make this more apparent for the reader.

      Line 128-137:

      “We utilized a powerful, recently established experimental approach (Demarchi et al., 2019) showing anticipatory activations of tonotopically specific auditory templates for regular tone sequences. This method allows us to explicitly investigate predictive patterns in line with the Bayesian inference framework (Sedley et al., 2016), leading towards the overall question whether alterations in predictive coding can be interpreted as a neural correlate of tinnitus or rather as a risk factor. Since this question can solely be targeted in a longitudinal manner, we aimed in a first step to investigate prediction patterns in tinnitus over two independent samples, deriving robust effects that should be considered in future research.”

      • "This conceptual model bridges several explanatory gaps: for example, the inconsistent findings in humans regarding the "altered gain" view which states enhanced neural activity in the auditory pathway". What are "the inconsistent findings in humans regarding the 'altered gain'"? It would be helpful if the authors were more explicit about their idea here and added reference(s) to support it.

      Thank you for pointing that out. We agree with you that this section lacks clarity and we aimed to be more precise. 

      Line 108-116:

      “This conceptual model bridges several explanatory gaps: for example, the inconsistent findings in humans regarding the “altered gain” view which states altered neural activity in the auditory pathway. Recent findings vary in both the targeted frequency bands and the direction of the reported power changes which impede consistent conclusions (Eggermont and Roberts, 2015; Elgoyhen et al., 2015; Reisinger et al., 2023). The Bayesian inference framework could, therefore, explain the experience of tinnitus in lieu of any increase in neural activity in the auditory system, or indicate an additional alteration, on top of hearing loss, for tinnitus to be perceived.”

      • I suggest moving this part to the discussion:

      "However, alternative explanations cannot be excluded with certainty, such as tinnitus being the cause of altered prediction tendencies or that there is a third variable being responsible for predictions and tinnitus development. Furthermore, even if altered predictive tendencies were to be found, there could be various possibilities of exactly how they could be altered to contribute to the onset or persistence of tinnitus. Some further clarity might then be gained through longitudinal studies in humans or animals."

      Thank you for your suggestion, we moved this part to the corresponding section in the discussion.

      Line 742-756:

      “Distinct predictive processing patterns could e.g., either develop within an individual in contributing to chronification of tinnitus (e.g., shift of “default prediction” from silence to sound; Sedley, 2019). Alternatively, they could be conceived as sensory processing style, making certain individuals more vulnerable to develop tinnitus under certain conditions (e.g., hearing loss, aging), a notion reminiscent of the “strong prior” hypothesis of hallucinations (Corlett et al., 2019). Hence, the direction of the effect remains unclear and alternative explanations, such as a third variable being responsible for predictions and tinnitus development, cannot be excluded with certainty. Furthermore, even if altered predictive tendencies were to be found, there could be various possibilities of exactly how they could be altered to contribute to the onset or persistence of tinnitus. In any case, any more conclusive claims would require longitudinal data, ideally with a tinnitus-free baseline. As such research is challenging to implement, especially in humans, we first focused in this work on finding cross-sectional group differences between individuals with and without tinnitus.”

      Methods

      Participants

      • "We calculated the individual mean hearing ability based on the values for 500, 1000, 2000, and 4000 Hz, which is a common approach for averaging results of pure-tone audiometry". Even if this method has been used multiple times in the literature, I would not recommend it as it can hide differences. Hearing loss is usually larger at high frequencies (starting at 6 000 Hz). An average threshold calculated with those central frequencies is more relevant for clinical use than in research. I strongly recommend performing a linear model with the factors Frequency (including all tested frequencies), Group, Ear side, and their interactions to precisely test the group differences in hearing thresholds.

      Thank you for pointing that out. We agree with you that higher frequencies are of potential interest as well when analyzing hearing loss. We included your suggested linear model in our methods section and the results were in line with our assumption that the groups did not differ substantially. Additionally, we included another logistic regression model in our exploratory analyses when investigating the influence of hearing loss on the prediction scores. Once more, the addition of higher frequencies did not substantially influence the effects.

      Line 194-203:

      “We calculated the individual mean hearing ability based on the values for 500, 1000, 2000, and 4000 Hz, which is a common approach for averaging results of pure-tone audiometry (i.e., PTA-4, see for example Lin et al. (2011); Ozdek et al. (2010)). Using independent t-tests, we found no differences in hearing status over frequencies between groups for the left (t=-1.19, p=.238) and right ear (t=-1.72, p=.09). An additional linear regression including all frequencies from 125 Hz to 8000 Hz also showed that hearing thresholds did not differ between ears (b=0.311, SE=1.600, p=.846) and groups (b=1.702, SE=1.553, p=.273), but solely between frequencies (b=0.003, SE=0.000, p<.001). Interactions were not significant as well.”
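
      As a minimal sketch of the PTA-4 average described in the quoted passage, the following computes the mean threshold over 500, 1000, 2000, and 4000 Hz for one ear. The function name and the example audiogram are hypothetical and not taken from the study; the point is only to show that thresholds outside the four frequencies (for instance at 8000 Hz) do not enter the score, which is the reviewer's concern.

```python
from statistics import mean

# The four frequencies named in the quoted methods text (PTA-4).
PTA4_FREQUENCIES_HZ = (500, 1000, 2000, 4000)


def pta4(audiogram_db_hl):
    """Mean hearing threshold (dB HL) over the four PTA-4 frequencies.

    `audiogram_db_hl` maps frequency in Hz to threshold in dB HL for one ear.
    """
    return mean(audiogram_db_hl[f] for f in PTA4_FREQUENCIES_HZ)


# Hypothetical audiogram for one ear (illustrative values only).
example_ear = {125: 10, 250: 10, 500: 15, 1000: 20, 2000: 25, 4000: 40, 8000: 55}
example_pta4 = pta4(example_ear)  # (15 + 20 + 25 + 40) / 4

# Same ear with a much worse 8000 Hz threshold: PTA-4 is unchanged.
worse_high_freq = dict(example_ear)
worse_high_freq[8000] = 90
unchanged_pta4 = pta4(worse_high_freq)
```

      Under these assumed values both calls return the same average, which is why the additional regression over all frequencies from 125 Hz to 8000 Hz is a useful complement.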

      Line 712-725:

      “As these logistic regression models were computed using an average hearing score computed over the frequencies 500, 1000, 2000, and 4000 Hz (i.e., PTA-4, see for example Lin et al. (2011); Ozdek et al. (2010)), we questioned whether hearing loss in higher frequencies influenced our effects. We therefore computed an additional logistic regression including also the PTA values of 6000 and 8000 Hz. In this analysis, hearing loss was not a significant predictor of tinnitus but rather showed a trend with b=0.211, SE=0.111, p=.062. Prediction scores, however, remained a significant predictor of tinnitus even after including high-frequency hearing loss (b=0.232, SE=0.111, p=.040). In this analysis, odds ratios indicated an increase of 26% in the odds of having tinnitus with a one standard deviation increase in the prediction score. Overall, this analysis strongly supports the notion that the main effect genuinely reflects a process related to the experience or statistical risk of experiencing tinnitus.”
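
      The reported 26% figure follows from the logistic regression coefficient by exponentiation. The short sketch below reproduces that arithmetic from the coefficient quoted above; it assumes the prediction score was standardized, so that a one-unit change in the predictor corresponds to one standard deviation. The variable and function names are illustrative.

```python
import math


def odds_ratio(b):
    """Odds ratio implied by a logistic regression coefficient b."""
    return math.exp(b)


# Coefficient for the prediction score, as quoted in the response.
b_prediction = 0.232

or_prediction = odds_ratio(b_prediction)
# Percentage change in the odds per one-unit increase of the predictor
# (assumed here to be one standard deviation of the prediction score).
percent_increase = (or_prediction - 1) * 100
```

      The odds ratio is about 1.26, which matches the 26% increase in the odds of having tinnitus stated in the quoted text.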

      Stimuli and experimental procedure

      • Can you explain the use of movies during sound listening? And not an active listening task with oddball events, for example, to ensure that the subject attention is directed to the sounds?

      Thank you for your comment. We agree with you that attention is a relevant factor and with our design we cannot exclude potential attention effects on our findings. We chose this paradigm since previous research in our group including this exact experimental design (Demarchi et al., 2019) impressively demonstrated the formation of feature-specific auditory predictions in the brain and we aimed to investigate to what extent this can be detected in the tinnitus brain.

      We acknowledged this issue in our discussion (see line 916-919): “In the current work, we used passive listening tasks including a movie to reduce attentional focus on the presented stimuli. Therefore, we cannot draw conclusions whether differences in attention had an influence on the effects. Future studies should include more manipulations of attention to investigate its relevance”. 

      Results

      Pre-stimulus effects are not related to hearing loss and tinnitus-related features

      • How was the hearing loss calculated for this analysis? I recommend a PCA on the hearing levels, to get individual scores with a data-driven approach. Usually, the first dimension will be an average of all the frequencies. The second should be a difference between low and high frequencies. The same comment applies to study 2.

      Thank you for pointing that out. In the first study, participant groups were not controlled for hearing loss and pure-tone audiograms were solely averaged over all frequencies and both ears. As we pointed out throughout the manuscript, insufficient control for hearing loss was the key issue in study 1 which led to the implementation of study 2. Further, we do not have data about the hearing status of every participant in study 1 and we therefore do not believe that a more complex approach for calculating hearing loss will increase interpretability in study 1. Nevertheless, we agree with you that it is not apparent how hearing loss was calculated in study 1. The results of the pure-tone audiometry were averaged over all frequencies and both ears, but no cut-off values were defined to characterize hearing loss. We therefore highly appreciate your detailed revision of our manuscript and adjusted the phrasing in the corresponding section. With our approach, it is not justifiable to talk about hearing loss but rather hearing thresholds. As for study 2, the methodological approach was reviewed and accepted as a Registered Report and we therefore do not want to deviate drastically from our pre-registered approach.

      Line 162-165:

      “Standardized pure-tone audiometric testing for frequencies from 125Hz to 8kHz was performed in 31 out of 34 tinnitus participants using Interacoustic AS608 audiometer. Averages were computed over all frequencies and both ears.”

      Line 356-362:

      “In the whole sample of participants with tinnitus (n=34) we performed a Spearman correlation of the β-coefficient values corresponding to the time-point of the maximum and the minimum t-value in intergroup analysis (comprised of positive and negative significant clusters emerging in group comparison for sound trials) with hearing thresholds (averaged audiogram for both ears), tinnitus loudness (10-point scale) and tinnitus distress scores (TQ).”
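
      For readers unfamiliar with the rank-based statistic named in this passage, here is a small self-contained sketch of a Spearman correlation: both variables are converted to ranks (ties receive the average rank) and a Pearson correlation is computed on the ranks. The helper names and the toy inputs are hypothetical; this is not the authors' analysis code, and it omits the significance test.

```python
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy data (illustrative only): a monotone but nonlinear relationship.
rho_monotone = spearman_rho([1, 2, 3, 4], [1, 4, 9, 100])
rho_reversed = spearman_rho([1, 2, 3, 4], [40, 30, 20, 10])
```

      Because only the ordering matters, the statistic is insensitive to the scale of the measures being related, such as a 10-point loudness rating and averaged hearing thresholds.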

      Line 463-464:

      See as well Line 471-481.

      Line 491-495:

      “Our main findings are: 1) basic processing of carrier frequencies is not altered in tinnitus; 2) with increasing regularity of the sequence, individuals with tinnitus show relatively enhanced predictions of frequency information; 3) the effect is not related to hearing thresholds and tinnitus distress or loudness in this sample.”

      • In the methods, the authors indicated that the volume was adjusted individually at a pleasant volume. Can authors test if the volume was related to the individual's accuracy? Did they test that all frequencies were audible for all participants?

      Thank you for your feedback. We agree with you that it would be interesting to see whether sound intensity levels were related to the accuracy. Unfortunately, data regarding the volume was not saved during the experimental procedure and we are not able to include this as a variable in our analyses. We acknowledge this issue and added a section to our discussion targeting sound intensity levels. As for the second question, the individual volume adjustment was also meant to guarantee that all frequencies were audible for the participant. We clarified this in the methods section. Overall, it is important to mention that we did not find any differences between groups in the decoding of random tones (see Fig. 2 and Fig. 6C), indicating that the volume did not substantially have an influence on one group compared to the other.

      Line 232-234:

      “Sound intensity was individually determined by presenting a short audio sequence to the participants and adjusting the loudness according to an individual pleasant volume with all four frequencies audible for the participant.”

      Line 902-913:

      “Thirdly, both studies used individual sound intensity levels to ensure a comfortable listening situation for the participants. These differences in sound intensity levels are, however, a potential confound in the experimental design as well since sound intensity can have an impact on neural responses (Thaerig et al., 2008). Although in this design, we expect the intensity levels balanced equally to the hearing loss of the participants (which did not differ between groups), and basic decoding of sound frequency did not differ in both studies, we are not able to ultimately exclude the sound intensity level as a driver of our effects. Future studies should include a perceived loudness matching for each frequency and should compare the adapted sound intensity values between each group or integrate them into the analysis (e.g., using the logistic regression approach in Fig. 8).”

      Pre-stimulus differences in ordered and random tone sequences are not related to tinnitus distress

      • Accuracy was not correlated with tinnitus distress. Could the authors test if the accuracy was related to other clinical data, such as tinnitus pitch, duration, and loudness? And at the subscales of the mini-TQ?

      We appreciate your constructive feedback and agree with you that other tinnitus features such as pitch, duration, or loudness are also interesting in this regard. Unfortunately, these features were not assessed in study 2 and we are therefore not able to provide this information. Additionally, we solely used a short version of the Mini-TQ in this study and did not assess all subscales but rather used all available items for calculating tinnitus distress. This is a limitation of our study design and we included it in the discussion.

      Line 937-949:

      “In both studies, tinnitus distress was not correlated with the reported prediction effects. Nevertheless, tinnitus can also be characterized by other features such as its loudness, pitch or duration which were not included in the experimental assessment. Additionally, we solely used a short version of the Mini-TQ (Goebel and Hiller, 1992) in Study 2, which did not allow us to relate prediction scores to subscales like sleep disturbances which potentially influence cognitive functioning and thus predictive processing. [...] More studies are needed to relate the prediction patterns in more detail to aspects of tinnitus sensation and distress.”

      The strength of group effects differs between the two studies

      • This section should be in the discussion, not the results

      Thank you for your valuable input. In this section, we show comparisons between the two studies and report Bayes factors over time for the differences in decoding accuracy (see Figure 7A). We introduce novel results and believe therefore that this section should remain in the results and is discussed later in the manuscript.  

      Discussion

      • Globally, the discussion is very long and a bit speculative. I recommend the authors shorten the discussion (especially the speculations), and delete the repetition.

      Thank you very much for your constructive feedback. We aimed to shorten our discussion and delete repetitions to increase clarity and readability.

      • The effect of hearing loss has been tested in this study, evaluated as the mean hearing threshold of 4 central frequencies. However, hearing abilities cannot be limited to a central audiogram. High frequencies, speech-in-noise abilities, or other hidden hearing loss can be impacted, even for individuals without hearing loss on 500Hz- 4000Hz. The conclusion on the prediction effect being independent of hearing loss should include this limitation.

      Thank you for pointing that out. We added this limitation to the discussion.

      Line 781-794:

      “In a complementary analysis, we used our prediction score in addition to hearing loss magnitudes as predictors of tinnitus in a logistic regression. Prediction related pre-activation levels were informative whether participants perceived tinnitus, also when statistically controlling for hearing loss. However, it has to be mentioned that we calculated hearing loss based on the PTA results of the frequencies between 500 and 4000 Hz. This does not reflect hearing impairments like high frequency hearing loss or hidden hearing loss (i.e., hearing difficulties despite a normal audiogram, Liberman (2015)). As for hidden hearing loss, we were not able to draw conclusions regarding our effects since this concept of hearing damage is difficult to measure objectively, especially in humans. However, we included an additional logistic regression expanding the frequency range up to 8000 Hz and again, hearing loss did not substantially impact the prediction score as an informative tinnitus predictor.”

      Line 712-723:

      “As these logistic regression models were computed using an average hearing score computed over the frequencies 500, 1000, 2000, and 4000 Hz (i.e., PTA-4, see for example Lin et al. (2011); Ozdek et al. (2010)), we questioned whether hearing loss in higher frequencies influenced our effects. We therefore computed an additional logistic regression including also the PTA values of 6000 and 8000 Hz. In this analysis, hearing loss was not a significant predictor of tinnitus but rather showed a trend with b=0.211, SE=0.111, p=.062. Prediction scores, however, remained a significant predictor of tinnitus even after including high-frequency hearing loss (b=0.232, SE=0.111, p=.040). In this analysis, odds ratios indicated an increase of 26% in the odds of having tinnitus with a one standard deviation increase in the prediction score.”

      • "An increased focus on hippocampal regions, e.g., in fMRI, patient, or animal studies, could be a worthwhile complement to our MEG work, given the outstanding relevance of medial temporal areas in the formation of associations in statistical learning paradigms (see e.g., Covington et al., (2018); Schapiro et al., (2016)).".

      In the opinion of this reviewer, this claim is not well introduced and should be removed.

      Thank you for pointing that out. In our opinion, an increased focus on hippocampal regions is an important consideration for future research and we decided to keep this part in the manuscript. However, we added a third reference highlighting the relevance of temporal areas in tinnitus to strengthen our claim. 

      Line 866-868:

      “... given the outstanding relevance of medial temporal areas in the formation of associations in statistical learning paradigms (see e.g., Covington et al., (2018); Paquette et al., (2017); Schapiro et al., (2016)).”

      References:

      Paquette, S., Fournier, P., Dupont, S., de Edelenyi, F. S., Galan, P., & Samson, S. (2017). Risk of tinnitus after medial temporal lobe surgery. JAMA neurology, 74(11), 1376-1377. https://doi.org/10.1001/jamaneurol.2017.2718.

      • "Overall, our work clearly underlines the true presence of differences, in terms of predictive processing, between individuals with and without tinnitus. At the same time, distinct design choices impact the strength of the effects which is not only apparent in the present work but was also reported recently by Yukhnovich and colleagues (2024). Further to controlling for basic variables (age, sex, hearing loss), future studies using our paradigm and analysis approach should opt for a broad frequency spacing (>2 octaves) and ideally more than 2000 trials per carrier frequency in the random sequence. These recommendations are likely even more important for efforts of testing this paradigm using EEG, which normally comes with inferior data quality as compared to MEG."

      This reviewer considers that the entire paragraph should be deleted, as the effects are already covered in the previous paragraph.

      Thank you very much for your feedback. However, we believe that this paragraph acts as a brief and accurate summary of our guidelines to improve future research in this field. This section therefore remained in the manuscript.

      Minor comments

      Introduction

      • "The onsets of tinnitus and hearing loss often do not occur at the same time ". This sentence should have a reference.

      We appreciate your careful evaluation of our manuscript and included a reference to the sentence pointing out hearing loss as a precursor of tinnitus.

      Line 95f.:

      “2) The onsets of tinnitus and hearing loss often do not occur at the same time (Roberts et al., 2010).” 

      Methods

      Participants

      • Participants' laterality needs to be mentioned.

      Thank you for your input. We agree with you that laterality is an interesting aspect that should be taken into account. Unfortunately, however, we did not assess this in the current design. We mentioned the lack of this information in the methods section.

      Line 158:

      “Laterality of the participants was not assessed.”

      Line 176-177:

      “No participants with psychiatric or neurological diseases were included in the sample. Laterality of the participants was not assessed.”

      "Four individuals with tinnitus did not show any audiometric abnormality; four of the participants showed unilateral hearing impairments; 26 volunteers had high-frequency hearing loss; and six individuals were hearing impaired over most frequencies (i.e. hearing thresholds higher than 30 dB)."

      This part is not precise enough. "Unilateral hearing impairment": is it on one or multiple frequencies? "26 volunteers had high-frequency hearing loss". What is considered as high-frequency here? The precision "(i.e. hearing thresholds higher than 30 dB)" can be dropped as it was defined in the sentence just before.

      We appreciate your constructive feedback and added information to clarify the audiometric characteristics of our participants.

      Line 186-190:

      “Four individuals with tinnitus did not show any audiometric abnormality; four of the participants showed unilateral hearing impairments on at least one frequency; 26 volunteers had high-frequency hearing loss (i.e. hearing thresholds higher than 30 dB); and six individuals were hearing impaired over most frequencies (i.e. hearing thresholds higher than 30 dB).”

      Results

      • Figure 3C: are those group differences significant? It should be noted on the graphs.

      • Figure 6D: I would suggest to remove this figure, as the correlation is not significant.

      • Figure 7A: It would be useful to specify the number of trials for each study, in parentheses.

      • Figure 8 is unnecessary.

      Thank you for your careful assessment of our figures. We agree with you that significance should be indicated in Figure 3C and that the precise number of trials is relevant information in Figure 7A. We corrected the figures accordingly. However, the Figures 6D and 8 remained in the manuscript since they were already part of our Registered Report and we do not want to remove graphical information that was reviewed and accepted already.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kimura et al performed a saturation mutagenesis study of CDKN2A to assess the functionality of all possible missense variants and compare them to previously identified pathogenic variants. They also compared their assay result with those from in silico predictors.

      Strengths:

      CDKN2A is an important gene that modulates cell cycle and apoptosis, therefore it is critical to accurately assess the functionality of missense variants. Overall, the paper reads well and touches upon major discoveries in a logical manner.

      Weaknesses:

      The paper lacks proper details for experiments and basic data, leaving the results less convincing. Analyses are superficial and do not provide variant-level resolution.

      We thank the reviewer for their comments. We have updated the manuscript to include additional detail of experimental methods and variant level resolution of data and analyses. We have also conducted additional analyses to compare variant classifications using a gamma generalized linear model and log2 normalized fold change, establish the effect of low variant coverage on variant functional classifications, determine the performance of combining multiple in silico predictions, and determine the prevalence of functionally deleterious variants in gnomAD and functionally deleterious variants of uncertain significance in ClinVar compared to all CDKN2A missense variants.

      Reviewer #2 (Public Review):

      This study describes a deep mutational scan across CDKN2A using suppression of cell proliferation in pancreatic adenocarcinoma cells as a readout for CDKN2A function. The results are also compared to in silico variant predictors utilized by current diagnostic frameworks to gauge these predictors' performance. The authors also functionally classify CDKN2A somatic mutations in cancers across different tissues.

      This study is a potentially important contribution to the field of cancer variant interpretation for CDKN2A, but is almost impossible to review because of the severe lack of details regarding the methods and incompleteness of the data provided with the paper. We do believe that the cell proliferation suppression assay is robust and works, but when it comes to the screening of the library of CDKN2A variants the lack of primary data and experimental detail prevents assessment of the scientific merit and experimental rigor.

      We are grateful for the opportunity to clarify our experimental methods and to provide additional data in the revised manuscript. The manuscript has been updated to include, among other changes, additional information on assay design, analysis of variant representation in the library, inclusion of primary data with variant level resolution, and a comparison of variant classifications using a gamma generalized linear model and log2 normalized fold change.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major issues:

      (1) Can the pathogenicity values of individual amino acid changes be opened to the public? It would serve as a valuable asset to the community.

      Thank you for your suggestion. We are happy to provide this information. Individual variant data and functional classifications from the functional assay are given in Appendix 1-table 4.

      (2) In the method section, it is not clear (at least to the reviewer) whether the protocol describing the construction of the CDKN2A missense library was provided.

      Thank you for your comment. We have included additional information in the manuscript describing construction of the CDKN2A missense library.

      “CDKN2A expression plasmid libraries

      Codon-optimized CDKN2A cDNA using the p16INK4A amino acid sequence (NP_000068.1) was designed (Appendix 1-table 12), and pLJM1 containing codon-optimized CDKN2A (pLJM1-CDKN2A) was generated by Twist Bioscience (South San Francisco, CA). 156 plasmid libraries were then synthesized using pLJM1-CDKN2A, such that each library contained all 20 possible amino acid variants (19 missense and 1 synonymous) at a given position, generating 500 ng of each plasmid library (Twist Bioscience, South San Francisco, CA). The proportion of each variant in each library is shown in Appendix 1-table 2. Variants with a representation of less than 1% in a plasmid library were individually generated using the Q5 Site-Directed Mutagenesis kit (New England Biolabs, Ipswich, MA; catalog no. E0552) and added to each library to a calculated proportion of 5%. Primers used for site-directed mutagenesis are given in Appendix 1-table 13. Each library was then amplified to generate at least 5 µg of plasmid DNA using the QIAGEN Plasmid Midi Kit (QIAGEN, Germantown, MD; catalog no. 12143).”

      (3) The paper lacks basic experimental results. The results cover almost all possible missense variants, but it would be clearer if actual coverage values used for calculating relative enrichment were shown. Are all variants well covered? Isn't there any spurious signal due to low coverage? How many times were the experiments performed? Also, how many cells were used, what was the expected MOI, and what proportion of harvested cells is thought to have a single variant? How can you distinguish the effect of a single variant from a multiple variants effect?

      We thank the reviewer for their comment. We have provided additional information in the manuscript to address these issues. Briefly, in response to each issue:

      (1) We have provided read count data for all variants, used to determine functional classifications based on either gamma generalized linear model or normalized fold change, in Appendix 1-table 4.

      (2) To assess if low variant coverage resulted in spurious signals, we compared prevalence of functionally deleterious classifications among variants binned by coverage in the Day 9 cell pool. We did not identify any statistically significant differences based on variant coverage.

      “We also determined whether underrepresentation in the cell pool at Day 9 affected variant functional classifications. Fifty-three of 2,964 missense variants (1.8%) were present in the cell pool at Day 9 of the first assay replicate (experiment 1) at < 2%, as determined by the number of sequence reads supporting the variant (Figure 2 -figure supplement 4A, Appendix 1-table 4). There was no statistically significant difference in the proportion of variants classified as functionally deleterious for variants present in less than 2% of the cell pool at Day 9 (12 of 53 variants; 22.6%), and variants present in more than 2% of the cell pool (496 of 2,911 variants; 17.0%) (P value = 0.28) (Figure 2 -figure supplement 4B). We also found no significant differences in the proportion of variants classified as functionally deleterious for variants present in more than 2% of the cell pool at Day 9 when variants were binned in 1% intervals (Figure 2 -figure supplement 4B).”

      (3) The assay was repeated in duplicate for 28 CDKN2A residues. For the remaining 128 residues of CDKN2A, the assay was completed once. We found good agreement between variant classifications in assay repeats. We have added to the text as follows:

      “To confirm the reproducibility of our variant classifications, 28 amino acid residues were assayed in duplicate, and variants classified using the gamma GLM. The majority of missense variants, 452 of 560 (80.7%), had the same functional classification in each of the two replicates (Figure 2 -figure supplement 3A and B, Appendix 1-table 4).”

      We have also added discussion of this study limitation to the manuscript:

      “We repeated our functional assay twice for 28 CDKN2A residues. For the remaining 128 residues of CDKN2A, the functional assay was completed once. While we found general agreement between functional classifications from each replicate for the 28 residues assayed in duplicate, additional repeats for each residue are necessary to determine variability in variant functional classifications.”

      (4) We have added additional information about the number of cells used for transduction and MOI to the method section:

      “Lentiviral transduction

      PANC-1 cells were used for CDKN2A plasmid library and single variant CDKN2A expression plasmid transductions. PANC-1 cells previously transduced with pLJM1-CDKN2A (PANC-1CDKN2A) and selected with puromycin were used for CellTag library transductions. Briefly, 1 × 10^5 cells were cultured in media supplemented with 10 µg/ml polybrene and transduced with 4 × 10^7 transducing units per mL of lentivirus particles. Cells were then centrifuged at 1,200 x g for 1 hour. After 48 hours of culture at 37 °C and 5% CO2, transduced cells were selected using 3 µg/ml puromycin (CDKN2A plasmid libraries and single variant CDKN2A expression plasmids) or 5 µg/ml blasticidin (CellTag plasmid library) for 7 days. Expected MOI was one. After selection, cells were trypsinized and 5 × 10^5 cells were seeded into T150 flasks. DNA was collected from the remaining cells, and this sample was designated Day 9. T150 flasks were cultured until confluent and then DNA was collected. The time for cells to become confluent varied for each amino acid residue (Day 16 – 40, Appendix 1-table 5).”

      (5) Our assay was not designed to distinguish multiple variant effects. However, we do not anticipate multiple transductions to significantly impact variant classifications in our assay. We found that our functional classifications were consistent with previously reported classifications:

      “In general, our results were consistent with previously reported classifications. Of variants identified in patients with cancer and previously reported to be functionally deleterious in published literature and/or reported in ClinVar as pathogenic or likely pathogenic (benchmark pathogenic variants), 27 of 32 (84.4%) were functionally deleterious in our assay (Figure 2B, Figure 2 -figure supplement 1B and 1C, Appendix 1-table 4) (Chaffee et al., 2018; Chang et al., 2016; Horn et al., 2021; Hu et al., 2018; Kimura et al., 2022; McWilliams et al., 2018; Roberts et al., 2016; Zhen et al., 2015). Five benchmark pathogenic variants were characterized as being of indeterminate function, with log2 P values from -19.3 to -33.2. Of 156 synonymous variants and six missense variants previously reported to be functionally neutral in published literature and/or reported in ClinVar as benign or likely benign (benchmark benign variants), all were characterized as functionally neutral in our assay (Figure 2B, Figure 2 -figure supplement 1B and 1C, Appendix 1-table 4) (Kimura et al., 2022; McWilliams et al., 2018; Roberts et al., 2016). Of 31 VUSs previously reported to be functionally deleterious, 28 (90.3%) were functionally deleterious and 3 (9.7%) were of indeterminate function in our assay. Similarly, of 18 VUSs previously reported to be functionally neutral, 16 (88.9%) were functionally neutral and 2 (11.1%) were of indeterminate function in our assay (Figure 2B, Figure 2 -figure supplement 1B and 1C, Appendix 1-table 4).”

      (4) Comparison of functional classifications (shown in Figure 3) from this study and other in silico tools is superficial. The analysis is based on the presumption that their result is gold-standard, thereby calculating the sensitivity, accuracy, and PPV of individual predictors. But apparently, this won't be true, so it would be more reasonable to check the "correlation" of the study results and other predictors: e.g. which variants show consistent results between this study and other predictors? Are there any indicators of consistent vs inconsistent results? How does the consistency change by protein sequences or domains? Etc

      Thank you for your comment. We have added additional analysis to our manuscript comparing our functional classifications with in silico variant effect predictions. Specifically, we have included analysis combining multiple predictors:

      “We also tested the effect of combining multiple in silico predictors. 904 missense variants had in silico predictions from all 7 algorithms. The remaining 2,060 missense variants had in silico predictions from 5 algorithms. Of variants with in silico predictions from all 7 algorithms, 378 (41.8%) had predictions of deleterious or pathogenic effect from a majority of algorithms (≥ 4), and of these, 137 (36.2%) were functionally deleterious in our assay. Similarly, of 2,060 missense variants that had in silico predictions from 5 algorithms, 1,107 (53.7%) had predictions of deleterious or pathogenic effect from a majority of algorithms (≥ 3), of which 361 (32.6%) were functionally deleterious in our assay (Appendix 1-table 7).”
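
      The majority rule quoted above (≥ 4 of 7 algorithms, or ≥ 3 of 5) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names are hypothetical, and each predictor call is assumed to have already been reduced to a boolean (True meaning deleterious or pathogenic). The counts in the usage lines are the ones reported in the quoted text.

```python
from typing import Sequence


def majority_deleterious(calls: Sequence[bool]) -> bool:
    """Return True when a strict majority of the available predictors
    call the variant deleterious (7 predictors -> 4, 5 predictors -> 3)."""
    threshold = len(calls) // 2 + 1
    return sum(calls) >= threshold


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


# Hypothetical variant with 7 predictions, 4 of them deleterious.
print(majority_deleterious([True, True, True, True, False, False, False]))

# Share of majority-deleterious variants that were functionally deleterious,
# using the counts quoted above.
print(percent(137, 378), percent(361, 1107))
```

      Deriving the threshold from the number of available predictions lets the same rule cover both the 7-algorithm and the 5-algorithm subsets.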

      (5) Similarly, Figure 4 does not deliver much information, either. Rather than delivering a simple summary, it would be more informative if deeper analyses were conducted. e.g., do pathogenic variants show higher frequency among patients, or higher variant frequency in tumors (if data were available).

      We have included additional analysis of somatic alterations in the manuscript. We found pathogenic/likely pathogenic somatic mutations were enriched in patients. This was also the case for somatic mutations that were classified as functionally deleterious in our assay. We also found statistically significant depletion of functionally deleterious mutations in colorectal adenocarcinoma. Interestingly, no patients with a somatic mutation in a mismatch repair gene had a functionally deleterious CDKN2A missense somatic mutation. However, this observation was not statistically significant. Future studies will determine whether CDKN2A and MMR gene somatic mutations are mutually exclusive in colorectal adenocarcinoma.

      “We found that 34.2% - 53.4% of unique missense somatic mutations were classified as functionally deleterious, with 61.4% - 67.6% of patients having a functionally deleterious somatic mutation (Figure 4A, Appendix 1-table 9). As with functionally deleterious variants, functionally deleterious missense somatic mutations were also not distributed evenly across CDKN2A, being enriched within ankyrin repeat 3 (Figure 4B, Appendix 1-table 9). We found that 32.4% - 50.0% of all functionally deleterious missense somatic mutations occurred within ankyrin repeat 3, with 48.0% - 58.0% of patients in each cohort having a functionally deleterious missense somatic mutation in this domain. Notably, 65.7% - 76.0% of functionally deleterious missense somatic mutations in this domain were in residues 80-89 (Appendix 1-table 9).”

      “We were also able to determine the functional classification of CDKN2A missense somatic mutations in COSMIC, TCGA, JHU, and MSK-IMPACT by cancer type. We found that 22.2% - 100% of CDKN2A missense somatic mutations were functionally deleterious depending on cancer type (Figure 4-figure supplement 2A-D). When considering missense somatic mutations reported in any database, there was a statistically significant depletion of functionally deleterious mutations in colorectal adenocarcinoma (20.4%; adjusted P value = 5.4 × 10^-9) (Figure 4C). As the proportion of missense somatic mutations that were functionally deleterious was lower in colorectal carcinoma than in other types of cancer, we assessed whether somatic mutations in mismatch repair genes (MLH1, MLH3, MSH2, MSH6, PMS1, and PMS2) were associated with the functional status of CDKN2A missense somatic mutations. Thirty-five patients in COSMIC had a CDKN2A missense somatic mutation, of which 12 (34.3%) had a somatic mutation in a mismatch repair gene. We found that no patients with a somatic mutation in a mismatch repair gene had a functionally deleterious CDKN2A missense somatic mutation compared to 6 of 23 samples (26.1%) without a somatic mutation in a mismatch repair gene (P value = 0.062).”

      (6) It would be helpful to validate the neutral variants set. Are variants of UK biobank or gnomAD enriched on neutral population? Are synonymous variants exclusively found in neutral populations?

      Thank you for the suggestion. All synonymous variants were found to be functionally neutral in our assay. We also assessed VUSs from gnomAD and found a lower prevalence of functionally deleterious variants compared to all CDKN2A variants and CDKN2A missense somatic mutations:

      “The Genome Aggregation Database (gnomAD) v4.1.0 reports 287 missense variants in CDKN2A, including 13 pathogenic, 4 likely pathogenic, 3 likely benign, 3 benign, and 264 VUSs classified using ACMG variant interpretation guidelines (Figure 5A, Figure 5B, and Appendix 1-table 10). Of the 264 missense VUSs, 177 (67.0%) were functionally neutral, 56 (21.2%) were of indeterminate function, and 31 (11.7%) were functionally deleterious in our assay using the gamma GLM for classification (Figure 5C).”

      (7) They used a pancreatic cancer cell line and assayed for cell proliferation. The limitations of this method and the possibility of complementing the limitations should be discussed.

      Thank you for the suggestion. We have added discussion of this limitation to our manuscript:

      “We characterized variants based upon a broad cellular phenotype, cell proliferation, in a single PDAC cell line. It is possible that CDKN2A variant functional classifications are cell-specific and assay-specific. Our assay may not encompass all cellular functions of CDKN2A and an alternative assay of a specific CDKN2A function, such as CDK4 binding, may result in different variant functional classifications. Furthermore, CDKN2A variants may have different effects if alternative cell lines are used for the functional assay. However, cell-specific effects appear to be limited. In our previous study, we characterized 29 CDKN2A VUSs in three PDAC cell lines, using cell proliferation and cell cycle assays, and found agreement between all functional classifications (Kimura et al., 2022).”

      Minor issues:

      (1) Figures 2B, C: it would be more intuitive to plot significance by logging p-values than raw p-values.

      We used log2 P value (or log2 normalized fold change) for figures in the manuscript as appropriate.

      (2) Figure 2D: annotate protein domain information at the side. Supplementary Figure 2 shows the domains but it would be more informative to show it in Figure 2D heatmap.

      Thank you for the suggestion, we have annotated protein domain information on the left side of the heatmap in (the now) Figure 2C.

      Reviewer #2 (Recommendations For The Authors):

      Major Concerns:

      (1) How many replicates of the screen were performed? It seems like only one library infection/ proliferation assay was done. If so this is insufficient to obtain any idea of the uncertainty of measurement for each variant.

      The assay was repeated in duplicate for 28 CDKN2A residues. For the remaining 128 residues of CDKN2A, the assay was completed once. We found good agreement between variant classifications in assay repeats. We have added to the text as follows:

      “To confirm the reproducibility of our variant classifications, 28 amino acid residues were assayed in duplicate, and variants classified using the gamma GLM. The majority of missense variants, 452 of 560 (80.7%), had the same functional classification in each of the two replicates (Figure 2 -figure supplement 3A and B, Appendix 1-table 4).”

      We have also added discussion of this study limitation to the manuscript:

      “We repeated our functional assay twice for 28 CDKN2A residues. For the remaining 128 residues of CDKN2A, the functional assay was completed once. While we found general agreement between functional classifications from each replicate for the 28 residues assayed in duplicate, additional repeats for each residue are necessary to determine variability in variant functional classifications.”

      (2) The count data from the experiment and NGS pipeline to call variants need to be provided for each replication (i.e. the counts that were fed into the gamma model)

      Accompanying this should be information about the depth of sequencing of the cells, the number of cells infected with the library, and standard metrics for pooled screens.

      Quality metrics regarding the representation and completeness of the TWIST library need to be provided. See Brenan et al. Cell Reports (2016) Supplemental Figure 1

      Thank you for your suggestion. We are happy to provide this additional information. Sequence read counts for each variant are given in Appendix 1-table 4. We have provided additional detail on the functional assay in the methods section, including the number of cells infected with each library:

      “Lentiviral transduction

      PANC-1 cells were used for CDKN2A plasmid library and single variant CDKN2A expression plasmid transductions. PANC-1 cells previously transduced with pLJM1-CDKN2A (PANC-1CDKN2A) and selected with puromycin were used for CellTag library transductions. Briefly, 1 × 10^5 cells were cultured in media supplemented with 10 µg/ml polybrene and transduced with 4 × 10^7 transducing units per mL of lentivirus particles. Cells were then centrifuged at 1,200 x g for 1 hour. After 48 hours of culture at 37 °C and 5% CO2, transduced cells were selected using 3 µg/ml puromycin (CDKN2A plasmid libraries and single variant CDKN2A expression plasmids) or 5 µg/ml blasticidin (CellTag plasmid library) for 7 days. Expected MOI was one. After selection, cells were trypsinized and 5 × 10^5 cells were seeded into T150 flasks. DNA was collected from the remaining cells, and this sample was designated Day 9. T150 flasks were cultured until confluent and then DNA was collected. The time for cells to become confluent varied for each amino acid residue (Day 16 – 40, Appendix 1-table 5). DNA was extracted from PANC-1 cells using the PureLink Genomic DNA Mini Kit (Invitrogen, Carlsbad, CA; catalog no. K1820-01). The assay for the CellTag library was repeated in triplicate. We repeated our CDKN2A assay in duplicate for 28 residues. For the remaining 128 CDKN2A residues the assay was completed once.”

      We have also provided additional information on the TWIST library:

      “CDKN2A expression plasmid libraries

      Codon-optimized CDKN2A cDNA using the p16INK4A amino acid sequence (NP_000068.1) was designed (Appendix 1-table 12), and pLJM1 containing codon-optimized CDKN2A (pLJM1-CDKN2A) was generated by Twist Bioscience (South San Francisco, CA). 156 plasmid libraries were then synthesized using pLJM1-CDKN2A, such that each library contained all 20 possible amino acid variants (19 missense and 1 synonymous) at a given position, generating 500 ng of each plasmid library (Twist Bioscience, South San Francisco, CA). The proportion of each variant in each library is shown in Appendix 1-table 2. Variants with a representation of less than 1% in a plasmid library were individually generated using the Q5 Site-Directed Mutagenesis kit (New England Biolabs, Ipswich, MA; catalog no. E0552) and added to each library to a calculated proportion of 5%. Primers used for site-directed mutagenesis are given in Appendix 1-table 13. Each library was then amplified to generate at least 5 µg of plasmid DNA using the QIAGEN Plasmid Midi Kit (QIAGEN, Germantown, MD; catalog no. 12143).”

      (3) It is unclear when barcode abundance is assessed in the cell proliferation assay/in the screen. The exact timepoints of "before and after in vitro culture" (line 91) need to be clarified in the text.

      We are happy to clarify. We collected DNA on Day 9 post-transduction and at confluency. The day of confluency for each residue is detailed in Appendix 1-table 5. The text of the manuscript has been updated accordingly.

      (4) Is "before" day 9, as detailed in Figure 1 source data 1? If so, it is misleading to state that the experiment is in culture for 14 days but call day 9 "before... in vitro culture."

      The "before" sample should be obtained immediately after viral infection and selection with the library to provide a representation of library representation.

      We apologize for the confusion. We have clarified in the text and figures that our baseline measurement was at Day 9 post-transduction. We also determined whether the proportion of each variant was maintained in the Day 9 cell pool compared to the amplified plasmid library for three CDKN2A amino acid residues (p.R24, p.H66, and p.A127) and updated the manuscript text:

      “To confirm that the representation of each variant was maintained after transduction, we transduced three lentiviral libraries (amino acid residues p.R24, p.H66, and p.A127) individually into PANC-1 cells and determined the proportion of each variant in the amplified plasmid library and in the cell pool at Day 9 post-transduction. The proportion of each variant in the amplified plasmid library and in the cell pool at Day 9 were highly correlated (Figure 1 -figure supplement 2C and D, Appendix 1-table 3).”

      (5) There is no information regarding the function of each variant, aside from just a p-value resulting from the final analysis with the gamma model. Some variants may cause loss of function, others may be neutral while others may be gain of function. Simply providing a p-value is not sufficient. The standard in the field is to provide a function score/ test-statistic giving the sign and magnitude of the effect. For proliferation assays at least a ratio of fold-change of (mut/ synonymous)[day 14] vs (mut/synonymous)[baseline] should be provided.

      Thank you for your comment. We have provided read counts, P values, and functional classifications for each variant using the gamma GLM in Appendix 1-table 4. We have also analyzed variants using log2 normalized fold change. This data is presented in the text and compared to our classifications with the gamma GLM. We have provided normalized fold change and resulting classification for each variant in Appendix 1-table 6.
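
      The ratio the reviewer describes, (mut/synonymous) at the end of culture divided by (mut/synonymous) at baseline, can be sketched as below. This is only an illustration of that general formula and not the authors' exact normalization, which is not given in this response; the function name, the pseudocount, and the read counts in the usage line are all assumptions made for the example.

```python
import math


def log2_normalized_fold_change(
    mut_end: float,
    syn_end: float,
    mut_base: float,
    syn_base: float,
    pseudocount: float = 0.5,
) -> float:
    """log2 of (mut/syn at end of culture) over (mut/syn at baseline).

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a variant has no supporting reads at one time point.
    """
    ratio_end = (mut_end + pseudocount) / (syn_end + pseudocount)
    ratio_base = (mut_base + pseudocount) / (syn_base + pseudocount)
    return math.log2(ratio_end / ratio_base)


# Hypothetical read counts: the variant grows fourfold relative to the
# synonymous control between baseline (Day 9) and confluency.
score = log2_normalized_fold_change(400, 100, 100, 100, pseudocount=0.0)
print(score)
```

      Because the readout is suppression of proliferation, a positive score (enrichment relative to the synonymous control) would be expected for variants that have lost function, and a score near zero for variants that behave like the synonymous control.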

      (6) A plot of the distribution of function scores for all variants is needed. This will serve as an effective visual to distinguish the control variants from those that are functionally deleterious or benign/neutral (see Findlay et al. Nature (2018) Figure 3A for an example visual).

      Thank you for your suggestion. We have provided additional figures to visualize distribution of assay outputs using the gamma GLM in Figure 2 -figure supplement 1.

      (7) Synonymous variants are used as a proxy for WT per variant library, but do all the synonymous variants truly behave like WT CDKN2A in their ability to suppress cell proliferation? A plot of the distribution of synonymous variant function relative to WT CDKN2A function would be effective here.

      All 156 synonymous variants suppressed cell proliferation and were classified as functionally neutral in our assay using the gamma GLM. The manuscript has been updated to reflect this:

      “Of 156 synonymous variants and six missense variants previously reported to be functionally neutral in published literature and/or reported in ClinVar as benign or likely benign (benchmark benign variants), all were characterized as functionally neutral in our assay (Figure 2B, Figure 2 -figure supplement 1B and 1C, Appendix 1-table 4)”

      (8) The gamma generalized linear model is not commonly used to analyze the results of saturation mutagenesis screens. Please provide a justification for the use of this analysis method vs using log fold change as other dms scan studies have done (PMID: 27760319, PMID: 30224644).

      Thank you for this important suggestion. We are happy to provide additional information. We used a gamma GLM to functionally characterize CDKN2A variants because it does not rely on an annotated set of pathogenic and benign variants to determine classification thresholds. Instead, classification thresholds are determined using the change in representation of 20 non-functional barcodes in a pool of PANC-1 cells stably expressing CDKN2A after a period of in vitro growth. As a gamma GLM is not commonly used for saturation mutagenesis screens, as noted by the reviewer, we also classified variants using log2 normalized fold change. We compared variant functional classifications from the two methods and found general agreement: 98.5% of missense variants classified as functionally deleterious using the gamma GLM were classified the same way using log2 normalized fold change. We have updated the text to reflect this reasoning and additional analysis.

      (9) The statistical methods used to calculate enrichment of deleterious variants per region of CDKN2A (Figure 2 supplement 1B; lines 163-168) are not described anywhere in the paper. Additionally, the same statistical analysis is not applied to the variants in the subregions near the ankyrin repeats (lines 168-172).

      We are happy to clarify and have added text to the methods section:

      “Z-tests with multiple test correction performed with the Bonferroni method were used in the following comparisons: 1) proportion of functionally deleterious variants present in < 2% of the cell pool and ≥ 2% of the cell pool at Day 9 binned in 1% intervals, 2) proportion of variants in each domain predicted to have deleterious or pathogenic effect by the majority of algorithms, 3) proportion of functionally deleterious variants in each domain, and 4) proportion of functionally deleterious missense variants and somatic mutations.”
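
      A two-proportion z-test with Bonferroni correction, as named in the methods text above, can be sketched as follows. This is a generic textbook version under stated assumptions (pooled standard error, two-sided P value, hypothetical function names); the response does not specify which variant of the z-test the authors ran. The usage line applies it to the coverage comparison reported in this response (12 of 53 variants versus 496 of 2,911 variants).

```python
import math
from typing import List, Sequence, Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Two-sided z-test for a difference between two proportions,
    using the pooled proportion for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Variants at < 2% of the Day 9 pool versus those above 2%.
z, p = two_proportion_z_test(12, 53, 496, 2911)
print(round(z, 2), round(p, 2))
```

      Under these assumptions the unadjusted P value comes out close to the 0.28 reported for this comparison; whether the reported value was adjusted for multiple testing is not stated in the response.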

      Minor:

      (1) Please review the manuscript for spelling and grammatical errors.

      Sure.

    1. Author response:

      Reviewer #1:

      Weaknesses:

      However, the authors should conduct a more thorough computational analysis to complement their manuscript. While the identification of improved multi-point mutants is commendable, the manuscript lacks a detailed investigation into the mechanisms by which these mutations enhance protein properties. The authors briefly mention that some physicochemical characteristics of the mutants are unusual, but they do not delve into why these mutations result in improved performance. Could computational techniques, such as molecular dynamics simulations, be employed to explore the effects of these mutations?  Additionally, the authors claim that their method is efficient. However, the selected VHH is relatively short (<150 AA), resulting in lower computational costs. It remains unclear whether the computational cost of this approach would still be acceptable when designing larger proteins (>1000 AA). Besides, the design process involves a large number of prediction tasks, including the properties of both single-site saturation and multi-point mutants. The computational load is closely tied to the protein length and the number of mutation sites. Could the authors analyze the model's capability boundaries in this regard and discuss how scalable their approach is when dealing with larger proteins or more complex mutation tasks?

      We agree that further analysis of the mechanisms by which the identified mutations enhance protein performance would strengthen our study. In the revised manuscript, we plan to conduct molecular dynamics simulations to explore the physicochemical effects of these mutations in more detail. This analysis will help elucidate how the observed structural and dynamic changes contribute to the improved resistance and stability of the designed VHH antibody.

      We acknowledge the need to assess the scalability of our method to larger proteins. To address this, we will include an analysis of the method’s performance when applied to longer proteins, including an estimation of computational cost and potential bottlenecks.

      Reviewer #2:

      (1) The writing throughout the paper is poor. This leaves the reader confused.

      (2) The main technical issue the authors address is whether AI can identify protein mutations that adapt to extreme environments based solely on natural protein data. However, the introduction could be more concise and focused on the key points to better clarify the significance of this question.

      (3) The authors did not develop a new model but instead used their previously developed Pro-PRIME model. This significantly weakens the novelty and contribution of this work.

      (4) The computational experiments are not well-justified. For instance, the authors used a zero-shot setting for single-point mutation experiments but opted for fine-tuning in multiple-point mutation experiments. There is no clear explanation for this discrepancy. How does the model perform in zero-shot settings for multiple-point mutations? How would fine-tuning affect single-point mutation results? The choice of these strategies seems arbitrary and lacks sufficient discussion.

      (1&2) We will revise the manuscript to improve the overall clarity and readability. Specifically, we will restructure the introduction to focus more concisely on the key scientific questions and contributions of our study.

      (3) While the Pro-PRIME model was previously developed, this work focuses on designing proteins with properties that do not naturally exist and are scarce in the natural world. To address the concern about novelty, we will expand the discussion to highlight this unique contribution and its implications for advancing protein design.

      (4) We appreciate the comment regarding the discrepancy between the zero-shot and fine-tuning strategies. In the revised manuscript, we will provide a detailed explanation for the choice of these settings, including an analysis of the trade-offs between zero-shot and fine-tuning approaches in multi-point mutation tasks. We will also explore the model’s performance in zero-shot settings for multi-point mutations and report these results in the supplementary materials to ensure completeness.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Qin and colleagues analysed data from the Human Connectome Project on four right-handed subgroups with different gyrification patterns in Heschl's gyrus. Based on these groups, the authors highlight the structure-function relationship of planum temporale asymmetry in lateralised language processing at the group level and next at the individual level. In particular, the authors propose that especially microstructural asymmetries are related to functional auditory language asymmetries in the planum temporale.

      Strengths:

      The study is interesting because of an ongoing and long-standing debate about the relationship between structural and functional brain asymmetries, and in particular whether structural brain asymmetries can be seen as markers of functional language brain lateralisation.

      In this debate, studies of the relationship between Heschl's gyrus asymmetry and planum temporale asymmetry are rare, which makes this one valuable. A large sample size and inter-rater reliability support the findings.

      Weaknesses:

      In this case of multiple brain measures, it would be important to provide the reader with some sort of effect size (e.g. Cohen's d) to help interpret the results.

      Thank you for pointing this out. In the revised version, the effect size, i.e., Cohen's d, has been incorporated into the results (page 8, line 159-160; page 9, line 181-186, supplementary page 14, Table S14).
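      For readers less familiar with this effect size, the following is a minimal sketch of Cohen's d for two independent groups, using the pooled standard deviation. The response does not state which variant of Cohen's d was computed, so the pooled-variance form, the function name `cohens_d`, and the example values are all assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples (pooled standard deviation).

    Illustrative only: the pooled-variance form is an assumption, not
    necessarily the variant used in the manuscript.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical values, not data from the study.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

      By the usual convention, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects, respectively.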

      In addition, the authors highlight the microstructural results in spite of the macrostructural results. However, the macrostructural surface results are also strong. I would suggest either reducing the emphasis on micro vs macrostructural results or adding information to justify the microstructural importance.

      In the original manuscript, we highlighted the results of microstructural measures because the correlations between PT microstructural and functional measures were more pronounced, both within the hemispheres and in terms of asymmetry, than the significant results for surface area. Following your comments, we have now toned down the emphasis on the microstructural results (page 2, line 40; page 14, line 267) and added relevant discussion of the macrostructural results in the revised version (page 18, line 363-370; as copied below):

      “As for macrostructural measures, the asymmetric PT surface area was also associated with speech comprehension AI. Given that the within-hemispheric coupling tendency between surface and speech comprehension existed only in the left PT, it was possible that the larger surface area of the left PT led to a less recruitment of its right homologous, and therefore the lateralization of functional activity would be more pronounced. Additionally, an opposite tendency was found between the correlation of speech perception and comprehension with surface area, potentially implying the segregation of the different speech processing in the PT area.”

      Recommendations for the authors:

      I have only some comments that I wish to be addressed by the authors:

      (1) Please always specify "structural" or "functional" asymmetry or lateralisation, as the reader may be confused.

      This has been done in relevant places.

      (2) Please state that the scale is not the same between the results in Figure 3.

      This has been specified, as suggested (see below).

      “Notably, we did not standardize these structural measures, so the scales differed between indicators.”

      (3) It may be of interest to the reader to learn more about interpretations of how Heschl's gyrus and planum temporale asymmetries are related.

      Thank you for this comment. Given that the asymmetry of Heschl's gyrus was not analyzed in the present study, we do not have direct data or results to support such an interpretation. We also reviewed the literature but found no relevant results on how Heschl's gyrus and planum temporale asymmetries are related. A specific investigation targeting this topic is needed to address the question. This has now been added to the discussion (page 20, line 415-417).

      (4) As this manuscript builds somewhat on the Science Advances article by Ocklenburg et al. (2018), it would be important to discuss how this more liberal planum temporale definition might (or might not) affect the results compared to the more conservative planum temporale definition described here.

      Yes, the definition of the planum temporale varies across studies. Our current manual definition is more conservative than that of Ocklenburg et al. (2018), in which the planum temporale was automatically derived from the Destrieux atlas. We believe that the definition of the planum temporale likely has a non-trivial impact on the results, and that our current manual definition, which takes HG duplication into account, should be more reliable and accurate, and is therefore preferable to the alternatives. This has been briefly discussed in the revision (page 15-16, line 300-304).

      (5) I would like the authors to briefly but critically discuss what exactly the MRI NODDI model measures and how this is interpreted as measuring microstructural properties of tissue.

      We have now provided relevant information regarding the NODDI measures (page 26, line 552-558; as copied below).

      “NODDI is a highly effective method for detecting key features of neurite morphology, which employs a tissue model that detects three microstructural environments: the intracellular, extracellular and cerebrospinal fluid compartments (Zhang et al., 2012). In the grey matter of the cerebral cortex, the neurite density index (NDI) is an estimated volume fraction of the intracellular microstructural environment, with higher NDIs indicating greater neurite density (Jespersen et al., 2010; Zhang et al., 2012). The orientation dispersion index (ODI) is a measure of the alignment or dispersion of neurite, with higher ODIs indicating more dispersed neurite and lower ODIs indicating more aligned neurite (Jespersen et al., 2012; Zhang et al., 2012).”

      (6) While not mandatory, I would be interested to read the authors' thoughts on the evolution of such a functional/(micro)structural lateralisation link of the planum temporale, in light of the literature on planum temporale asymmetries in (newborn) non-human primate species.

      Thank you for this inspiring suggestion. We have incorporated relevant discussion into the revised version (page 15, line 281-288; as copied below).

      “Moreover, there exist evolutionary evidence supporting the role of the PT as an anatomical substrate for language lateralization. For example, the leftward structural asymmetry of the PT have been observed in multiple non-human primates, including chimpanzees, macaques, and baboons (Becker et al., 2024; Gannon et al., 1998; Xia et al., 2019). Particularly, recent studies on baboons further demonstrated that PT structural leftward asymmetry in newborn baboons could predict future development of communicative gestures, implying a key role of PT structural asymmetry in the lateralized communication system for human and non-human brain evolution (Becker et al., 2024, 2021).”

      Reference

      Becker Y, Phelipon R, Marie D, Bouziane S, Marchetti R, Sein J, Velly L, Renaud L, Cermolacce A, Anton J-L, Nazarian B, Coulon O, Meguerditchian A. 2024. Planum temporale asymmetry in newborn monkeys predicts the future development of gestural communication’s handedness. Nat Commun 15:4791. doi:10.1038/s41467-024-47277-6

      Becker Y, Sein J, Velly L, Giacomino L, Renaud L, Lacoste R, Anton J-L, Nazarian B, Berne C, Meguerditchian A. 2021. Early Left-Planum Temporale Asymmetry in newborn monkeys (Papio anubis): A longitudinal structural MRI study at two stages of development. NeuroImage 227:117575. doi:10.1016/j.neuroimage.2020.117575

      Gannon PJ, Holloway RL, Broadfield DC, Braun AR. 1998. Asymmetry of Chimpanzee Planum Temporale: Humanlike Pattern of Wernicke’s Brain Language Area Homolog. Science 279:220–222. doi:10.1126/science.279.5348.220

      Jespersen SN, Bjarkam CR, Nyengaard JR, Chakravarty MM, Hansen B, Vosegaard T, Østergaard L, Yablonskiy D, Nielsen NChr, Vestergaard-Poulsen P. 2010. Neurite density from magnetic resonance diffusion measurements at ultrahigh field: Comparison with light microscopy and electron microscopy. NeuroImage 49:205–216. doi:10.1016/j.neuroimage.2009.08.053

      Jespersen SN, Leigland LA, Cornea A, Kroenke CD. 2012. Determination of Axonal and Dendritic Orientation Distributions Within the Developing Cerebral Cortex by Diffusion Tensor Imaging. IEEE Trans Med Imaging 31:16–32. doi:10.1109/TMI.2011.2162099

      Xia J, Wang F, Wu Z, Wang L, Zhang C, Shen D, Li G. 2019. Mapping hemispheric asymmetries of the macaque cerebral cortex during early brain development. Hum Brain Mapp. doi:10.1002/hbm.24789

      Zhang H, Schneider T, Wheeler-Kingshott CA, Alexander DC. 2012. NODDI: Practical in vivo neurite orientation dispersion and density imaging of the human brain. NeuroImage 61:1000–1016. doi:10.1016/j.neuroimage.2012.03.072

      Reviewer #2 (Public Review):

      Summary:

      The authors assessed the link between structural and functional lateralization in area PT, one of the brain areas that is known to exhibit strong structural lateralization, and which is known to be implicated in speech processing. Importantly, they included the sulcal configuration of Heschl's gyrus (HG), presenting either as a single or duplicated HG, in their analysis. They found several significant associations between microstructural indices and task-based functional lateralization, some of which depended on the sulcal configuration.

      Strengths:

      A clear strength is the large sample size (n=907), an openly available database, and the fact that HG morphology was manually classified in each individual. This allows for robust statistical testing of the effects across morphological categories, which is not often seen in the literature.

      Weaknesses:

      - Unfortunately, no left-handers were included in the study. It would have been a valuable addition to the literature, to study the effect of handedness on the observed associations, as many previous studies on this topic were not adequately powered. The fact that only right-handers were studied should be pointed out clearly in the introduction or even the abstract.

      Thank you for pointing this out. We have explicitly specified this in the Abstract and Introduction.

      - The tasks to quantify functional lateralization were not specifically designed to pick up lateralization. In the interest of the sample size, it is understandable that the authors used the available HCP-task-battery results, however, it would have been feasible to access another dataset for validation. A targeted subset of results, concerning for example the relationship between sulcal morphology and task-based functional lateralization, could be re-assessed using other open-access fMRI datasets.

      Yes, the fMRI task was not specifically designed to evaluate PT functional lateralization, which has been acknowledged in the discussion (page 17, line 330-342). Given the small effect size of the structural-functional relationship we observed, reproducing similar results in other datasets would require a cohort with a large sample size. This would entail quite labor-intensive work, given that our current protocol requires manually outlining the PT and HG in every participant. The lack of validation with an independent dataset has been discussed as a limitation in the revised version. We will try to conduct such a validation in future work, likely after developing an automatic pipeline that accurately extracts the PT and HG in individual space (matching the manual outlining protocol).

      - The study is mainly descriptive and the general discussion of the findings in the larger context of brain lateralization comes a bit short. For example, are the observed effects in line with what we know from other 'language-relevant' areas? What could be the putative mechanisms that give rise to functional lateralization based on the microstructural markers observed? And which mechanisms might be underlying the formation of a duplicated HG?

      Thank you for these insightful comments. As suggested, we strengthened the discussion as below:

      “Another possible explanation could be that higher myelin content and larger surface area in left PT potentially indicated more white matter connection with other language-related regions such as Broca’s area, and therefore is more involved in language tasks than its right homolog (Allendorfer et al., 2016; Catani et al., 2005; Giampiccolo and Duffau, 2022).

      The distinct roles of left and right PT in speech processing have been well-documented. A number of studies substantiated that PT of the left hemisphere responded more strongly to lexical-semantic and syntactic aspects of sentence processing, whereas the right hemisphere demonstrated a greater involvement in the speech melody (Albouy et al., 2020; Meyer et al., 2002).

      These findings are consistent with those reported for the arcuate fasciculus (AF). The left AF has been identified as a crucial structure for language function (Giampiccolo and Duffau, 2022; Zhang et al., 2021). Disruption to this pathway has been linked to multimodal phonological and semantic deficits (Agosta et al., 2010), while injuries in the right AF did not affect language function (Zeineh et al., 2015).”

      Regarding the mechanism underlying the formation of a duplicated HG, our careful literature review did not yield a well-supported explanation. We also feel that this question is beyond the scope of the present study, and we therefore did not add further discussion of this topic.

      Recommendations for the authors:

      (1) The data availability statement makes no explicit mention of the manual labels of HG configuration. Would the authors consider making available a list of HCP-subject-ID with a morphological group (L1/R1, L1/R2, etc.) for replicability and for re-use by other researchers?

      The list of HCP-subject-ID with a morphological group (L1/R1, L1/R2, etc.) is now available in the supplementary material 2. We have specified this in the revised version.

      (2) It would be helpful to state again the statistical tests associated with the p-value in the figure/table caption, e.g. Table 2.

      As suggested, we now specified the statistical method in the figure/table caption.

      (3) Sometimes, the y-axis labels are missing or not clear, for example in Figure S2.

      We apologize for these omissions. We double-checked all the figures and corrected the missing or unclear labels in Figures S2 and S3 in the revised version.

      (4) In a few instances the font sizes vary within a figure caption.

      This has been corrected in the revision.

      Reference

      Agosta F, Henry RG, Migliaccio R, Neuhaus J, Miller BL, Dronkers NF, Brambati SM, Filippi M, Ogar JM, Wilson SM, Gorno-Tempini ML. 2010. Language networks in semantic dementia. Brain J Neurol 133:286–299. doi:10.1093/brain/awp233

      Albouy P, Benjamin L, Morillon B, Zatorre RJ. 2020. Distinct sensitivity to spectrotemporal modulation supports brain asymmetry for speech and melody. Science 367:1043–1047. doi:10.1126/science.aaz3468

      Allendorfer JB, Hernando KA, Hossain S, Nenert R, Holland SK, Szaflarski JP. 2016. Arcuate fasciculus asymmetry has a hand in language function but not handedness. Hum Brain Mapp 37:3297–3309. doi:10.1002/hbm.23241

      Catani M, Jones DK, Ffytche DH. 2005. Perisylvian language networks of the human brain. Ann Neurol 57:8–16. doi:10.1002/ana.20319

      Giampiccolo D, Duffau H. 2022. Controversy over the temporal cortical terminations of the left arcuate fasciculus: a reappraisal. Brain J Neurol 145:1242–1256. doi:10.1093/brain/awac057

      Meyer M, Alter K, Friederici AD, Lohmann G, von Cramon DY. 2002. FMRI reveals brain regions mediating slow prosodic modulations in spoken sentences. Hum Brain Mapp 17:73–88. doi:10.1002/hbm.10042

      Zeineh MM, Kang J, Atlas SW, Raman MM, Reiss AL, Norris JL, Valencia I, Montoya JG. 2015. Right arcuate fasciculus abnormality in chronic fatigue syndrome. Radiology 274:517–526. doi:10.1148/radiol.14141079

      Zhang H, Schneider T, Wheeler-Kingshott CA, Alexander DC. 2012. NODDI: Practical in vivo neurite orientation dispersion and density imaging of the human brain. NeuroImage 61:1000–1016. doi:10.1016/j.neuroimage.2012.03.072

      Zhang J, Zhong S, Zhou L, Yu Yamei, Tan X, Wu M, Sun P, Zhang W, Li J, Cheng R, Wu Y, Yu Yanmei, Ye X, Luo B. 2021. Correlations between Dual-Pathway White Matter Alterations and Language Impairment in Patients with Aphasia: A Systematic Review and Meta-analysis. Neuropsychol Rev 31:402–418. doi:10.1007/s11065-021-09482-8

      Reviewing Editor:

      I encourage the authors to incorporate the suggestions of the reviewers, such as:

      (1) to provide more in-depth interpretations about how and why structural and functional lateralization relate,

      Done.

      (2) to provide statistical effect sizes,

      Done.

      (3) to make their sulcal-morphology classification openly available,

      Done.

      (4) to provide statistical effect sizes,

      Done.

      (5) to discuss the possible impact of diverging PT definitions with regard to previous studies,

      Done.

      (6) to provide more in-depth interpretations about how and why structural and functional lateralization relate.

      Done.

      Detailed comments:

      In an impressive cohort of 907 human participants, the present paper presents a very interesting set of data on PT asymmetries not only at the macro-structural but also at the microstructural levels in order to investigate their potential correlates with PT functional asymmetry in relation to perceptual acoustic language tasks.

      I believe this is a key paper for the following reasons:

      (1) it provides critical data and results for addressing a controversial but important question: the relevance of measures of anatomical asymmetry for inferring its language-related functional hemispheric specialization;

      (2) to do so, the authors made a very impressive effort to manually trace the anatomical delineation of the planum temporale at different levels in every participant, the best (but crazy time-consuming) approach so far to document interindividual variability of the PT and to address such a question;

      (3) the contribution is particularly relevant regarding the statistical power of the study, the study and measures having been done in 907 participants!

      (4) I also found the study well designed and well written with great relevance of the findings for the field.

      As for the results, the authors reported measures of microstructural asymmetry (including intracortical myelin content, neurite density, and neurite orientation) as well as of macrostructural asymmetry in relation to functional lateralization for language.

      Comments:

      I have only 2 additional minor comments of my own:

      (1) In agreement with reviewer 2, I don't understand why the authors seem to downplay the links they found between gross PT asymmetry and functional lateralization. I recommend the authors to highlight and discuss this important result, just as the microstructural PT asymmetries and their functional links.

      This has been done (page 18, line 363-370).

      (2) PT structural asymmetry (both micro & macro) has been well documented in nonhuman primates (and their functional link with manual lateralization for gestural communication). Without detailing this literature, I recommend the authors at least mention this literature as a comparative perspective in the introduction and/or discussion in order to make the question of PT asymmetry less anthropocentric.

      This has been done (page 15, line 281-288).

    1. Author response:

      We thank the reviewers for their feedback. We are currently revising the manuscript to address their questions and concerns. Here we briefly summarize our planned revisions.

      Reviewer 1 requested clarification on three points. We will clarify all these points with text edits. One point is brief enough to be addressed here: in cases when we pooled data from the left and right hemispheres, the reviewer wants to know how this was done. Simply put, we defined the “ipsi” side of the body as the side where the recorded DN resided, and we defined “contra” as the other side.
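      The pooling convention described above can be sketched as a simple relabeling step. This is only an illustration under stated assumptions: the function name `to_ipsi_contra`, the string labels, and the example numbers are hypothetical and are not taken from our analysis code.

```python
def to_ipsi_contra(recorded_side, left_value, right_value):
    """Relabel left/right body-side measurements relative to the recorded DN.

    "ipsi" is the body side where the recorded DN resides; "contra" is the
    other side. All names and labels here are hypothetical.
    """
    if recorded_side == "left":
        return {"ipsi": left_value, "contra": right_value}
    if recorded_side == "right":
        return {"ipsi": right_value, "contra": left_value}
    raise ValueError(f"recorded_side must be 'left' or 'right', got {recorded_side!r}")


# Hypothetical measurements from two recordings, one per hemisphere.
pooled = [
    to_ipsi_contra("left", left_value=1.0, right_value=0.2),
    to_ipsi_contra("right", left_value=0.3, right_value=0.9),
]
```

      After relabeling, recordings from either hemisphere share the same ipsi/contra frame and can be averaged together.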

      Reviewer 2 requested clarification on two minor points. We will clarify these points with text edits and with an additional analysis.

      Reviewer 3 had a number of substantive concerns. Briefly:

      (1) The reviewer asks us to improve our discussion of some relevant literature. We will provide updated information on the DN steering network, and in particular, we will cite Bidaye et al. 2020 and Sapkal et al. 2024. We apologize for the oversight.

      (2) The reviewer asks us for immunofluorescent images documenting the expression patterns of our effector transgenes. With regard to GtACR1::eYFP expression, we will include these images in our resubmission. With regard to ReaChR expression, we expressed this reagent stochastically under hs-FLP control, and so different brains had different expression patterns; however, we carefully documented the number of DNa02 cells that expressed ReaChR in each brain. With regard to GFP expression, these expression patterns are available online from the FlyLight documentation associated with Namiki et al. eLife 2018 (https://splitgal4.janelia.org/precomputed/Descending%20Neurons%202018.html). The UAS-GFP transgene used by Namiki et al. 2018 (pJFRC200-10XUAS-IVS-myr::smGFP-HA in attP18) is different from the UAS-GFP transgene we used (10XUAS-IVS-mCD8::GFP in su(Hw)attP8), and so there may be minor differences in expression pattern. However, it should be noted that we only used GFP expression to target somata for patch clamp recording, and DNa01 and DNa02 somata have a distinctive location and a distinctive size; when we performed these recordings, we only targeted a soma in this location, and we verified that there were no “distractor” somata in this vicinity with similar size and appearance. The same applies to patch clamp recordings targeted via Halo7 expression (SiR110-HaloTag fluorescence). In paired recordings from both DNa02 and DNa01, we verified the identity of each cell as described in Fig. S1.

      (3) The reviewer asks why we focused on DNa02 in the latter part of the manuscript, rather than DNa01. We made this decision because DNa02 is more highly predictive of steering behavior, as compared to DNa01 (Fig. 1H). Also, an impulse of DNa02 activity is followed by a relatively large turning maneuver, on average, whereas an impulse of DNa01 activity is followed by a relatively small turning maneuver (Fig. 1E-F). Moreover, DNa02 has many more synaptic inputs in the brain (Fig. 7A), and it has many more direct synaptic connections onto motor neurons (Fig. 1B).

      (4) The reviewer highlights difficulties in interpreting DN activity during backward movement (Figs. S3/S4). We included this material in the spirit of completeness, but we agree with the reviewer that it is difficult to interpret. In our revision, we will omit Fig. S3C and Fig. S4A-B, and we will revise these legends to improve clarity.

      (5) The reviewer asks why we did not do a systematic analysis of paired DNa01 recordings, as we did for DNa02. It is difficult to get paired right/left recordings from two DNs of the same type in the same fly while the fly is walking vigorously, and we were only able to get two such paired recordings from DNa01. We did not feel this was a sufficiently large sample size to support a systematic analysis. We chose not to invest more time in getting more paired DNa01 recordings because we thought that DNa02 was more important, for the reasons noted above.

      (6) The reviewer asks for an analysis of trials where bump-jump led to turning in the opposite direction to the DNa02 being recorded. We will provide this analysis in the revision.

      (7) The reviewer points out that “latent” steering drives might not be latent, as they might produce small postural changes we are not capturing. This is a fair point, and we will note this in our revision.

      (8) The reviewer asks for a systematic analysis of DNa01 inputs in Figure 7, similar to our analysis of DNa02 inputs. Here we would prefer to focus on DNa02, for three reasons. First, we think DNa02 is likely more important, for the reasons noted above. Second, there has been some uncertainty as to the identity of DNa01 in connectome data; indeed, in the hemibrain data set, the cell recently identified as DNa01 was annotated as VES006 (Schlegel et al. Nature 634: 139-152). Third, the cell now identified as DNa01 does not receive direct input from either the central complex or the mushroom body, and for this reason, we felt that the inputs to DNa01 might be less interesting to a general audience.

      (9) The reviewer wonders whether DNa01 is more involved in sideways movement, rather than rotational movement. Our data do not support this conclusion: rather, our data show that DNa01 is only weakly correlated with sideways movement. Thus, the forward filter (Fig. 1F) shows that an impulse of DNa01 activity is (on average) followed by a relatively small amount of sideways movement. Conversely, the reverse filter (in Fig. S2I) shows that an impulse of sideways movement is (on average) preceded by a relatively large amount of DNa01 activity.

      (10) The reviewer points out that the phenotype associated with optogenetic suppression in Fig. 8G is weak. We will highlight this point and discuss potential reasons for this weak phenotype in the revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      This study presents a valuable finding on sperm flagellum and HTCA stabilization. The evidence supporting the authors' claims is incomplete. The work will be of broad interest to cell and reproductive biologists working on cilium and sperm biology.

      We thank the Editor and the two reviewers for their time and thorough evaluation of our manuscript. We greatly appreciate their valuable guidance on improving our study. In the revised manuscript, we have conducted additional experiments and provided quantitative data in response to the reviewers' comments. Furthermore, we have refined the manuscript and added further context to elucidate the significance of our findings for the readers.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, Wu et al. investigated the physiological roles of CCDC113 in sperm flagellum and HTCA stabilization by using CRISPR/Cas knockout mouse models, co-IP, and single sperm imaging. They find that CCDC113 localizes in the linker region among radial spokes (RS), the nexin-dynein regulatory complex (N-DRC), and doublet microtubules (DMTs), and interacts with axoneme-associated proteins CFAP57 and CFAP91, acting as an adaptor protein that facilitates the linkage between RS, N-DRC, and DMTs within the sperm axoneme. They show that the disruption of CCDC113 produced spermatozoa with disorganized sperm flagella and that CFAP91 and DRC2 could not colocalize with DMTs in Ccdc113-/- spermatozoa. Interestingly, the data also indicate that CCDC113 could localize on the HTCA region and interact with HTCA-associated proteins. The knockout of Ccdc113 could also produce acephalic spermatozoa. By using Sun5 and Centlein knockout mouse models, the authors further find that SUN5 and CENTLEIN are indispensable for the docking of CCDC113 to the implantation site on the sperm head. Overall, the experiments were designed properly and performed well to support the authors' observation in each part. Furthermore, the study's findings offer valuable insights into the physiological and developmental roles of CCDC113 in the male germ line, which can provide insight into impaired sperm development and male infertility. The conclusions of this paper are mostly well supported by data, but some points need to be clarified and discussed.

      We thank Reviewer #1 for his or her critical reading and the positive assessment.

      (1) In Figure 1, a sperm flagellum protein, which is far away from CCDC113, should be selected as a negative control to exclude artificial effects in co-IP experiments.

      We greatly appreciate Reviewer #1’s insightful suggestion. In response, we selected two sperm outer dense fiber proteins, ODF1 and ODF2, which are located distant from the sperm axoneme, as negative controls in the co-IP experiments. As shown in Figure 1- figure supplement 1A and B, neither ODF1 nor ODF2 bound to CCDC113, indicating the interaction observed in Figure 1 is not an artifact.

      (2) Is the detachment of the sperm head and tail in Ccdc113-/- mice a secondary effect of the sperm flagellum defects? The authors should discuss this point.

      Good question. Considering that CCDC113 is localized in the sperm neck region and interacts with SUN5 and CENTLEIN, it may play a direct role in connecting the sperm head and tail. Indeed, PAS staining revealed that Ccdc113–/– sperm heads exhibit abnormal orientation in stages V–VIII of the seminiferous epithelia (Figure 6C-D). Furthermore, transmission electron microscopy (TEM) analysis indicated that the absence of CCDC113 caused detachment of the damaged coupling apparatus from the sperm head in step 9–11 spermatids (Figure 6E). These results suggest that the detachment of the sperm head and tail in Ccdc113–/– mice may not be a secondary effect of sperm flagellum defects. We have discussed this point further below:

      “CCDC113 can interact with SUN5 and CENTLEIN, but not PMFBP1 (Figure 7A-C), and left on the tip of the decapitated tail in Sun5–/– and Centlein–/– spermatozoa (Figure 7K and L). Furthermore, CCDC113 colocalizes with SUN5 in the HTCA region, and immunofluorescence staining in spermatozoa shows that SUN5 is positioned closer to the sperm nucleus than CCDC113 (Figure 7G and H). Therefore, SUN5 and CENTLEIN may be closer to the sperm nucleus than CCDC113. PAS staining revealed that Ccdc113–/– sperm heads are abnormally oriented in stages V–VIII seminiferous epithelia (Figure6 C and D), and TEM analysis further demonstrated that the disruption of CCDC113 causes the detachment of the destroyed coupling apparatus from the sperm head in step 9–11 spermatids (Figure 6E). All these results suggest that the detachment of sperm head and tail in Ccdc113–/– mice may not be a secondary effect of sperm flagellum defects.”

      (3) Given that some cytoplasmic material could be observed in Ccdc113-/- spermatozoa (Fig. 5A), is CCDC113 also essential for cytoplasmic removal?

      Good question. Unremoved cytoplasm could be detected in spermatozoa by using transmission electron microscopy (TEM) analysis, including disrupted mitochondria, damaged axonemes, and large vacuoles. These observations indicate defects in cytoplasmic removal in Ccdc113–/– mice. We have discussed this point as below:

      “Moreover, TEM analysis detected excess residual cytoplasm in spermatozoa, including disrupted mitochondria, damaged axonemes, and large vacuoles, indicating defects in cytoplasmic removal in Ccdc113–/– mice (Figure 5A).”

      (4) Although CCDC113 could not bind to PMFBP1, the localization of CCDC113 in Pmfbp1-/- spermatozoa should also be examined to clarify the relationship between CCDC113 and SUN5-CENTLEIN-PMFBP1.

      We appreciate Reviewer #1’s suggestion. We have analyzed the localization of CCDC113 in Pmfbp1-/- spermatozoa and found that CCDC113 was located at the tip of the decapitated tail in Pmfbp1-/- spermatozoa (Figure 7K and L). This finding has been incorporated into the revised manuscript as below:

      “To further elucidate the functional relationships among CCDC113, SUN5, CENTLEIN, and PMFBP1 at the sperm HTCA, we examined the localization of CCDC113 in Sun5-/-, Centlein–/–, and Pmfbp1–/– spermatozoa. Compared to the control group, CCDC113 was predominantly localized on the decapitated flagellum in Sun5-/-, Centlein–/–, and Pmfbp1–/– spermatozoa (Figure 7K and L), indicating SUN5, CENTLEIN, and PMFBP1 are crucial for the proper docking of CCDC113 to the implantation site on the sperm head. Taken together, these data demonstrate that CCDC113 cooperates with SUN5 and CENTLEIN to stabilize the sperm HTCA and anchor the sperm head to the tail.”

      Reviewer #2 (Public Review):

      Summary:

      In the present study, the authors selected the coiled-coil protein CCDC113 and revealed its expression in the stages of spermatogenesis in the testis as well as in the different steps of spermiogenesis, with expression also mapped in the different parts of the epididymis. Gene deletion led to male infertility in CRISPR-Cas9 KO mice, and PAS staining showed defects mapped in the different stages of the seminiferous cycle and through the different steps of spermiogenesis. EM and IF with several markers of testis germ cells and spermatozoa in the epididymis indicated defects in flagella and head-to-tail coupling for flagella as well as acephaly. The authors' co-IP experiments of expressed CCDC113 in HEK293T cells indicated an association with CFAP91 and DRC2 as well as SUN5 and CENTLEIN.

      The authors propose that CCDC113 connects CFAP91 and DRC2 to doublet microtubules of the axoneme and that CCDC113's association with SUN5 and CENTLEIN stabilizes the sperm flagellum head-to-tail coupling apparatus. Extensive experiments mapping CCDC113 during postnatal development are reported as well as negative co-IP experiments and studies with SUN5 KO mice as well as CENTLEIN KO mice.

      Strengths:

      The authors provide compelling observations to indicate the relevance of CCDC113 to flagellum formation with potential protein partners. The data are relevant to sperm flagella formation and its coupling to the sperm head.

      We are grateful to Reviewer #2 for his or her recognition of the strength of this study.

      Weaknesses:

      The authors' observations are consistent with the model proposed but the authors' conclusions for the mechanism may require direct demonstration in sperm flagella. The Walton et al paper shows human CCDC96/113 in cilia of human respiratory epithelia. An application of such methodology to the proteins indicated by Wu et al for the sperm axoneme and head-tail coupling apparatus is eagerly awaited as a follow-up study.

      We thank Reviewer 2 for his/her kind help in improving the manuscript. We now understand that direct detection of the precise localization of CCDC113 in the sperm axoneme and head-tail coupling apparatus (HTCA) using cryo-electron microscopy (cryo-EM) could powerfully strengthen our model. Recent advances in cryo-EM have indeed advanced the analysis of axonemal structures and determined the structures of native axonemal DMTs from mouse, bovine, and human sperm (Leung et al., 2023; Zhou et al., 2023). However, high-resolution structures of sperm axoneme and HTCA regions, including those involving CCDC113, have yet to be fully characterized. Thus, we would like to discuss this point and consider it a valuable direction for future research.

      “Given that cryo-EM of the sperm axoneme and HTCA could powerfully strengthen the evidence for the role of CCDC113 in stabilizing the sperm axoneme and head-tail coupling apparatus, it is a valuable direction for future research.”

      References:

      Bazan, R., Schröfel, A., Joachimiak, E., Poprzeczko, M., Pigino, G., & Wloga, D. (2021). Ccdc113/Ccdc96 complex, a novel regulator of ciliary beating that connects radial spoke 3 to dynein g and the nexin link. PLoS Genet, 17(3), e1009388.

      Ghanaeian, A., Majhi, S., McCafferty, C. L., Nami, B., Black, C. S., Yang, S. K., Legal, T., Papoulas, O., Janowska, M., Valente-Paterno, M., Marcotte, E. M., Wloga, D., & Bui, K. H. (2023). Integrated modeling of the Nexin-dynein regulatory complex reveals its regulatory mechanism. Nat Commun, 14(1), 5741.

      Leung, M. R., Zeng, J., Wang, X., Roelofs, M. C., Huang, W., Zenezini Chiozzi, R., Hevler, J. F., Heck, A. J. R., Dutcher, S. K., Brown, A., Zhang, R., & Zeev-Ben-Mordehai, T. (2023). Structural specializations of the sperm tail. Cell, 186(13), 2880-2896.e2817.

      Walton, T., Gui, M., Velkova, S., Fassad, M. R., Hirst, R. A., Haarman, E., O'Callaghan, C., Bottier, M., Burgoyne, T., Mitchison, H. M., & Brown, A. (2023). Axonemal structures reveal mechanoregulatory and disease mechanisms. Nature, 618(7965), 625-633.

      Zhou, L., Liu, H., Liu, S., Yang, X., Dong, Y., Pan, Y., Xiao, Z., Zheng, B., Sun, Y., Huang, P., Zhang, X., Hu, J., Sun, R., Feng, S., Zhu, Y., Liu, M., Gui, M., & Wu, J. (2023). Structures of sperm flagellar doublet microtubules expand the genetic spectrum of male infertility. Cell, 186(13), 2897-2910.e2819.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Please provide full gel for the Figure 2C experiment (could be as a supplementary file).

      Thanks for your insightful suggestions. We have replaced Figure 2C and provided the full gel in Figure 2-figure supplement 1A.

      (2) The authors write on Line 163 "In contrast, the flagellum staining appeared reduced in Ccdc113-/- seminiferous tubules (Fig. 2J, red asterisk)." However, the magnification of the pictures is not sufficient to distinguish anything in the panel mentioned, please provide others.

      Many thanks for pointing this out. We have provided a representative image to show the flagellar defect in seminiferous tubules.

      (3) Please add statistical p-values for figures.

      Thanks for your valuable advice. We have added statistical p-values to the figures in the revised manuscript.

      (4) Line 128: Should "speculate" be "speculated"?

      Thank you for pointing out this problem. We have corrected it in the revised manuscript, as shown below:

      “Given that CFAP91 has been reported to stabilize RS on the DMTs (Bicka et al., 2022; Dymek et al., 2011; Gui et al., 2021) and cryo-EM analysis shows that CCDC113 is close to DMTs, we speculated that CCDC113 may connect RS to DMTs by binding to CFAP91 and microtubules.”

      (5) In lines 384-385, an extra "-" is typed.

      Thank you for pointing out this problem. We have corrected it in the revised manuscript, as shown below:

      “Furthermore, CCDC113 colocalizes with SUN5 in the HTCA region, and immunofluorescence staining in spermatozoa shows that SUN5 is closer to the sperm nucleus than CCDC113 (Figure 7G and H). Therefore, SUN5 and CENTLEIN may be closer to the sperm nucleus than CCDC113.”

      (6) In general, the article has many typos and should be professionally proofread.

      Many thanks for pointing this out. We have thoroughly revised the manuscript with the assistance of professional proofreading.

      Reviewer #2 (Recommendations For The Authors):

      Can the authors indicate in the Materials and Methods if n=3 biological replicates were done for all co-IP, EM, LM, and IF studies? The statistical analysis section indicates this but quantification is missing for most figures including co-IP, most IF, PAS staining, EM, etc.

      We thank Reviewer 2 for the insightful comments and guidance to improve our data quality. All the experiments in this study were repeated at least three times to ensure reproducibility. We have quantified the co-IP experiments in Figures 1C-H and 7A-F, the IF data in Figures 2K, 5C, and 5D, as well as the PAS staining in Figure 6C. Since electron microscopy samples require very little testicular tissue and the sections obtained are very thin, the likelihood of capturing sections specifically at the sperm head-tail junction is considerably low. This challenge makes it difficult to perform quantitative analysis and statistical evaluation in the TEM experiment. To address this limitation, we have quantified the percentage of Ccdc113-/- sperm heads with abnormal orientation in stages V–VIII of the seminiferous epithelium to indicate impaired head-to-tail anchorage.

      Figure S2 is compelling and might be indicated as a major figure instead of a supplementary figure.

      We appreciate the positive comment. We have included it as a major figure in Figure 3F.

      Figure 4A may be incomplete. Data sets for RNA expression suggest high expression in the ovary and other organs in males and females including the brain and are not indicated by the authors. Figure 4A may be considered for removal with a more complete study for another paper.

      Thank you for pointing out this issue. We reviewed RNA expression data from various tissues using RNA-Seq data from Mouse ENCODE (https://www.ncbi.nlm.nih.gov/gene/244608) and found that CCDC113 is highly expressed in the testis, but not significantly in the ovary and brain (Figure 4- figure supplement 1A). Additionally, we re-evaluated CCDC113 protein levels in the spleen, lung, kidney, testis, intestine, stomach, brain, and ovary, confirming that it is highly expressed in the testes, with negligible expression in the ovary and brain (Figure 4- figure supplement 1B). In line with Reviewer 2's suggestion, we have removed Figure 4A in the revised manuscript.

      There are grammatical errors throughout the manuscript and Figure 7 is truncated.

      Thank you for pointing out this problem. We have thoroughly revised the manuscript with the assistance of professional proofreading.

      The Introduction and Discussion parts of the paper may need some clarification for the general reader. The material in the "Additional Context " section of the critique below may be a helpful place to introduce what a stage is, and the steps in germ cell development in the testis with the latter of course where and when the flagellum develops.

      We appreciate your valuable suggestions. We have referred to the material in the “Additional Context” section to introduce the stages of spermatogenesis and the steps in germ cell development in the testis in the introduction and results.

      “Male fertility relies on the continuous production of spermatozoa through a complex developmental process known as spermatogenesis. Spermatogenesis involves three primary stages: spermatogonia mitosis, spermatocyte meiosis, and spermiogenesis. During spermiogenesis, spermatids undergo complex differentiation processes to develop into spermatozoa, which includes nuclear elongation, chromatin remodeling, acrosome formation, cytoplasm elimination, and flagellum development (Hermo et al., 2010).”

      Hermo, L., Pelletier, R. M., Cyr, D. G., & Smith, C. E. (2010). Surfing the wave, cycle, life history, and genes/proteins expressed by testicular germ cells. Part 1: background to spermatogenesis, spermatogonia, and spermatocytes. Microscopy research and technique, 73(4), 241–278. https://doi.org/10.1002/jemt.20783

      “Pioneering work in the mid-1950s used the PAS stain in histologic sections of mouse testis to visualize glycoproteins of the acrosome and Golgi in seminiferous tubules (Oakberg, 1956). The pioneers discovered in cross-sectioned seminiferous tubules the association of differentiating germ cells with successive layers to define different stages that in mice are twelve, indicated as Roman numerals (XII). For each stage, different associations of maturing germ cells were always the same with early cells in differentiation at the periphery and more mature cells near the lumen. In this way, progressive differentiation from stem cells to mitotic, meiotic, acrosome-forming, and post-acrosome maturing spermatocytes was mapped to define spermatogenesis with the XII stages in mice representing the seminiferous cycle. The maturation process from acrosome-forming cells to mature spermatocytes is defined as spermiogenesis with 16 different steps that are morphologically distinct spermatids (O'Donnell L, 2015).”

      Oakberg, E. F. (1956). A description of spermiogenesis in the mouse and its use in analysis of the cycle of the seminiferous epithelium and germ cell renewal. The American journal of anatomy, 99(3), 391-413. https://doi.org/10.1002/aja.1000990303

      O'Donnell L. (2015). Mechanisms of spermiogenesis and spermiation and how they are disturbed. Spermatogenesis, 4(2), e979623. https://doi.org/10.4161/21565562.2014.979623

      For the Discussion, the authors indicate that the function of CCDC113 in mammals is unknown yet the authors point to the work of Walton et al on human respiratory epithelia that points to a function for CCDC96/113. The work in the manuscript here does indicate a role in sperm flagella and the head-to-tail coupling apparatus but remains descriptive until the methodology of Walton et al is applied. Hopefully, the authors will consider it for a follow-up study.

      Thank you for pointing out this problem. We have revised this part and highlighted the work of Walton et al. in the Discussion.

      “CCDC113 is a highly evolutionarily conserved component of motile cilia/flagella. Studies in the model organism, Tetrahymena thermophila, have revealed that CCDC113 connects RS3 to dynein g and the N-DRC, which plays an essential role in cilia motility (Bazan et al., 2021; Ghanaeian et al., 2023). Recent studies have also identified the localization of CCDC113 within the 96-nm repeat structure of the human respiratory epithelial axoneme, where it localizes to the linker region among RS, N-DRC and DMTs (Walton et al., 2023). In this study, we reveal that CCDC113 is indispensable for male fertility, as Ccdc113 knockout mice produce spermatozoa with flagellar defects and head-tail linkage detachment (Figure 3D).”

      “Overall, we identified CCDC113 as a structural component of both the flagellar axoneme and the HTCA, where it performs dual roles in stabilizing the sperm axonemal structure and maintaining the structural integrity of HTCA. Given that cryo-EM of the sperm axoneme and HTCA could powerfully strengthen the evidence for the role of CCDC113 in stabilizing the sperm axoneme and head-tail coupling apparatus, it is a valuable direction for future research.”

      The Discussion may be focused on the key aspects of CCDC113 related to sperm flagella and the head-to-tail coupling apparatus that represent a genuine advance. The more speculative parts of the Discussion that have not been addressed by experimentation in the Results section may be considered for removal in the Discussion section.

      Thank you for pointing this out. We have removed the speculative parts of the Discussion that have not been addressed by experimentation in the Results section.

      Additional Context to help readers understand the significance of the work:

      Pioneering work in the mid-1950s used the periodic acid Schiff (PAS) stain in histologic sections of rodent testis to visualize glycoproteins of the acrosome and Golgi in seminiferous tubules. The pioneers discovered in cross-sectioned seminiferous tubules the association of differentiating germ cells with successive layers to define different stages that in mice are twelve, indicated as Roman numerals (XII). For each stage, different associations of maturing germ cells were always the same with early cells in differentiation at the periphery and more mature cells near the lumen. In this way, progressive differentiation from stem cells to mitotic, meiotic, acrosome-forming, and post-acrosome maturing spermatocytes was mapped to define spermatogenesis with the XII stages in mice representing the seminiferous cycle. The maturation process from acrosome-forming cells to mature spermatocytes is defined as spermiogenesis with 19 different steps that are morphologically distinct spermatids. It is from steps 8-19 of spermiogenesis that the formation of the flagellum takes place. Final maturation occurs in the epididymis as sperm move through the caput, corpus, and cauda of the organ with motile spermatozoa generated.

      Thank you very much!

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors aimed to investigate the oscillatory activity of GnRH neurones in freely behaving mice. By utilising GCaMP fiber photometry, they sought to record real-time neuronal activity to understand the patterns and dynamics of GnRH neuron firing and their implications for reproductive physiology.

      Strengths:

      (1) The use of GCaMP fiber photometry allows for high temporal resolution recordings of neuronal activity, providing real-time data on the dynamics of GnRH neurones.

      (2) Recording in freely behaving animals ensures that the findings are physiologically relevant and not artifacts of a controlled laboratory environment.

      (3) The authors used statistical methods to characterise the oscillatory patterns, ensuring the reliability of their findings.

      Weaknesses:

      (1) While the study identifies distinct oscillatory patterns in GnRH neurones' calcium dynamics, it falls short in exploring the functional implications of these patterns for GnRH pulsatility and overall reproductive physiology.

      The functional roles of pulsatile and surge patterns of GnRH release are extremely well established. We have found perfect correlations between GnRH neuron dendron GCaMP activity and LH pulses as well as the LH surge clearly indicating the function of these activity patterns. We do not know the functional role of the clustered high-frequency basal activity that we have discovered and, as noted in the Discussion, are unsure of its physiological importance. Although it may be minor, it will require future investigation.

      (2) The study lacks a broader discussion to include comparisons with existing studies on GnRH neurone activity and pulsatility and highlight how the findings of this study align with or differ from previous research and what novel contributions are made.

      The Reviewer fails to recognise that these are the first recordings of GnRH neurons in vivo. There are no prior studies for comparison. We have noted the only other in vivo study (undertaken by ourselves) many years ago in anaesthetized mice. It was never expected that electrophysiological recordings of GnRH neurons in acute brain slices (by ourselves and others) would reflect their activity in vivo. Now that we know this to be the case, it would be churlish to point this out explicitly. We have made some modifications to the Discussion by comparing the present data more thoroughly with other in vivo GnRH secretion and kisspeptin neuron activity studies.

      (3) The authors aimed to characterise the oscillatory activity of GnRH neurons and successfully identified distinct oscillatory patterns. The results support the conclusion that GnRH neurons exhibit complex oscillatory behaviours, which are critical for understanding their role in reproductive physiology. However, it has not been made clear what exactly the authors mean by "multi-dimensional oscillatory patterns" and how has this been shown.

      The study shows three types of GnRH neuron activity; two of which would be classified as oscillatory in nature and these show different temporal dimensions.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors report GCaMP fiber-photometry recordings from the GnRH neuron distal projections in the ventral arcuate nucleus. The recordings are taken from intact, male and female, freely behaving mice. They report three patterns of neuronal activity:

      (1) Abrupt increases in the Ca2+ signals that are perfectly correlated with LH pulses.

      (2) A gradual, yet fluctuating (with a slow ultradian frequency), increase in activity, which is associated with the onset of the LH surge in female animals.

      (3) Clustered (high frequency) baseline activity in both female and male animals.

      Strengths:

      The GCaMP fiber-photometry recordings reported here are the first direct recordings from GnRH neurones in vivo. These recordings have uncovered a rich repertoire of activity suggesting the integration of distinct "surge" and "pulse" generation signals, and an ultradian rhythm during the onset of the surge.

      Weaknesses:

      The data analysis method used for the characterisation of the ultradian rhythm observed during the onset of the surge is not detailed enough. Hence, I'm left wondering whether this rhythm is in any way correlated with the clusters of activity observed during the rest of the cycle and which have similar duration.

      We have provided further information on the characterisation of the ultradian rhythm observed at the time of the surge. Whether this is related to the clustered basal activity is an interesting point but very difficult to resolve. We note that the “basal” and “surge” ultradian oscillations have very different durations of ~30 and ~80 min, suggesting that they may be independent phenomena. However, the only way to really exclude a similar genesis will be to establish the origin of each type of oscillatory activity. Preliminary data in the lab show that the RP3V kisspeptin neurons exhibit an identical pattern of ultradian oscillation at the time of the surge, leading us to suspect that the surge oscillation is driven by this input. As noted in the Discussion, it is presently difficult to determine where the high basal activity originates.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Evidence of Multi-Dimensional Oscillatory Patterns: The manuscript presents data showing the oscillatory activity of GnRH neurones with distinct frequency and amplitude characteristics. The analysis includes statistical tests that illustrate the variability in neuronal firing patterns. However, the multi-dimensional nature of this activity has not been demonstrated. It is not clear what is meant by "dimension" with regard to the calcium recordings (oscillatory activity). If the authors refer to the frequency content of the calcium signal then a proper Fourier or Wavelet analysis should be carried out to characterise the multiple frequencies present in the calcium dynamics in male mice and during various stages of the cycle in female mice

      The study shows three types of GnRH neuron activity, two of which would be classified as oscillatory in nature. One occurs for ~10 min every hour or so and the other occurs for ~12 hours once every 4-5 days. This does not require any analysis to distinguish between the two or claim that they are different, i.e. multidimensional.

      (2) Data Interpretation: Expand the discussion on the physiological relevance of the identified oscillatory patterns. Specifically, explore how these patterns might influence GnRH pulsatility, hormone secretion dynamics, and reproductive cycles.

      The functional roles of pulsatile and surge patterns of GnRH release are extremely well established. We have found perfect correlations between GnRH neuron dendron GCaMP activity and LH pulses as well as the LH surge clearly indicating the function of these activity patterns. We do not know the functional role of the clustered high-frequency basal activity that we have discovered and, as noted in the Discussion, are unsure of its physiological importance. Although it may be minor, it will require future investigation.

      (3) Literature Contextualisation: Broaden the discussion to include comparisons with existing studies on GnRH neuron activity and pulsatility. Highlight how the findings of this study align with or differ from previous research and what novel contributions are made.

      The Reviewer fails to recognise that these are the first recordings of GnRH neurons in vivo. There are no prior studies for comparison. We have noted the only other in vivo study (undertaken by ourselves) many years ago in anaesthetized mice. It would be naive to expect that electrophysiological recordings of GnRH neurons in acute brain slices (by ourselves and others) would reflect their activity in vivo. Now that we know this to be the case, it would be churlish to point this out explicitly. We have made some modifications to the Discussion by comparing the present data more thoroughly with other in vivo GnRH secretion and kisspeptin neuron activity studies.

      (4) Future Directions: Suggest potential follow-up experiments to explore the regulatory mechanisms underlying the observed oscillatory patterns. This could include investigating the role of neurotransmitters, hormonal feedback mechanisms, and other factors that might influence GnRH neuron activity.

      By addressing these recommendations, the authors can further strengthen their manuscript and enhance its impact on the field.

      Reviewer #2 (Recommendations For The Authors):

      Suggestions:

      (1) The authors might want to analyse their inter-peak interval data by fitting them to a simple parametric statistical model (the gamma distribution would be a good choice to capture the skewness of these data). This way they would be able to describe the observed variability and, if the fits are not good, back up their claim that "The dSEs occurred on average ... and showed no clear modal distribution pattern (Fig. 2D)".

      Thank you for the suggestion. We have carried out a Shapiro-Wilk test on the male inter-peak interval distribution and found a W value of 0.87 and a P value <0.0001, providing strong evidence that the data are not normally distributed. Skewness and kurtosis values are 1.39 and 1.81, respectively, indicating that the distribution is right-skewed and platykurtic, i.e. less peaked and more spread out than the normal distribution (which has a kurtosis of 3). This has now been added to the manuscript.
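
      As an illustration of the descriptive statistics discussed above, the following minimal sketch computes moment-based skewness and kurtosis for a set of inter-peak intervals and, following the reviewer's suggestion, a method-of-moments gamma fit. The function names and the example intervals are hypothetical and are not taken from the study; the kurtosis is the non-excess form, for which a normal distribution gives 3. It does not reproduce the Shapiro-Wilk test or the software the authors used.

```python
from statistics import fmean


def moment_summary(intervals):
    """Return mean, variance, skewness and non-excess kurtosis of a sample."""
    n = len(intervals)
    mean = fmean(intervals)
    # Central moments in population form (dividing by n).
    m2 = sum((x - mean) ** 2 for x in intervals) / n
    m3 = sum((x - mean) ** 3 for x in intervals) / n
    m4 = sum((x - mean) ** 4 for x in intervals) / n
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,  # > 0 means a longer right tail
        "kurtosis": m4 / m2 ** 2,    # a normal distribution gives 3
    }


def gamma_moment_fit(intervals):
    """Method-of-moments gamma fit: shape = mean^2 / var, scale = var / mean."""
    s = moment_summary(intervals)
    shape = s["mean"] ** 2 / s["variance"]
    scale = s["variance"] / s["mean"]
    return shape, scale


# Hypothetical inter-peak intervals in minutes (illustrative only).
example_intervals = [22, 25, 27, 30, 31, 34, 38, 45, 60, 95]
example_summary = moment_summary(example_intervals)
example_shape, example_scale = gamma_moment_fit(example_intervals)
```

      A poor agreement between the fitted gamma distribution and the observed histogram would be one way to support the statement that the intervals show no clear modal pattern.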

      (2) If I understand correctly, in Figure 3D, inter-peak intervals from all 4 stages of the estrus cycle are pooled together. It would also be interesting if the authors gave the interval histograms for the different stages of the cycle separately.

      We have now plotted the inter-peak interval distribution histograms for each individual cycle next to the example traces in Figure 3. The descriptions of the distribution pattern are also updated in the figure legends.

      (3) In Figure 3C, one can see the mean interval for different animals (as open circles), is that right? Is the statistical test run on these animals mean, or is the entire dSEs dataset used? In any case, it's not clear to the reader how variable intervals are in individual recordings from each animal. Could the authors add this information (could be easily added in the figure caption)?

      The reviewer is correct that each open circle is the mean interval for one animal. The statistical test was run on the per-animal means. This information has now been added to the figure legend.
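
      The distinction raised here, between testing on per-animal means and reporting the variability within each recording, can be sketched as follows. This is a minimal illustration under stated assumptions: the animal identifiers and interval values are hypothetical, and the helper name is not from the manuscript.

```python
from collections import defaultdict
from statistics import fmean, stdev


def per_animal_summary(records):
    """Group (animal_id, interval) pairs and summarise each animal separately."""
    by_animal = defaultdict(list)
    for animal_id, interval in records:
        by_animal[animal_id].append(interval)
    summary = {}
    for animal_id, values in by_animal.items():
        summary[animal_id] = {
            "n": len(values),
            "mean": fmean(values),
            # Sample SD needs at least two intervals; report 0.0 otherwise.
            "sd": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Hypothetical inter-peak intervals in minutes, tagged by animal.
records = [("m1", 30), ("m1", 40), ("m1", 50), ("m2", 20), ("m2", 60), ("m3", 45)]
summary = per_animal_summary(records)

# Group-level statistics are computed on one value per animal (the means),
# while the per-animal SD describes variability within each recording.
animal_means = [s["mean"] for s in summary.values()]
grand_mean = fmean(animal_means)
```

      Reporting the within-animal SD alongside each mean would give the reader the information the reviewer asks for without changing the unit of analysis.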

      (4) The authors should explain how they identify the regions (clusters) of high-frequency baseline activity, which they present in Figure 4.

      The relevant information has now been added to the methods section under the heading ‘GCaMP6 fiber photometry and blood sampling’.

      (5) The authors should detail how they identify and characterise the ultradian rhythm they observe at the onset of the surge.

      The relevant information has now been added to the methods section under the heading ‘GCaMP6 fiber photometry and blood sampling’.

      (6) The authors could perform some kind of wavelet-type analysis to quantify and analyse how the frequency content of the observed Ca2+ signal changes over the cycle. From their current analysis, I am not sure whether the ultradian oscillations they observe during the surge are related to the low-activity cluster events they observe during the other stages of the cycle.

      This is an interesting point but very difficult to resolve. We note that the “basal” and “surge” ultradian oscillations have very different durations of ~30 and ~80 min, suggesting that they may be independent phenomena. However, the only way to really exclude a similar genesis will be to establish the origin of each type of oscillatory activity. Preliminary data in the lab show that the RP3V kisspeptin neurons exhibit an identical pattern of ultradian oscillation at the time of the surge, leading us to suspect that the surge oscillation is driven by this input. As noted in the Discussion, it is presently difficult to determine where the high basal activity originates.
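
      For readers unfamiliar with the kind of frequency analysis the reviewer proposes, the sketch below estimates which of a few candidate periods carries the most power in a signal, using a plain discrete Fourier sum. It is an assumption-laden illustration only: the two signals are synthetic sine waves with periods of 30 and 80 min (chosen to echo the durations mentioned above), the sampling interval of one minute is arbitrary, and this is not an analysis the authors performed.

```python
import cmath
import math


def power_at_period(signal, period, dt=1.0):
    """Fourier power of a mean-subtracted signal at one oscillation period."""
    n = len(signal)
    mean = sum(signal) / n
    acc = 0j
    for i, x in enumerate(signal):
        acc += (x - mean) * cmath.exp(-2j * math.pi * (i * dt) / period)
    return abs(acc) ** 2 / n


def dominant_period(signal, candidate_periods, dt=1.0):
    """Return the candidate period with the largest Fourier power."""
    return max(candidate_periods, key=lambda p: power_at_period(signal, p, dt))


# Synthetic 240-min traces sampled once per minute (illustrative only).
minutes = range(240)
basal_like = [math.sin(2 * math.pi * t / 30) for t in minutes]
surge_like = [math.sin(2 * math.pi * t / 80) for t in minutes]

# Candidate periods in minutes; each fits a whole number of cycles into 240 min.
candidates = [20, 30, 40, 60, 80, 120]
basal_period = dominant_period(basal_like, candidates)
surge_period = dominant_period(surge_like, candidates)
```

      A wavelet transform would additionally show how such periods change over time, but, as the response notes, a difference in period alone cannot establish whether the two oscillations share an origin.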

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer’s comments

      We are most grateful for the opportunity to address the reviewer comments. Point-by-point responses are presented below.

      Overall, the paper has several strengths, including leveraging large-scale, multi-modal datasets, using computationally reasonable tools, and having an in-depth discussion of the significant results.

      We thank the reviewer for the very supportive comments.

      Based on the comments and questions, we have grouped the concerns and corresponding responses into three categories.

      (1) The scope and data selection

      The results are somewhat inconclusive or not validated.

      The overall results are carefully designed, but most of the results are descriptive. While the authors are able to find additional evidence either from the literature or explain the results with their existing knowledge, none of the results have been biologically validated. Especially, the last three result sections (signaling pathways, eQTLs, and TF binding) further extended their findings, but the authors did not put the major results into any of the figures in the main text.

      The goal of this manuscript is to provide a list of putative childhood obesity target genes to yield new insights and help drive further experimentation. Moreover, the outputs from signaling pathways, eQTLs, and TF binding, although noteworthy and supportive of our method, were not particularly novel. In our manuscript we placed our focus on the novel findings from the analyses. We did, however, report the part of the eQTLs analysis concerning ADCY3, which brought new insight to the pathology of obesity, in Figure 4C.

      The manuscript would benefit from an explanation regarding the rationale behind the selection of the 57 human cell types analyzed. It is essential to clarify whether these cell types have unique functions or relevance to childhood development and obesity.

      We elected to comprehensively investigate the GWAS-informed cellular underpinnings of childhood development and obesity. By including a diverse range of cell types from different tissues and organs, we sought to capture the multifaceted nature of cellular contributions to obesity-related mechanisms, and open new avenues for targeted therapeutic interventions.

      There are clearly cell types that are already established as being key to the pathogenesis of obesity when dysregulated: adipocytes for energy storage, immune cell types regulating inflammation and metabolic homeostasis, hepatocytes regulating lipid metabolism, pancreatic cell types intricately involved in glucose and lipid metabolism, skeletal muscle for glucose uptake and metabolism, and brain cell types in the regulation of appetite, energy expenditure, and metabolic homeostasis.

      While it is practical to focus on cell types already proven to be associated with or relevant to obesity, this approach has its limitations. It confines our understanding to established knowledge and rules out the potential for discovering novel insights from new cellular mechanisms or pathways that could play significant roles in the pathogenesis of obesity. Therefore, it was essential to reflect known biology against the unexplored cell types to expand our overall understanding and potentially identify innovative targets for treatment or prevention.

      I wonder whether the used epigenome datasets are all from children. Although the authors use literature to support that body weight and obesity remain stable from infancy to adulthood, it remains uncertain whether epigenomic data from other life stages might overlook significant genetic variants that uniquely contribute to childhood obesity.

      The datasets utilized in our study were derived from a combination of sources, both pediatric and adult. We recognize that epigenetic profiles can vary across different life stages but our principal effort was to characterize susceptibility BEFORE disease onset.

      Given that the GTEx tissue samples are derived from adult donors, there appears to be a mismatch with the study's focus on childhood obesity. If possible, identifying alternative validation strategies or datasets more closely related to the pediatric population could strengthen the study's findings.

      We thank the reviewer for raising this important point. We acknowledge that the GTEx tissue samples are derived from adult donors, which might not perfectly align with the study's focus on childhood obesity. The ideal strategy would be a longitudinal design that follows individuals from childhood into adulthood to bridge the gap between pediatric and adult data, offering systematic insights into how early-life epigenetic markers influence obesity later in life. In future work, we aim to carry out such efforts, which will require a substantial time and financial commitment.

      Along the same lines, the Developmental Genotype-Tissue Expression (dGTEx) Project is a new effort to study development-specific genetic effects on gene expression at 4 developmental windows spanning from infant to post-puberty (0-18 years). Donor recruitment began in August 2023 and remains ongoing. Tissue characterization and data production are underway. We hope that with the establishment of this resource, our future research in the field of pediatric health will be further enhanced.

      Figure 1B: in subplots c and d, the results are either from Hi-C or capture-C. Although the authors use different colors to denote them, I cannot help wondering how much difference between Hi-C and capture-C brings in. Did the authors explore the difference between the Hi-C and capture-C?

      Thank you for your comment. It is not within the scope of our paper to explore the differences between the Hi-C and Capture-C methods. In the context of our study, both methods serve the same purpose of detecting chromatin loops that bring putative enhancers to sometimes genomically distant gene promoters. Consequently, our focus was on utilizing these methods to identify relevant chromatin interactions rather than comparing their technical differences.

      (2) Details on defining different categories of the regions of interest

      Some technical details are missing.

      While the authors described all of their analysis steps, a lot of the time, they did not mention the motivation. Sometimes, the details were also omitted.

      We have added a section to the revision to address the rationale behind the different OCR categories.

      Line 129: should "-1,500/+500bp" be "-500/+500bp"?

      A gene promoter was defined as a region 1,500 bases upstream to 500 bases downstream of the TSS. Most transcription factor binding sites are distributed upstream (5’) of the TSS, and the assembly of transcription machinery occurs up to 1,000 bases 5’ of the TSS. Given our interest in SNPs that can potentially disrupt transcription factor binding, this defined promoter length allowed us to capture such SNPs in our analyses.
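
      As an illustration of the -1,500/+500 bp definition, the following minimal Python sketch computes a strand-aware promoter window around a TSS and tests whether a SNP position falls inside it. The function names, the closed-interval coordinate convention, and the example coordinates are assumptions made for illustration only; they are not taken from the authors' pipeline.

```python
def promoter_window(tss, strand, upstream=1500, downstream=500):
    """Return (start, end), inclusive, of the promoter around a TSS.

    'Upstream' means 5' of the TSS, so on the minus strand it lies at
    higher genomic coordinates. Names and conventions are illustrative.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), end  # never return a negative coordinate


def snp_in_promoter(snp_pos, tss, strand):
    """True if the SNP position lies within the promoter window."""
    start, end = promoter_window(tss, strand)
    return start <= snp_pos <= end


if __name__ == "__main__":
    # Hypothetical TSS at position 10,000 on each strand
    print(promoter_window(10_000, "+"))
    print(promoter_window(10_000, "-"))
    print(snp_in_promoter(8_600, 10_000, "+"))
```

      The asymmetry of the window is what matters here: a SNP 1,400 bases 5’ of the TSS is captured, whereas the same offset on the 3’ side is not.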

      How did the authors define a contact region?

      Chromatin contact regions identified by Hi-C or Capture-C assays are always reported as pairs of chromatin regions. The Supplementary eMethods provide details on the method of processing and interaction calling from the Hi-C and Capture-C data.

      The manuscript would benefit from a detailed explanation of the methods used to define cREs, particularly the process of intersecting OCRs with chromatin conformation data. The current description does not fully clarify how the cREs are defined.

      In the result section titled "Consistency and diversity of childhood obesity proxy variants mapped to cREs", the authors introduced the different types of cREs in the context of open chromatin regions and chromatin contact regions, and TSS. Figure 2A is helpful in some way, but more explanation is definitely needed. For example, it seems that the authors introduced three chromatin contacts on purpose, but I did not quite get the overall motivation.

      We apologize for the confusion. Our definition of cREs is consistent throughout the study. Figure 2A will become Figure 1A in the revision in order to aid the reader.

      The 3 representative chromatin loops illustrate different ways the chromatin contact regions (pairs of blue regions under blue arcs) can overlap with OCRs (yellow regions under yellow triangles – ATAC peaks) and gene promoters.

      (1) The first chromatin loop has one contact region that overlaps with OCRs at one end and with the gene promoter at the other. This satisfies the formation of cREs; thus, the area under the yellow ATAC-peak triangle is green.

      (2) The second loop overlaps with an OCR at only one end, and there is no gene promoter nearby, so it does not qualify as a cRE.

      (3) The third chromatin loop has OCR and promoter overlapping at one end. We defined this as a special cRE formation; thus, the area under the yellow ATAC-peak triangle is green.

      To avoid further confusion for the reader, we have eliminated this variation in the new illustration for the revised manuscript.
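
      The three cases above can be sketched as a small interval-overlap routine. This is only one reading of the description: intervals are assumed to be half-open (start, end) pairs on a single chromosome, case (3) is interpreted as an OCR that itself overlaps a promoter at the same loop anchor, and all names and coordinates are hypothetical rather than taken from the authors' code.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def cres_for_loop(anchor_a, anchor_b, ocrs, promoters):
    """Return the OCRs that qualify as cREs for one chromatin loop.

    An OCR at one anchor qualifies if a promoter overlaps the other anchor
    (case 1) or if a promoter overlaps the OCR itself (case 3, as read here).
    An OCR with no promoter at either end (case 2) does not qualify.
    """
    cres = []
    for near, far in ((anchor_a, anchor_b), (anchor_b, anchor_a)):
        for ocr in ocrs:
            if not overlaps(ocr, near):
                continue
            distal = any(overlaps(p, far) for p in promoters)
            proximal = any(overlaps(p, ocr) for p in promoters)
            if (distal or proximal) and ocr not in cres:
                cres.append(ocr)
    return cres


if __name__ == "__main__":
    loop = ((100, 200), (1000, 1100))
    # Case 1: OCR at one anchor, promoter at the other
    print(cres_for_loop(*loop, ocrs=[(150, 180)], promoters=[(1050, 1250)]))
    # Case 2: OCR at one anchor, no promoter nearby
    print(cres_for_loop(*loop, ocrs=[(150, 180)], promoters=[(5000, 7000)]))
    # Case 3: OCR and promoter overlap at the same anchor
    print(cres_for_loop(*loop, ocrs=[(1020, 1060)], promoters=[(1050, 1250)]))
```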

      Figure 2A: The authors used triangles filled differently to denote different types of cREs but I wonder what the height of the triangles implies. Please specify.

      The triangles are illustrations for ATAC-seq peaks, and the yellow chromatin regions under them are OCRs. The different heights of ATAC-seq peaks are usually quantified as intensity values for OCRs. However, in our study, when an ATAC-seq peak passed the significance threshold from the data pipeline, we only considered their locations, regardless of their intensities. To avoid further confusion for the reader, we have eliminated this variation in the new illustration for the revised manuscript.

      Figure 1B-c. the title should be "OCRs at putative cREs". Similarly in Figure 1B-d.

      cREs are a subset of OCRs.

      In the section "Cell type specific partitioned heritability", the authors used "4 defined sets of input genomic regions". Do these correspond to the four types of regions in Figure 2A?

      Figure 2A is now Figure 1A in the revision and has been modified to showcase how we define OCRs and cREs.

      It seems that the authors described the 771 proxies in "Genetic loci included in variant-to-genes mapping" (ln 154), and then somehow narrowed down from 771 to 94 (according to ln 199) because they are cREs. It would be great if the authors could describe the selection procedure together, rather than isolated, which made it quite difficult to understand.

      In the Methods section entitled “Genetic loci included in variant-to-genes mapping," we described the process of LD expansion to include 771 proxies from 19 sentinel signals significantly associated with obesity. Not all of these proxies are located within our defined cREs. Figure 2B, now Figure 2A in the revision, illustrates the proportions of these proxies located within different types of regions, reducing the proxy list to the 94 located within our defined cREs.

      Figure 2. What's the difference between the 771 and 758 proxies?

      13 out of 771 proxies did not fall within any defined regions. The remaining 758 were located within contact regions of at least one cell type regardless of chromatin state.

      (3) Typos

      In the paragraph "Childhood obesity GWAS summary statistics", the authors may want to describe the case/control numbers in two stages differently. "in stage 1" and "921 cases" together made me think "1,921" is one number.

      This has been amended in the revision.

      Hi-C technology should be spelled as Hi-C. In many places, it is misspelled as "hi-C". In Figure 1, the authors used "hiC" in the legend. Similarly, Capture-C was sometimes spelled as "capture-C" in the manuscript.

      At the end of the fifth row in the second paragraph of the Introduction section: "exisit" should be "exist".

      In Figure 2A: "Within open chromatin contract region" should be "Within open chromatin contact region".

      These typos and terminology inconsistencies have been amended in the revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Zhang et al. report a genetic screen to identify novel transcriptional regulators that could coordinate mitochondrial biogenesis. They performed an RNAi-based modifier screen wherein they systematically knocked down all known transcription factors in the developing Drosophila eye, which was already sensitised and had decreased mitochondrial DNA content. Through this screen, they identify CG1603 as a potential regulator of mitochondrial content. They show that protein levels of mitochondrial proteins like TFAM, SDHA, and other mitochondrial proteins and mtDNA content are downregulated in CG1603 mutants. RNA-Seq and ChIP-Seq further show that CG1603 binds to the promoter regions of several known nuclear-encoded mitochondrial genes and regulates their expression. Finally, they also identified YL-1 as an upstream regulator of CG1603. Overall, it is a very important study as our understanding of the regulation of mitochondrial biogenesis remains limited across metazoans. Most studies have focused on PGC-1α as a master regulator of mitochondrial biogenesis, which seems to be a context-dependent regulator. Also, PGC-1α mediated regulation could not explain the regulation of 1100 genes that are required for mitochondrial biogenesis. Therefore, identifying a new regulator is crucial for understanding the overall regulation of mitochondrial biogenesis.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors aim to identify the nuclear genome-encoded transcription factors that regulate mtDNA maintenance and mitochondrial biogenesis. They started with an RNAi screening in developing Drosophila eyes with reduced mtDNA content and identified a number of putative candidate genes. Subsequently, using ChIP-seq data, they built a potential regulatory network that could govern mitochondrial biogenesis. Next, they focused on a candidate gene, CG1603, for further characterization. Based on the expression of different markers, such as TFAM and SDHA, in the RNAi and OE clones in the midgut cells, they argue that CG1603 promotes mitochondrial biogenesis and the expression of ETC complex genes. Then, they used a mutant of CG1603 and showed that both mtDNA levels and mitochondrial protein levels were reduced. Using clonal analyses, they further show a reduction in mitochondrial biogenesis and membrane potential upon loss of CG1603. They made a reporter line of CG1603, showed that the protein is localized to the mitochondria, and binds to polytene chromosomes in the salivary gland. Based on the RNA-seq results from the mutants and the ChIP data, the authors argue that the nucleus-encoded mitochondrial genes that are downregulated >2 folds in the CG1603 mutants and that are bound by CG1603 are related to ETC biogenesis. Finally, they show that YL-1, another candidate in the network, is an upstream regulator of CG1603.

      Strengths:

      This is a valuable study, which identifies a potential regulator and a network of nucleus-encoded transcription factors that regulate mitochondrial biogenesis. Through in-vivo and in-vitro experimental evidence, the authors identify the role of CG1603 in this process. The screening strategy was smart, and the follow-up experiments were nicely executed.

      Weaknesses:

      Some additional experiments showing the effects of CG1603 loss on ETC integrity and functionality would strengthen the work.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Fig 3F: SDHA levels are severely downregulated in CG1603 RNAi clones. Therefore, estimating mitochondrial volume based on the SDHA reporter might be misleading. I suggest the authors perform this experiment with an independent marker of mitochondria, like mitoTracker Green or other dyes. I also suggest checking for mitochondrial number/quantity/size by electron microscopy.

      Although downregulated, the SDHA-mNeon signal in EC clones clearly outlined mitochondria and the overall mitochondrial network, allowing us to quantify the total mitochondrial volume. Examining mitochondrial number/quantity/size by electron microscopy would further strengthen this statement, and we will consider it in future studies.

      (2) The authors might comment on whether there was any decrease in the volume of CG1603i clone cells. And whether this was taken into account while normalising the mitochondrial volume.

      The size/volume of CG1603i clone cells was indeed decreased, and this was taken into account when normalizing the mitochondrial volume. We clarified this point in the Methods section (page 18, line 511-512 (revised version page 18, line 515-517)).

      (3) Line 230-234: Collectively, these results demonstrate that CG1603 promotes the expression of both nuclear and mtDNA-encoded ETC genes and boosts mitochondrial biogenesis. CG1603 RNAi produced very few EC clones, consistent with the notion that mitochondrial respiration is necessary for ISCs differentiation.

      (4) Quantifying the number of EC clone cells observed might help support this statement.

      This is a great point. We quantified the number of EC clone cells, and the data was included in the revised Figure 3—figure supplement.

      (5) Figure 5: The intensity of MTGreen in CG1603 clones seems comparable to that in control cells, at least visually. Since the authors claim a reduction in mitochondrial volume in CG1603 mutants, it is crucial to estimate mitochondrial volume based on MTGreen intensity in mutant and control cells.

      There are two types of clones shown in Figure 5: germ cell clones, which include all 16 germ cells in the same egg chamber, and follicle cell clones. We highlight these two types of clones in the revised Figure 5 to emphasize this point. The total MTGreen intensity in both germ cell and follicle cell CG1603PBac clones was reduced compared to germ cells in adjacent egg chambers and adjacent follicle cells in the same egg chamber, respectively. We included the quantification of MTGreen intensity in the revised Figure 5—figure supplement C. Examining mitochondrial number/quantity/size by electron microscopy would further strengthen this statement, and we will consider it in future studies.

      (6) Figure 8: It would be interesting to know what happens to steady-state mtDNA levels during YL-1 knockdown. If decreased, could overexpressing CG1603 in YL-1 knockdown cells rescue the phenotype?

      YL-1 knockdown reduced steady-state mtDNA levels in eyes, and overexpressing CG1603 restored mtDNA levels in YL-1 knockdown cells. These results are included in the revised Figure 8-figure supplement C.

      Minor comments:

      (7) The paper is lucidly written, but there are minor typos in several places. The authors might proofread it to remove these errors.

      We corrected typos and other minor errors in the manuscript.

      (8) Quantification for Figure 8 - Supplementary needs to be included.

      We performed the quantification, and the result is shown in Figure 8—figure supplement B.

      Reviewer #2 (Recommendations For The Authors):

      (1) In lines 275-276 and Figure 6E, the authors mention that more than 800 nuclear-encoded mitochondrial genes were reduced by >2-folds in CG1603 mutants. One gene related to mitochondrial replication and three genes related to mtDNA transcription were among them. Was TFAM one of these candidates? What were the reduction levels of TFAM mRNA in RNA seq results? Can the author confirm it via RT-PCR?

      In RNA-seq analyses, TFAM was differentially expressed with a log2 fold change of -0.74, corresponding to a ~1.6-fold decrease, and hence was not one of the candidates that were down-regulated more than two-fold in the CG1603 mutant. Per the reviewer’s suggestion, we carried out RT-PCR and found that TFAM was downregulated about 2-fold in the CG1603 mutant. We included this result in the revised Figure 6F and listed all differentially expressed genes in Supplementary file 5a.
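
      For readers who want to check the conversion, the short Python sketch below turns a log2 fold change into a linear fold decrease and tests it against a two-fold cutoff. The function names are illustrative, and the cutoff is assumed to mean log2 fold change of -1 or lower. A log2 fold change of -0.74 gives 2^0.74, roughly a 1.6- to 1.7-fold decrease, which is below the two-fold threshold.

```python
def fold_decrease(log2_fc):
    """Convert a (negative) log2 fold change into a linear fold decrease."""
    return 2 ** (-log2_fc)


def passes_two_fold_down(log2_fc):
    """True if expression is reduced at least two-fold (log2 FC <= -1)."""
    return log2_fc <= -1.0


if __name__ == "__main__":
    tfam_log2_fc = -0.74  # value quoted in the response above
    print(round(fold_decrease(tfam_log2_fc), 2))
    print(passes_two_fold_down(tfam_log2_fc))
```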

      (2) In many places, the authors argued about the role of CG1603 in ETC biogenesis. Also, the RNA-seq data shows that 64 genes related to the ETC complex were reduced by > 2-fold in CG1603 mutant. Therefore, it would be critical to expand a little on this aspect. For example, what are these genes and related to which of the ETC complex? Can the authors show the reduced levels of some of the candidate genes from each complex via RT-PCR?

      We listed all ETC genes that were down-regulated more than two-fold in the CG1603 mutant in a separate sheet in Supplementary file 5b. We further validated the reduced expression of ETC genes by RT-PCR on three randomly selected candidate genes from each complex. The result is included in the revised Figure 6F.

      (3) To make their argument solid on the role of CG1603 on ETC biogenesis, it is important to show the assembly/integrity of ETC complexes as well as the functionality/activity of the ETC complexes in CG1603 mutants.

      We purified mitochondria and assayed the assembly/integrity of three ETC complexes (Complex I, II and IV) and their activities, using blue native PAGE analysis and in-gel activity analysis, respectively. The amounts of these three complexes and, accordingly, their activities were all markedly reduced in the CG1603 mutant compared to wt. The result is included as Figure 4—figure supplement A.

      (4) CG1603 has already been named as cliff. Why do the authors not use this name, or alternatively propose one?

      We thank the reviewer for the note. CG1603 had not yet been named cliff when we were preparing this manuscript.

      (5) In lines 230-231, based on the TFAM-GFP and SDHA-mNG levels, the authors claim that "these results demonstrate that CG1603 promotes the expression of both nuclear and mtDNA-encoded ETC genes..." The authors may tone down this statement since it sounds overstating. It would be prudent to claim that a subset of genes are regulated by CG1603.

      We appreciate the reviewer’s suggestion. We revised the text to tone down this statement (page 8, line 201; page 9, line 229-230).

    1. Author response:

      Reviewer #1:

      Weaknesses:

      However, given that S1P is upstream NF-κB signaling, it is unclear if it offers conceptual innovations as compared to previous studies from the same team (Palazzo et al. 2020; 2022, 2023)

      We find distinct differences between the impacts of S1P- and NFκB-signaling on glial activation, neuronal differentiation of the progeny of MGPCs and neuronal survival in damaged retinas. In the current study we demonstrate that 2 consecutive daily intravitreal injections of S1P selectively activated mTor (pS6) and Jak/Stat3 (pStat3), but not MAPK (pERK1/2) signaling in Müller glia. Further, inhibition of S1P synthesis (SPHK1 inhibitor) decreased ATF3, mTor (pS6) and pSmad1/5/9 levels in activated Müller glia in damaged retinas. Inhibition of NFκB-signaling in damaged chick retinas did not impact the above-mentioned cell signaling pathways (Palazzo et al., 2020). Thus, S1P-signaling impacts cell signaling pathways in MG that are distinct from NFκB, but we cannot exclude the possibility of cross-talk between NFκB and these pathways. Further, inhibition of NFκB-signaling potently decreases numbers of dying cells and increases numbers of surviving ganglion cells (Palazzo et al., 2020). Consistent with these findings, a TNF orthologue, which presumably activates NFκB-signaling, exacerbates cell death in damaged retinas (Palazzo et al., 2020). By contrast, 5 different drugs targeting S1P-signaling had no effect on numbers of dying cells and only one S1PR1 inhibitor modestly decreased numbers of dying cells (current study). In addition, inhibition of NFκB does not influence the neurogenic potential of MGPCs in damaged chick retinas (Palazzo et al., 2020), whereas inhibition of S1P receptors (S1PR1 and S1PR3) and inhibition of S1P synthesis (SPHK1) significantly increased the differentiation of amacrine-like neurons in damaged retinas (current study). Collectively, in comparison to the effects of pro-inflammatory cytokines and NFκB-signaling, our current findings indicate that S1P-signaling through S1PR1 and S1PR3 in Müller glia has distinct effects upon cell signaling pathways, neuronal regeneration and cell survival in damaged retinas.

      We will revise text in the Discussion to better highlight these important distinctions between NFκB- and S1P-signaling.

      Reviewer #2:

      Weaknesses:

      The methodology is not very clean. A number of drugs (inhibitors/ antagonists/agonists signal modulators) are used to modulate S1P expression or signaling in the retina without evidence that these drugs are reaching the target cells. No alternative evaluation if the drugs, in fact, are effective. The drug solubility in the vehicle and in the vitreous is not provided, and how did they decide on using a single dose of each drug to have the optimal expected effect on the S1P pathway?

      Müller glia are the predominant retinal cell type that expresses S1P receptors. Consistent with these patterns of expression, we report Müller glia-specific effects of different agonists and antagonists that increase or decrease S1P-signaling. Since we compare cell-level changes within contralateral eyes wherein one retina is exposed to vehicle and the other is exposed to vehicle plus drug, it seems highly probable that the drugs are eliciting effects upon the Müller glia. It is possible that the responses we observed resulted from drugs acting on extra-retinal tissues, which might secondarily release factors that elicit cellular responses in Müller glia. However, this seems very unlikely given the distinct patterns of expression for different S1P receptors in Müller glia, and the outcomes of inhibiting Sphk1 or S1P lyase on retinal levels of S1P.

      For example, we provide evidence that S1PR1 and S1PR3 expression is predominant in Müller glia in the chick retina using single cell-RNA sequencing and fluorescence in situ hybridization (FISH). Thus, we expect S1PR1/3-targeting small-molecule inhibitors to act directly on Müller glia, which is consistent with our read-outs of cell signaling with injections of S1P in undamaged retinas. We show that SPHK1 and SGPL1, which encode the enzymes that synthesize or degrade S1P, are expressed by different retinal cell types, including the Müller glia. The efficacy of the drugs that target SPHK1 and SGPL1 was assessed by measuring levels of S1P in the retina. By using liquid chromatography and tandem mass spectrometry (LC-MS/MS), we provide data that inhibition of S1P synthesis (inhibition of SPHK1) significantly decreased levels of S1P in normal retinas, whereas inhibition of S1P degradation (inhibition of SGPL1) increased levels of S1P in damaged retinas (Fig. 5). These data suggest that the SPHK1 inhibitor and the SGPL1 inhibitor specifically act at the intended target to influence retinal levels of S1P. Further, inhibition of SPHK1 (to decrease levels of S1P) results in decreased levels of ATF3, pS6 (mTor) and pSMAD1/5/9 in Müller glia, consistent with the notion that reduced levels of S1P in the retina impact signaling at Müller glia. Finally, we find similar cellular responses to chemically different agonists or antagonists, and we find opposite cellular responses to agonists and antagonists, which are expected to be complementary if the drugs are specifically acting at the intended targets in the retina. We will revise the Discussion to better address caveats and concerns regarding the actions and specificity of different drugs within the retina following intravitreal delivery.

      We will provide the drug solubility specifications and estimates of the initial maximum dose per eye for each drug. For chick eyes between P7 and P14, these estimates will assume a volume of about 100 µl of liquid vitreous, 800 µl gel vitreous and an average eye weight of 0.9 grams. We will revise Table 1 (pharmacological compounds) with ranges of reported in vivo ED50’s (mg/kg) for drugs and we will list the calculated initial maximum dose (mg/kg equivalent per eye). Doses were chosen based on estimates of the initial maximum ocular dose that were within the range of reported ED50’s. However, as is the case for any in vivo model system, it is difficult to predict rates of drug diffusion out of the vitreous, how quickly the drugs are cleared from the entire eye, how much of the compound enters the retina, and how quickly the drug is cleared from the retina. Accordingly, we assessed drug specificity and sites of activation by relying upon readouts of cell signaling pathways, parsed with S1P receptor expression patterns, together with measurements of retinal levels of S1P following exposure to drugs targeting enzymes that catalyze synthesis or degradation of S1P, as described above.
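
      The kind of estimate described above can be sketched in a few lines of Python. The exact calculation is not given in this response, so the sketch assumes that the mg/kg equivalent is the injected mass divided by the eye weight, and that the initial vitreous concentration is the injected mass divided by the combined liquid and gel vitreous volume. The injected masses in the example are hypothetical; only the 100 µl, 800 µl and 0.9 g figures come from the text.

```python
def dose_mg_per_kg_equivalent(dose_ug, eye_weight_g=0.9):
    """Injected dose expressed as mg per kg of eye tissue (assumed formula)."""
    dose_mg = dose_ug / 1000.0       # micrograms -> milligrams
    eye_kg = eye_weight_g / 1000.0   # grams -> kilograms
    return dose_mg / eye_kg


def initial_vitreous_conc_ug_per_ul(dose_ug, liquid_ul=100.0, gel_ul=800.0):
    """Initial concentration if the dose mixed evenly through the vitreous."""
    return dose_ug / (liquid_ul + gel_ul)


if __name__ == "__main__":
    dose_ug = 0.9  # hypothetical injected mass, not a value from the study
    print(dose_mg_per_kg_equivalent(dose_ug))
    print(initial_vitreous_conc_ug_per_ul(dose_ug))
```

      These are upper-bound starting values only; as noted above, diffusion out of the vitreous and clearance from the eye and retina are not captured by such a calculation.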

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Shrestha et al report an investigation of mechanisms underlying gustatory preference for carboxylic acids in Drosophila. They begin with a screen of selected IR mutants, identifying 5 candidates - 2 IR co-receptors and 3 other IRs - whose loss of function causes defects in feeding preference for one or more of the three tested carboxylic acids. The requirement for IR51b, IR94a, and IR94h in carboxylic acid responses is evaluated in more detail using behavior, electrophysiology (labellar sensilla), and calcium imaging (pharyngeal neurons). The behavioral valence of IR94a and IR94h neurons is assessed using optogenetics. Overall the study uses a variety of approaches to test and validate the requirement of IRs in pharyngeal carboxylic acid taste.

      Strengths:

      The involvement of the identified IRs in gustatory responses to carboxylic acids is very clear from this study. The authors use mutants and transgenic rescue experiments and evaluate outcomes using electrophysiology, behavior, and imaging. Complementary approaches of loss-of-function and artificial activation support the main conclusion that the identified pharyngeal neurons sense carboxylic acids and convey a positive behavioral valence.

      Weaknesses:

      Some aspects of expression analysis and calcium imaging need to be clarified to better support the conclusions.

      (1) The conclusion of two parallel IR-mediated pathways rests on expression analysis of Ir94a-GAL4 and Ir94h-GAL4 lines and the observation that Ir51b expression driven by either can rescue the Ir51b mutant phenotype. However, the expression analysis is not as rigorous as it needs to be for such a conclusion. Prior work found co-expression of Ir94a and Ir94h in the LSO. Here, the co-expression of the two drivers has not been examined, and Ir94a-GAL4 does not appear to be expressed in the LSO. Given the challenges in validating expression patterns in pharyngeal organs, the possibility that the drivers do not entirely capture endogenous expression cannot be ruled out. Rescue experiments using feeding preference or single-cell imaging don't suffice as validation. Plus, the expression of Ir51b could not be defined.

      Based on current literature, Ir94a and Ir94h exhibit distinct expression patterns localized to different sensory regions. Specifically, Ir94a is primarily expressed in the V5 region of the VCSO, where it co-localizes with Ir94c-GAL4 (Chen et al., 2017). Conversely, Ir94h is found in the L7-7 sensilla of the LSO, where it co-expresses with Ir94f, and also within the V2 cells of the VCSO. Notably, the projections of Ir94a and Ir94h into the dorso-anterior subesophageal ganglion suggest divergent expression patterns rather than co-expression in the pharyngeal regions (Koh et al., 2014). Regarding co-expression of Ir94a and Ir94h in the LSO, we did not find any evidence to support this claim. Our data reinforce this view, showing that Ir94a-GAL4 expression is limited to the VCSO, while Ir94h-GAL4 is present in both the LSO and VCSO. Thus, the notion of co-expression of Ir94a and Ir94h in the LSO is not substantiated by current evidence.

      As the reviewer suggested, it is possible that the GAL4 drivers utilized may not fully reflect the endogenous expression of these receptors. Despite this limitation, our behavioral, expression, and physiological analyses strongly suggest that Ir94a and Ir94h are located in distinct regions, supporting a model of two parallel IR-mediated pathways operating within the sensory system.

      In addition, RT-PCR analysis confirmed the presence of Ir51b. However, due to methodological constraints, we were unable to conduct cell-type-specific expression studies using Ir51b-GAL4. This limitation, which we have acknowledged in the manuscript, does not detract from our core findings but highlights an area for future research. Further studies utilizing cell-specific expression analysis and co-expression studies with additional drivers could offer more definitive insights into IR51b’s functional role and its interactions within broader IR-mediated pathways.

      (2) The description of methods and results for the ex vivo calcium imaging is not satisfactory. Details about which cells are being analyzed, and in which organs are not included. No solvent stimulus is tested. The temporal dynamics of the responses are not presented. Movies of the imaging are not included as supplementary information - it would be important to visualize those with what was considered modest movement.

      We appreciate this valuable feedback. As discussed above, Ir94h is specifically expressed in the L7-7 sensilla of the LSO, while Ir94a is expressed in the V2 cells of the VCSO. This evidence led us to focus specifically on these cells in our calcium imaging study to ensure accuracy and relevance. In our experiments, adult hemolymph solution (AHL) (108 mM NaCl, 5 mM KCl, 8.2 mM MgCl2, 2 mM CaCl2, 4 mM NaHCO3, 1 mM NaH2PO4, 5 mM HEPES, pH 7.5) was used as the solvent and employed as a pre-stimulus (as mentioned in the Methods section). During this phase, we observed no changes in fluorescence, indicating that AHL itself did not influence the responses. Fluorescence changes occurred only when the test chemical, dissolved in AHL, was introduced. To further confirm that AHL had no impact on the results, we conducted continuous recordings with AHL alone before beginning our main experiments, and these trials confirmed the absence of fluorescence alterations. We have included the temporal dynamics and supplementary video recordings to provide a more comprehensive understanding of our findings.
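
      One common way to express such fluorescence changes relative to a solvent-only baseline is ΔF/F. The sketch below is a generic illustration and not the quantification used in the study: it assumes the first frames of a recording were acquired in AHL alone and uses their mean as the baseline. The trace values and the function name are hypothetical.

```python
def delta_f_over_f(trace, n_baseline):
    """Return the dF/F trace using the mean of the first n_baseline frames.

    The baseline frames are assumed to be recorded in solvent (AHL) only,
    before the test chemical is applied.
    """
    if n_baseline <= 0 or n_baseline > len(trace):
        raise ValueError("n_baseline must be between 1 and len(trace)")
    baseline = sum(trace[:n_baseline]) / n_baseline
    if baseline == 0:
        raise ValueError("baseline fluorescence must be non-zero")
    return [(f - baseline) / baseline for f in trace]


if __name__ == "__main__":
    # Hypothetical raw fluorescence: three AHL-only frames, then stimulus
    raw = [100.0, 100.0, 100.0, 150.0, 120.0]
    print(delta_f_over_f(raw, n_baseline=3))
```

      With this convention, a flat solvent-only recording stays near zero throughout, which is the pattern reported for AHL alone.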

      (3) The observed differences in phenotypes of Ir25a and Ir76b mutants are intriguing, as are those between the co-receptor mutants and Ir51b, Ir94a, and Ir94h, but have not been sufficiently considered. Prior studies have also found roles for other response modes (OFF response), other IRs and GRs, and other organs (labellum, tarsi) in behavioral responses to carboxylic acids. Overall, the authors' model may be overly simplistic, and the discussion does not do justice to how their model reconciles with the body of work that already exists.

      Stanley et al. (2021) reported that the gustatory detection of lactic acid requires both IRs and GRs functioning together. Specifically, they found that IR25a mediates the onset peak response (ON response) to lactic acid, while GRs dampen this response and contribute to a removal peak (OFF response). Interestingly, in Ir25a mutants, a small onset peak still occurred, while Gr64a-f mutants showed an enhanced onset, suggesting that IRs and GRs interact dynamically to modulate taste responses.

      In our previous work, we also observed the role of sweet GRs, in addition to Ir25a and Ir76b, in detecting carboxylic acids in the labellum (Shrestha et al., 2021). This raises the possibility of a similar interplay with carboxylic acids in our current study, where different IRs may contribute to distinct aspects of sensory responses in the pharynx, leading to the phenotypic differences we observed. Moreover, Chen et al. (2017) demonstrated that sour-sensing neurons in the tarsi express both IR76b and IR25a and specifically respond to carboxylic and inorganic acids without reacting to sweet or bitter compounds. This finding points to a specialized role for these receptors in sour detection and suggests a coordinated response involving multiple sensory organs—such as the labellum, tarsi, and pharynx.

      The phenotypic differences observed in our mutants align with a more integrated model of carboxylic acid detection, in which multiple receptors and sensory organs contribute to the overall behavioral response. This supports the idea that our current model offers a more detailed understanding of how different carboxylic acids are detected and processed by the gustatory system.

      Reviewer #2 (Public review):

      Shrestha et al investigated the role of IR receptors in the detection of 3 carboxylic acids in adult Drosophila. A low concentration of either of these carboxylic acids added to 2 mM sucrose (1% lactic acid (LA), citric acid (CA), or glycolic acid (GA)) stimulates the consumption of adult flies in choice conditions. The authors use this behavioral test to screen the impact of mutations within 33 receptors belonging to the IR family, a large family of receptors derived from glutamate receptors and expressed both in the olfactory and gustatory sensilla of insects. Within the panel of mutants tested, they observed that 3 receptors (IR25a, IR51b, and IR76b) impaired the detection of LA, CA, and GA, and that 2 others impacted the detection of CA and GA (IR94a and IR94h). Interestingly, impairing IR51b, IR94a, and IR94h did not affect the electrophysiological responses of external gustatory sensilla to LA, CA, and GA. Thanks to the use of GAL4 strains associated with these receptors and thanks to the use of poxn mutants (which do not develop external gustatory sensilla but still have functional internal receptors), they show evidence that IR94a and IR94h are only expressed in two clusters of gustatory neurons of the pharynx, respectively in the VCSO (ventral cibarial sense organ) and in the VCSO + LSO (labral sense organ). As for IR51b, the GAL4 approach was not successful but RT-PCR made on different parts of the insect showed an expression both in the pharyngeal organs and in peripheral receptors. These main findings are then complemented by a host of additional experiments meant to better understand the respective roles of IR94a and IR94h, by using optogenetics and brain calcium imaging using GCamp6. They also report a failed attempt to co-express IR51b, IR94a, and IR94h into external receptors, a co-expression which did not confer the capability of bitter-sensitive cells (expressing GR33a-GAL4) to detect either of the carboxylic acids. 
      These data complete and expand previous observations made by this group and others, and point to 2 new IR receptors that show an unexpectedly specific expression in organs that remain difficult to study.

      The conclusions of this paper are supported by the data presented, but it remains difficult to make general conclusions as concerns the mechanisms by which carboxylic acids are detected.

      (1) All experiments were done with 1% of carboxylic acids. What is the dose dependency of the behavioral responses to these acids, and is it conceivable that other receptors are involved at other concentrations?

      In our study, we conducted experiments to examine the dose dependency of behavioral responses to carboxylic acids, with results presented in Supplementary Figure 1. We found that lower concentrations of carboxylic acids are perceived as attractive, while higher concentrations are aversive. This differential response suggests that the receptors identified in our study are primarily tuned to detect low concentrations of these acids. Since higher concentrations elicited aversive responses, it is plausible that additional receptors, beyond the scope of our study, may be involved in sensing these higher concentrations. These receptors could be part of other gustatory receptor neurons that respond specifically to increased acid levels, as fruit flies tend to avoid higher concentrations. We propose that future research could investigate these alternative pathways to gain a complete understanding of the behavioral responses to carboxylic acids. In summary, our findings suggest that specific receptors are involved in detecting low concentrations, while distinct receptor pathways—possibly mediated by other GRNs—may regulate responses to higher concentrations.

      (2) One result needs to be better discussed and hypotheses proposed - which is why the mutations of most receptors lead to a loss of detection (mutant flies become incapable of detecting the acid) while mutations in IR94a and IR94h make CA and GA potent deterrents. Does it mean that CA and GA are detected by another set of receptors that, when activated, make flies actively avoid CA and GA? In that case, do the authors think that testing receptors one by one is enough to uncover all the receptors participating in the detection of these substances?

      As we mentioned above, it is possible that distinct receptor pathways mediate avoidance of GA and CA. This suggests that CA and GA might activate different sets of receptors that trigger avoidance behavior, pointing to a more complex interplay of receptor activity than we initially considered. Certain acids may indeed be detected by multiple receptors, with each receptor contributing uniquely to the behavioral response. Regarding the sufficiency of testing receptors individually, we recognize the limitations of this approach. Examining receptors one by one may not reveal the full spectrum of receptors involved, especially due to potential interactions or compensatory mechanisms that only emerge when certain receptors are inactive. Therefore, a more holistic approach—such as genetic screens for behavioral responses or using complex genetic models to disrupt multiple receptors simultaneously—could provide deeper insights. Moving forward, incorporating receptor interactions that modulate each other, along with more comprehensive assays, could help explain these discrepancies by uncovering previously overlooked receptor functions.

      (3) The paper needs to be updated with a recent paper published by Guillemin et al (2024), indicating that LA is detected externally by a combination of IR94e, IR76b and IR25a. IR25a might help to form a fully functional receptor in GR33a neurons (a former study from Chen et al (2017) indicates that IR25a is expressed in all gustatory neurons of the pharynx).

      According to Guillemin et al. (2024), the combination of IR94e, IR76b, and IR25a is required for amino acid detection but not for detecting lactic acid (LA). In their calcium imaging experiments, 100 mM LA elicited a response similar to the vehicle control, suggesting that these receptors do not play a role in LA detection.

      (4) Although it was not the main focus of the paper, it would have been most interesting if the cells expressing IR94a and IR94h were identified, and placed on the functional map proposed by the group of Dahanukar (Chen et al 2017 Cell Reports, Chen et al 2019 Cell Reports).

      The expression patterns of IR94a and IR94h were previously detailed by Chen et al. (2017), showing that IR94h is expressed in the labral sense organ (LSO, specifically in L7-7) and the ventral cibarial sense organ (VCSO, V2), while IR94a is expressed in the VCSO (V5). Given this established information, we referenced these known expression patterns without replicating the mapping in our study. Our primary focus was to investigate the functional role of these neurons within the pharynx, and we believe we have successfully highlighted their specific contributions. However, we recognize that integrating the functional mapping of these neurons in alignment with the work of Dahanukar’s group would have strengthened our findings and provided a more comprehensive understanding. We acknowledge this as a limitation of our study and appreciate your suggestion, as it points to a valuable direction for future research.

      Reviewer #3 (Public review):

      Summary:

      In this work, the authors investigated the molecular and cellular basis of sour taste perception in Drosophila melanogaster, focusing on identifying receptors that mediate attractive responses to certain carboxylic acids. It builds on previous work from the same group that had identified the IR co-receptors IR25a and IR76b for this sensory process, screening a set of mutants in IRs to identify three, IR51b, IR94a, and IR94h, required for feeding preference responses to some or all of the tested acids.

      Strengths:

      The work is of interest because it assigns sensory roles to IRs of previously unknown function, in particular IR94a and IR94h, and points to pharyngeal neurons in which these receptors are expressed as the relevant sensory neurons (potentially with different roles for IR94a- and IR94h-expressing neurons). The work combines elegant genetics, simple but effective feeding and taste assays, chemo-/opto-genetic activation, and some calcium imaging. Overall the presented data look solid and well-controlled.

      Weaknesses:

      The in situ expression analysis relies entirely on transgenic driver lines for IR94a and IR94h (which had been previously described, though not fully cited in this work). Importantly, given that many of the behavioral experiments (genetic rescue, physiology, artificial activation) use the IR94a and IR94h GAL4 driver lines, it would be helpful to validate that these faithfully reflect IR94a and IR94h expression (as far as I can tell, such validation wasn't done in the original papers describing these lines as part of a large collection of IR drivers). For IR51b, pharyngeal expression is concluded indirectly from non-quantitative RT-PCR analysis (genetic reporters did not work). The lack of direct detection of gene/protein expression (for example, through RNA FISH, immunofluorescence, or protein tagging) would have made for a more complete characterization of these receptors (for example, there is no direct evidence that they also express IR25a and IR76b, as one might expect). Finally, the relationship of IR94a and IR94h neurons to other types of pharyngeal neurons remains unclear, as are their projection patterns in the SEZ.

      Conceptually, the work is of interest mostly to those in the immediate field; there have been a very large number of studies in the past decade (several from this lab) characterizing the contributions of different IRs to various chemosensory processes. The current work doesn't lend much insight into the nature of the minimal functional unit of gustatory IRs (reconstitution of a functional IR in a heterologous neuron/cell has not been achieved here, but this is a limitation of many other previous studies), nor to how different pharyngeal sensory pathways might collaborate to control behavior. Nevertheless, the findings provide a useful contribution to the literature.

      We appreciate your thoughtful feedback. As noted in our response, our primary objective was to investigate the sensory functions of IR94a and IR94h. To this end, we conducted behavioral assays, which we validated with additional approaches including genetic rescue, physiological tests, and artificial activation. Throughout these experiments, we extensively utilized Ir94a- and Ir94h-GAL4 driver lines. To ensure these lines accurately reflect the expression of IR94a and IR94h, we verified their expression patterns using immunohistochemistry across various body parts. Our results align with previous findings that show both receptors are exclusively expressed in the pharynx. Regarding IR51b, we employed RT-PCR due to its high sensitivity and specificity, which supported our hypothesis. Nonetheless, we agree that more direct detection methods would have provided a stronger validation of IR51b expression. Our previous study (Sang et al., 2024) also demonstrated the pharyngeal expression of co-expressed receptors, specifically IR25a and IR76b. However, we recognize that the lack of direct evidence for their co-expression with IR51b remains a significant gap. This limitation primarily stems from the unavailability of specific reagents needed for direct assays targeting IR51b, which restricted our experimental approach.

      You also raised the potential relationship between IR94a and IR94h neurons and other pharyngeal neuron types, including their projection patterns in the subesophageal zone. This is indeed an important area for future research that could clarify neural connectivity and further our understanding of sensory mechanisms. However, our study was focused on exploring sensory mechanisms in peripheral regions rather than detailed neural mapping in the SEZ. Investigating these connections would undoubtedly provide valuable insights into the neural circuitry involved and represents an intriguing direction for future research.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Syngnathid fishes (seahorses, pipefishes, and seadragons) present very particular and elaborated features among teleosts and a major challenge is to understand the cellular and molecular mechanisms that permitted such innovations and adaptations. The study provides a valuable new resource to investigate the morphogenetic basis of four main traits characterizing syngnathids, including the elongated snout, toothlessness, dermal armor, and male pregnancy. More particularly, the authors have focused on a late stage of pipefish organogenesis to perform single-cell RNA-sequencing (scRNA-seq) completed by in situ hybridization analyses to identify molecular pathways implicated in the formation of the different specific traits. 

      The first set of data explores the scRNA-seq atlas composed of 35,785 cells from two samples of gulf pipefish embryos that authors have been able to classify into major cell types characterizing vertebrate organogenesis, including epithelial, connective, neural, and muscle progenitors. To affirm identities and discover potential properties of clusters, authors primarily use KEGG analysis that reveals enriched genetic pathways in each cell type. While the analysis is informative and could be useful for the community, some interpretations appear superficial and data must be completed to confirm identities and properties. Notably, supplementary information should be provided to show quality control data corresponding to the final cell atlas including the UMAP showing the sample source of the cells, violin plots of gene count, UMI count, and mitochondrial fraction for the overall dataset and by cluster, and expression profiles on UMAP of selected markers characterizing cluster identities.

      We thank the reviewer for these suggestions, and have added several figures and supplemental files in response. We added a supplemental UMAP showing the sample from which each cell originated (S1). We also added supplemental violin plots for each sample showing the gene count, unique molecular identifier (UMI) count, mitochondrial fraction, and the doublet scores (S2). We added feature plots of zebrafish marker genes for these major cell types and marker genes identified from our dataset to the supplement (S3:S57). We also provided two supplemental files with marker genes. These changes should clarify the work that went into labeling the clusters. Although some of the cluster labels are general, we decided it would be unwise to label clusters with speculative specific annotations. We only gave specific annotations to clusters with concrete markers and/or in situ hybridization (ISH) results that cemented an annotation. As shown in the new supplemental figures and files, certain clusters had clear, specific markers while others did not. Therefore, we used caution when we annotated clusters without distinct markers.
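
      For readers unfamiliar with the quality-control metrics named above, the following is a minimal sketch of how the per-cell gene count, UMI count, and mitochondrial fraction can be computed from a count table. It is an illustration only and not the authors' pipeline: the function name `qc_metrics`, the toy counts, and the `mt-` prefix used to flag mitochondrial genes are all assumptions made for this example.

```python
from typing import Dict


def qc_metrics(
    counts: Dict[str, Dict[str, int]], mito_prefix: str = "mt-"
) -> Dict[str, Dict[str, float]]:
    """Return per-cell gene count, UMI count, and mitochondrial fraction.

    `counts` maps a cell barcode to a {gene: UMI count} dictionary.
    `mito_prefix` is an assumed naming convention for mitochondrial genes.
    """
    metrics = {}
    for cell, gene_counts in counts.items():
        # Total number of UMIs observed in this cell.
        n_umis = sum(gene_counts.values())
        # Number of genes with at least one UMI.
        n_genes = sum(1 for c in gene_counts.values() if c > 0)
        # UMIs assigned to genes flagged as mitochondrial.
        mito_umis = sum(
            c for g, c in gene_counts.items() if g.lower().startswith(mito_prefix)
        )
        metrics[cell] = {
            "n_genes": n_genes,
            "n_umis": n_umis,
            "mito_fraction": mito_umis / n_umis if n_umis else 0.0,
        }
    return metrics


# Hypothetical toy data, not taken from the pipefish atlas.
toy_counts = {
    "cell_1": {"mt-co1": 5, "sox10": 3, "col2a1": 2},
    "cell_2": {"mt-co1": 0, "sox10": 4, "col2a1": 0},
}
toy_metrics = qc_metrics(toy_counts)
```

      Violin plots such as those described for S2 summarize the distribution of these three values across cells, per sample or per cluster.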

      The second set of data aims to correlate the scRNA-seq analysis with in situ hybridizations (ISH) in two different pipefish (gulf and bay) species to identify and characterize markers spatially, and validate cell types and signaling pathways active in them. While the approach is rational, the authors must complete the data and optimize labeling protocols to support their statements. One major concern is the quality of ISH stainings and images; embryos show a high degree of pigmentation that could hide part of the expression profile, and only subparts and hardly detectable tissues/stainings are presented. The authors should provide clear and good-quality images of ISH labeling on whole-mount specimens, highlighting the magnification regions and all other organs/structures (positive controls) expressing the marker of interest along the axis. Moreover, ISH probes have been designed and produced on gulf pipefish genome and cDNA respectively, while ISH labeling has been performed indifferently on bay or gulf pipefish embryos and larvae. The authors should specify stages and species on figure panels and should ensure sequence alignment of the probe-targeted sequences in the two species to validate ISH stainings in the bay pipefish. Moreover, spatiotemporal gene expression being a very dynamic process during embryogenesis, interpretations based on undefined embryonic and larval stages of pipefish development and compared to 3dpf zebrafish are insufficient to hypothesize on developmental specificities of pipefish features, such as on the absence of tooth primordia that could represent a very discrete and transient cell population. The ISH analyses would require a clean and precise spatiotemporal expression comparison of markers at the level of the entire pipefish and zebrafish specimens at well-defined stages, otherwise, the arguments proposed on teleost innovations and adaptations turn out to be very speculative. 

      We are appreciative of the reviewer’s feedback. We primarily used the in situ hybridization (ISH) data as supplementary to the scRNAseq library and we are aware that further evidence is necessary to identify the origins of syngnathids’ evolutionary novelties. Our goal was to provide clues for the developmental genetic basis of syngnathid-derived features. We hope that our study will inspire future investigations and are excited for the prospect that future research could include this reviewer’s ideas.

      All of the developmental stages and species information for the embryos used were in the figure captions as well as in supplemental file 6. Because we primarily used wild caught embryos, we did not have specific ages of most embryos. Syngnathid species are challenging to culture in the laboratory, and extracting embryos requires euthanizing the father which makes it difficult to obtain enough embryos for ISH. In addition, embryos do not survive long when removed from the brood pouch prematurely. We supplemented our ISH with bay pipefish caught off the Oregon coast because these fish have large broods. Wild caught pregnant male bay pipefish were immediately euthanized, and their broods were fixed. Because we did not have their age, we classified them based on developmental markers such as presence of somites and the extent of craniofacial elongation. Although these classification methods are not ideal, they are consistent with the syngnathid literature (Sommer et al. 2012). Since the embryos used for the ISH were primarily wild caught, we had a few different developmental stages represented in our ISH data. For our tooth primordia search, we used embryos from the same brood (therefore, same stage) for these experiments.

      We understand the concern about the degree of pigmentation in the samples. We completed numerous bleach trials before embarking on the in situ hybridization experiments. After completing a bleach trial with a probe created from the gene tnmd for ISH, we noticed that the bleached embryos were missing expression domains found in the unbleached embryos. We were, therefore, concerned that using bleached embryos for our experiments would result in incorrect conclusions about the expression domains of these genes. We used bleaching sparingly, at older stages (hatched larvae), where it was necessary in order to see staining. As stated above, the primary goal of this manuscript was to generate and annotate the first scRNA-seq atlas in a syngnathid, and the ISHs were utilized to support inferred cluster annotations only through a positive identification of marker gene expression in expected tissues/cells. Therefore, the obscuring of gene expression by pigmentation would have resulted in the absence of evidence for a possible cluster annotation, not an incorrect annotation.

      To make the ISHs easier to view, we improved the annotations and increased the brightness and contrast of the images. In the original submission, we had to lower the image resolution to make the submission file smaller. We hope that these improvements, together with the full-resolution images, improve the clarity of the ISH results. We also included alignments in our supplementary files of bay pipefish sequences to the Gulf pipefish probes to showcase the high degree of sequence similarity.
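
      As a rough illustration of how the sequence similarity between a probe and its target in a second species can be quantified, the sketch below computes percent identity over the non-gap columns of a pairwise alignment. It is not the authors' method: the function name `percent_identity` and the toy sequences are hypothetical, and the input is assumed to be already aligned, with `-` marking gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment, not real pipefish sequence.
toy_probe = "ATGGC-TTAGC"
toy_target = "ATGGCATTCGC"
toy_identity = percent_identity(toy_probe, toy_target)
```

      Other conventions exist (for example, counting gap columns in the denominator), so any reported identity should state which definition was used.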

      Sommer, S., Whittington, C. M., & Wilson, A. B. (2012). Standardised classification of pre-release development in male-brooding pipefish, seahorses, and seadragons (Family Syngnathidae). BMC Developmental Biology, 12, 12–15. 

      To conclude, whereas the scRNA-seq dataset in this unconventional model organism will be useful for the community, the spatiotemporal and comparative expression analyses have to be thoroughly pushed forward to support the claims. Addressing these points is absolutely necessary to validate the data and to give new insights to understand the extraordinary evolution of the Syngnathidae family. 

      We really appreciate the reviewer’s enthusiasm for syngnathid research, and hope that the additional files and explanation of the supporting role of the ISHs have adequately addressed their concerns. We share the reviewer’s enthusiasm and are excited for future work that can extend this study. 

      Reviewer #2 (Public Review):

      Summary: 

      The authors present the first single-cell atlas for syngnathid fishes, providing a resource for future evolution & development studies in this group. 

      Strengths: 

      The concept here is simple and I find the manuscript to be well written. I like the in situ hybridization of marker genes - this is really nice. I also appreciate the gene co-expression analysis to identify modules of expression. There are no explicit hypotheses tested in the manuscript, but the discovery of these cell types should have value in this organism and in the determination of morphological novelties in seahorses and their relatives.  

      We are grateful for this reviewer’s appreciation of the huge amount of work that went into this study, and we agree that the in situ hybridizations (ISHs) support the scRNAseq study as we intended. We appreciate that the reviewer thinks that this work will add value to the syngnathid field.

      Weaknesses: 

      I think there are a few computational analyses that might improve the generality of the results. 

      (1) The cell types: The authors use marker gene analysis and KEGG pathways to identify cell types. I'd suggest a tool like SAMap (https://elifesciences.org/articles/66747) which compares single-cell data sets from distinct organisms to identify 'homologous' cell types - I imagine the zebrafish developmental atlases could serve as a reasonable comparative reference. 

      We appreciate the reviewer’s request, and in fact we would have loved to integrate our dataset with zebrafish. However, syngnathids’ unique craniofacial development makes it challenging to determine the appropriate stage for comparison. While 3 days post fertilization (dpf) zebrafish data were appropriate for comparisons of certain cell types (e.g. epidermal cells), they would have been problematic for other cell types (e.g. osteoblasts) that are not easily detectable until older zebrafish stages. Therefore, determining equivalent stages between these species is difficult and error-prone. Future research should focus on trying to better match stages across syngnathids and zebrafish (and other fish species such as stickleback). Studies of this nature promise to uncover the role of heterochrony in the evo-devo of syngnathids’ unique snouts.

      (2) Trajectory analyses: The authors suggest that their analyses might identify progenitor cell states and perhaps related differentiated states. They might explore cytoTRACE and/or pseudotime-based trajectory analyses to more fully delineate these ideas.

      We thank the reviewer for this suggestion! We added a trajectory analysis using cytoTRACE to the manuscript. It complemented our KEGG analysis well (L172-175; S73) and has improved the manuscript.

      (3) Cell-cell communication: I think it's very difficult to identify 'tooth primordium' cell types, because cell types won't be defined by an organ in this way. For instance, dental glia will cluster with other glia, and dental mesenchyme will likely cluster with other mesenchymal cell types. So the histology and ISH is most convincing in this regard. Having said this, given the known signaling interactions in the developing tooth (and in development generally) the authors might explore cell-cell communication analysis (e.g., CellChat) to identify cell types that may be interacting. 

      We agree! It would have been a wonderful addition to the paper to include a cell-cell communication analysis. One limitation of CellChat is that it only includes mouse and human orthologs. Given concerns of reviewer #3 for mouse-syngnathid comparisons, we decided to not pursue CellChat for this study. We are looking forward to future cell communication resources that include teleost fishes.

      Reviewer #3 (Public Review): 

      Summary: 

      This study established a single-cell RNA sequencing atlas of pipefish embryos. The results obtained identified unique gene expression patterns for pipefish-specific characteristics, such as fgf22 in the tip of the palatoquadrate and Meckel's cartilage, broadly informing the genetic mechanisms underlying morphological novelty in teleost fishes. The data obtained are unique and novel, potentially important in understanding fish diversity. Thus, I would enthusiastically support this manuscript if the authors improve it to generate stronger and more convincing conclusions than the current forms. 

      Thank you, we appreciate the reviewer’s enthusiasm!

      Weaknesses: 

      Regarding the expression of sfrp1a and bmp4 dorsal to the elongating ethmoid plate and surrounding the ceratohyal: are their expression patterns spatially extended or broader compared to the pipefish ancestor? Is there a much closer species available to compare gene expression patterns with pipefish? Did the authors consider using other species closely related to pipefish for ISH? Sfrp1a and bmp4 may be expressed in the same regions of much more closely related species without face elongation. I understand that embryos of such species are not always accessible, but it is also hard to argue responsible genes for a specific phenotype by only comparing gene expression patterns between distantly related species (e.g., pipefish vs. zebrafish). Due to the same reason, I would not directly compare/argue gene expression patterns between pipefish and mice, although I should admit that mice gene expression patterns are sometimes helpful to make a hypothesis of fish evolution. Alternatively, can the authors conduct ISH in other species of pipefish? If the expression patterns of sfrp1a and bmp4 are common among fishes with face elongation, the conclusion would become more solid. If these embryos are not available, is it possible to reduce the amount of Wnt and BMP signal using Crispr/Cas, MO, or chemical inhibitor? I do think that there are several ways to test the Wnt and/or BMP hypothesis in face elongation. 

      We appreciate the reviewer’s suggestion, and their recognition of the challenges within this system. In response to this comment, we completed further in situ hybridization experiments in threespine stickleback, a short-snouted fish that is much more closely related to syngnathids than is zebrafish, to make comparisons with pipefish craniofacial expression patterns (S76-S79). We added ISH data for the signaling genes (fgf22, bmp4, and sfrp1a) as well as prdm16. Based on these additional ISH results, we speculate that craniofacial expression of bmp4, sfrp1a, and prdm16 is conserved across species. However, compared to the specific ceratohyal/ethmoid staining seen in pipefish, stickleback had broad staining throughout the jaws and gills. These data suggest that pipefish have co-opted existing developmental gene networks in the development of their derived snouts. We added this interpretation to the results and discussion of the manuscript (L244-L248; L262-277; L444-470).

      Recommendations for the authors:  

      Reviewing Editor (Recommendations for the Authors)

      We hope that the eLife assessment, as well as the revisions specified here, prove helpful to you for further revisions of your manuscript. 

      Revisions considered essential: 

      (1) Marker genes and single-cell dataset analyses. While these analyses have been performed to a good standard in broad terms, there is a majority view here that cell type annotations and trajectory analyses can be improved. In particular, there is a question about the choice of marker genes for the current annotation. For one, it can depend on the use of single marker genes (see tnnti1 example for clusters 17 and 31). Here, we recommend incorporating results from SAMap and trajectory analysis (e.g., cytoTRACE or standard pseudotime).

      Because of the reviewer comments, we became aware that we insufficiently communicated how cell clusters were annotated. We did mention in the manuscript that we did not use single marker genes to annotate clusters, but instead we used multiple marker genes for each cluster for the annotation process. We used both marker genes derived from our dataset and marker genes identified from zebrafish resources for cluster annotation. We chose single marker genes for each cluster for visualization purposes and for in situ hybridizations. However, it is clear from the reviewers’ comments that we needed to make more clear how the annotations were performed. To make this effort more clear in our revision, we included two new supplementary files – one with Seurat derived marker genes and one with marker genes derived from our DotPlot method. We also included extensive supplementary figures highlighting different markers. Using Daniocell, we identified 6 zebrafish markers per major cell type and showed their expression patterns in our atlas with FeaturePlots. We also included feature plots of the top 6 marker genes for each cluster. We hope that the addition of these 40+ plots (S3:S57) to the supplement fully addresses these concerns. 

      We appreciated the suggestion of cytoTRACE from reviewer #2! We ran cytoTRACE on three major cell lineages (neural, muscle, and connective; S73), which complemented our KEGG analysis in suggesting an undifferentiated fate for clusters 8, 10, and 16. We chose not to run SAMap because it is a scRNA-seq library integration tool. Although we compared our lectin epidermal findings to 3 dpf zebrafish scRNA-seq data, we did not integrate the datasets out of concern that we could draw erroneous conclusions for other cell types. Future work that explores this technical challenge may uncover the role of heterochrony in syngnathid craniofacial development. We detail these changes more fully in our responses to reviewers.

      (2) The claims regarding evolutionary novelty and/or the genes involved are considered speculative. In part, this comes from relying too heavily on comparisons against zebrafish, as opposed to more closely related species. For example, the discussion regarding C-type lectin expression in the epidermis and KEGG enrichment (lines 358 - 364) seems confusing. Another good example here is the discussion on sfrp1a (lines 258 - 261). Here, the text seems to suggest craniofacial sfrp1a expression (or specifically ethmoid expression?) is connected to the development of the elongated snout in pipefish. However, craniofacial expression of sfrp1a is also reported in the arctic charr, which the authors grouped into fishes with derived craniofacial structures. Separately, sfrp2 expression was also reported in stickleback fish, for example. Do these different discussions truly support the notion that sfrp1a expression is all that unique in pipefish, rather than that pipefish and zebrafish are only distantly related and that sfrp1a was a marker gene first, and co-opted gene second? The authors should respond to the comments in the public review related to this aspect, and include more informative comparison and discussion. 

      A much more nuanced discussion with appropriate comparisons and caveats would be strongly recommended here.  

      We appreciate this insight and used it as a motivator to complete and add select comparative ISH data to this manuscript. We added in situ hybridization experiments from stickleback fish for craniofacial development genes (sfrp1a, prdm16, bmp4, and fgf22; S76-S79). After adding stickleback ISH to the manuscript, we were able to make comparisons between pipefish and stickleback patterns and draw more informed conclusions (L244-L248; L262-277; L444-470). We added additional nuance to the discussion of the head, tooth (L485-489), and male pregnancy (L358-L391) sections to address concerns of study limitations. We describe these additional data in more detail in our responses to reviewers.

      (3) In situ hybridization results: as already included above, there is generally weak labeling of species, developmental stages, and other markings that can provide context. The collective feeling here is that as it is currently presented, the ISH results do not go too far beyond simply illustrative purposes. To take these results further, more detailed comparison may be needed. At a minimum, far better labeling can help avoid making the wrong impression. 

      Based on the reviewers’ comments, we made changes to improve ISH clarity and add select comparative ISH findings. ISH was used to further the interpretation of the scRNAseq atlas. All the developmental stages and species information for the embryos used were in the figure captions as well as in supplemental file 4. Since we primarily used wild caught embryos, we did not have specific ages of most embryos. The technical challenges of acquiring and staging Syngnathus embryos are detailed above. Because we did not have their age, we classified them based on developmental markers (such as presence of somites and the extent of craniofacial elongation). Although these classification methods are not ideal, they are consistent with the syngnathid literature (Sommer et al. 2012).

      We followed reviewer #1’s recommendations by adding an annotated graphic of a pipefish head, aligning bay and Gulf pipefish sequences for the probe regions, expanding out our supplemental figures for ISH into a figure for each probe, and improving labeling. These changes improved the description of the ISH experiments and have increased the quality of the manuscript.

      We would have loved to complete detailed comparative studies as suggested, but doing such a complete analysis was not feasible for this study. Therefore, we completed an additional focused analysis. We followed reviewer #3’s idea and added ISHs from threespine stickleback, a short snouted fish, for 4 genes (sfrp1a, prdm16, fgf22, and bmp4). While more extensive ISHs tracking all marker genes through a variety of developmental stages in pipefish and stickleback would have provided crucial insights, we feel that it is beyond the scope of this study and would require a significant amount of additional work. We, thus, primarily interpreted the ISH results as illustrative data points in our discussion. As we state in the response to reviewer 1, the generation and annotation of the first scRNA-seq atlas in a syngnathid is the primary goal of this manuscript.  The ISHs were utilized primarily to support inferred cluster annotations if a positive identification of marker gene expression in expected tissues/cells occurred. 

      Reviewer #1 (Recommendations For The Authors): 

      While the scRNA-seq dataset offers a valuable resource for evo-devo analyses in fish and the hypotheses are of interest, critical aspects should be strengthened to support the claims of the study. 

      Concerning the scRNA-seq dataset, the major points to be addressed are listed below: 

      - Supplementary file 3 reports the single markers used to validate cluster annotations. To confirm cluster identities, more markers specific to each cluster should be highlighted and presented on the UMAP. 

      We recognize the reviewer’s concern and had in fact used numerous markers to annotate the clusters. Based upon the reviewer’s comment, we decided to make this clear by creating feature plots for every cluster with the top 6 marker genes. These plots showcase gene specificity in UMAP space. We also added feature plots for zebrafish marker genes for key cell types. Through these changes and the addition of 54 supplementary figures (S3:S57), we hope that it is clear that numerous markers validated cluster identity.

      For example, as clusters 17 and 37 share the same tnnti1 marker, which other markers permit to differentiate their respective identity. 

      This is a fair point. Clusters 17 and 37 are both marked by a tnni1 ortholog. Different paralogous co-orthologs mark each cluster (cluster 17: LOC125989146; cluster 37: LOC125970863). In our revision in response to the above comment, six markers per cluster were highlighted, which should remedy this concern.

      - L146: the low number of identified cartilaginous cells (only 2% of total connective tissue cells) appears aberrant compared to bone cell number, while Figure 1 presents a well-developed cartilaginous skeleton with poor or no signs of ossification. Please discuss this point.

      We also found this to be interesting and added a brief discussion on this subject to the results section (L147-L149). Single cell dissociations can have variable success for certain cell types. It is possible that the cartilaginous cells were more difficult to dissociate than the osteoblast cells.

      - L162: pax3a/b are not specific to muscle progenitors as the genes are also expressed in the neural tube and neural crest derivatives during organogenesis. Please confirm cluster 10 identity.  

      Thank you for the reminder. We added numerous feature plots that explored zebrafish markers (from Daniocell) and pipefish markers (identified in our dataset). Examining zebrafish satellite muscle markers (myog, pabpc4, and jam2a) shows a strong correspondence with cluster #10.

      - L198: please specify in the text the pigment cell cluster number. 

      We completed this change.

      - L199: it is not clear why considering module 38 correlated to cluster 20 while modules 2/24 appear more correlated according to the p-value color code. 

      We thank the reviewer for pointing this confusing element out! Although the t-statistic value for module 38 (3.75) is lower than the t-statistics for modules 2 and 24 (5.6 and 5.2, respectively), we chose to highlight module 38 for its ‘connectivity dependence’ score. In our connectivity test, we examined whether removing cells from a specific cell cluster reduced the connectivity of a gene network. We found that removing cluster 20 led to a decrease in module 38’s connectivity (-.13, p=0) while it led to an increase in modules 2 and 24’s connectivity (.145, p=1; .145, p=9.14; our original supplemental files 9-10). Therefore, the connectivity analysis showed that module 38’s structure was more dependent on cluster 20 than were the structures of modules 2 and 24. Although the reviewer highlighted an interesting quandary, we decided that this is tangential to the paper and did not add this discussion to the manuscript.
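      For illustration only, the logic of such a cluster-removal test can be sketched in Python. This is a minimal sketch under stated assumptions, not the analysis code used in the study: it assumes that a module's connectivity can be summarized as the mean absolute Pearson correlation between its genes, and the function names (pearson, module_connectivity, connectivity_change) and the toy data are hypothetical. It does not reproduce the p-values reported above.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def module_connectivity(expr):
    """Assumed connectivity summary: mean |r| over all gene pairs in the module.

    expr maps each gene name to a list of per-cell expression values.
    """
    pairs = list(combinations(expr, 2))
    return sum(abs(pearson(expr[a], expr[b])) for a, b in pairs) / len(pairs)


def connectivity_change(expr, cluster_labels, cluster):
    """Connectivity after dropping one cluster's cells minus connectivity with all cells.

    A negative value means the module loses connectivity without that cluster,
    i.e. its structure depends on those cells.
    """
    keep = [i for i, c in enumerate(cluster_labels) if c != cluster]
    reduced = {gene: [values[i] for i in keep] for gene, values in expr.items()}
    return module_connectivity(reduced) - module_connectivity(expr)


# Toy data: two genes co-vary only in the cells labelled 20.
labels = [20, 20, 20, 20, 1, 1, 1, 1]
expr = {
    "geneA": [10, 12, 14, 16, 1, 2, 1, 2],
    "geneB": [10, 12, 14, 16, 2, 2, 1, 1],
}
print(connectivity_change(expr, labels, 20))  # negative: module depends on cluster 20
print(connectivity_change(expr, labels, 1))   # positive: module does not depend on cluster 1
```

      In this toy example, the sign of the change plays the role of the ‘connectivity dependence’ score: a decrease after removal indicates dependence on the removed cluster.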

      - Please describe in the text Figure 4A. 

      Completed, we thank the reviewer for catching this! 

      Concerning embryo stainings, the major points to be addressed are listed below: 

      - Figure 1: please enhance the light/contrast of figures to highlight or show the absence of alcian/alizarin staining. Mineralized structures are hardly detectable in the head and slight differences can be seen between the two samples. The developmental stage should be added. Please homogenize the scale bar format (remove the unit on panels E and, G as the information is already in the text legend). It would be useful to illustrate the data with a schematic view of the structures presented in panels B, and E, and please annotate structures in the other panels.  

      We thank the reviewer for these suggestions to improve our figure. We increased the brightness and contrast for all our images. We also added an illustration of the head with labels of elements. As discussed, we used wild caught pregnant males and, therefore, do not know the exact age of the specimens. However, we described the developmental stage based on morphological observations. Slight differences in morphology between samples are expected. We and others have noticed that developmental rate varies, even within the same brood pouch, for syngnathid embryos. We observed several mineralization zones in the embryos, including the upper and lower jaws, the mes(ethmoid), and the pectoral fin. We recognize the cartilage staining is more apparent than the bone staining, though increasing image brightness and contrast did improve the visibility of the mineralization front.

      - All ISH stainings and images presented in Figures 4-6/ Figures S2-3 should be revised according to comments provided in the public review. 

      We thank the reviewer for providing thorough comments. We provided an in-depth response to the public review and made several improvements to the manuscript to address their concerns.

      - Figure 4: Figure 4B should be described before 4C in the text or inverse panels / L222 the Meckel's cartilage is not shown on Figure 4C. The schematic views in H should be annotated and the color code described / the ISH data must be completed to correlate spatially clusters to head structures. 

      We thank the reviewer for pointing this out. We fixed the issues with this figure and added annotations to the head schematics.

      - Figure 5: typo on panels 'alician' = alcian. 

      We completed this change. 

      - Figures S2-3: data must be better presented, polished / typo in captions 'relavant'= relevant. 

      Thank you for this critique. We created new supplementary figures to enhance interpretation of the data (S59-S71). In these new figures, we included a feature plot for each gene and the respective ISHs.

      - Figure S3: soat2 = no evidence of muscle marker neither by ISH presented nor in the literature. 

      We realized this staining was not clear with the previous S2/S3 figures. Our new changes in these supplementary figures based on the reviewer’s ideas made these ISH results clearer. We observed soat2 staining in the sternohyoideus muscle (panel B in S71).

      Other points: 

      - The cartilage/bone developmental state (Alcian/alizarin staining) and/or ISH for classical markers of muscle development (such as pax3/myf5) could be used to clarify the This could permit the completion of a comparative analysis between the two species and the interpretation of novel and adaptative characters.  

      We appreciate this idea! We thought deeply about a well-characterized comparative analysis between pipefish and zebrafish for this study. We discussed our concerns in our public response to reviewer 2. We found that it was challenging to stage match all cell types, and were concerned that we could make erroneous conclusions. For example, our pipefish samples were still inside the male brood pouch and possessed yolk sacs. However, we found osteoblast cells in our scRNAseq atlas and in alizarin staining. Although zebrafish literature notes that the first zebrafish bone appears at 3 dpf (Kimmel et al. 1995), osteoblasts were not recognized until 5 dpf in two scRNAseq datasets (Fabian et al. 2022; Lange et al. 2023). A 5 dpf zebrafish is considered larval and has begun hunting. Therefore, we chose not to integrate our data out of concern that osteoblast development may occur on different timelines in the two fishes.

      Fabian, P., Tseng, K.-C., Thiruppathy, M., Arata, C., Chen, H.-J., Smeeton, J., Nelson, N., & Crump, J. G. (2022). Lifelong single-cell profiling of cranial neural crest diversification in zebrafish. Nature Communications, 13(1), 1–13.

      Lange, M., Granados, A., VijayKumar, S., Bragantini, J., Ancheta, S., Santhosh, S., Borja, M., Kobayashi, H., McGeever, E., Solak, A. C., Yang, B., Zhao, X., Liu, Y., Detweiler, A. M., Paul, S., Mekonen, H., Lao, T., Banks, R., Kim, Y.-J., … Royer, L. A. (2023). Zebrahub – Multimodal Zebrafish Developmental Atlas Reveals the State-Transition Dynamics of Late-Vertebrate Pluripotent Axial Progenitors. BioRxiv, 2023.03.06.531398.

      Kimmel, C., Ballard, S., Kimmel, S., Ullmann, B., Schilling, T. (1995). Stages of Embryonic Development of the Zebrafish. Developmental Dynamics, 203, 253–310.

      'in situs' in the text should be replaced by 'in situ experiments'.  

      We made this change (L395, L663, L666, L762).

      - Lines 562-565: information on samples should be added at the start of the result section to better apprehend the following scRNA-seq data.

      We thank the reviewer for pointing out this issue. Although we had a few sentences on the samples in the first paragraph of the result section, we understand that it was missing some critical pieces of information. Therefore, we added these additional details to the beginning of the results section (L126-L132). 

      - Lines 629-665: PCR with primers designed on gulf pipefish genome could be performed in parallel on bay and gulf cDNA libraries, and amplification products could be sequenced to analyze alignment and validate the use of gulf pipefish ISH probes in bay pipefish embryos. Probe production could also be performed using gulf primers on bay pipefish cDNA pools. 

      After the submission of this manuscript, a bay pipefish genome was prepared by our laboratory. We used this genome to align our probes; these alignments demonstrate strong sequence conservation between the species. We included these alignments in our supplemental files.

      - L663: the bleaching step must be optimized on pipefish embryos. 

      We understand this concern and had completed several bleach optimization experiments prior to publication. Although we found that bleaching improved visibility of staining, we noticed with the probe tnmd that bleached embryos did not have complete staining of tendons and ligaments. The unbleached embryos had more extensive staining than the bleached embryos. We were concerned that bleaching would lead to failures to detect expression domains (false negatives) important for our analysis. Therefore, we did not use bleaching in our in situ experiments (except with hatched fish with a high degree of pigmentation).

      - Indicate the number of specimens analyzed for each labeling condition.  

      We thank the reviewer for noticing this issue. We added this information to the methods (L766-767).

      - Describe the fixation and pre-treatment methods previous to ISH and skeleton stainings

      We thank the reviewer for pointing out this issue, we added these descriptions (L765-766; L772-774). 

      Reviewer #3 (Recommendations For The Authors): 

      (1) If sfrp1a expression is observed also in other fish species with derived craniofacial structures, it's important to discuss this more in the Discussion. This could be a common mechanism to modify craniofacial structures, although functional tests are ultimately required (but not in this paper, for sure). Can lines 421-428 involve the statement "a prolonged period of chondrocyte differentiation" underlies craniofacial diversity?

      This is a great idea, and we added a sentence that captures this ethos (L451-452).

      (2) Lines 334-346 need to be rephrased. It's hard to understand which genes are expressed or not in pipefish and zebrafish. Did "23 endocytosis genes" show significant enrichment in zebrafish epidermis, or are they expressed in zebrafish epidermis? 

      We thank the reviewer for this comment, we re-phrased this section for clarity (L365-368).

      (3) Figure 4 is missing the "D" panel and two "E" panels. 

      We thank the reviewer for noticing this, we fixed this figure.

      (4) Line 302: "whole-mount" or "whole mount"

      We thank the reviewer for the catch!

    1. Author response:

      Reviewer #1 (Public review):

      Comment 1: In the Results section, the rationale behind selecting the beta band for the central (C3, CP3, Cz, CP4, C4) regions and the theta band for the fronto-central (Fz, FCz, Cz) regions is not clearly explained in the main text. This information is only mentioned in the figure captions. Additionally, why was the beta band chosen for the S-ROI central region and the theta band for the S-ROI fronto-central region? Was this choice influenced by the MVPA results?

      We thank the reviewer for the question regarding the rationale for the S-ROI selection in our study. The beta band was chosen for the central region due to its established relevance in motor control (Engel & Fries, 2010), movement planning (Little et al., 2019) and motor inhibition (Duque et al., 2017). The fronto-central theta band (or frontal midline theta) was a widely recognized indicator in cognitive control research (Cavanagh & Frank, 2014), associated with conflict detection and resolution processes. Moreover, recent empirical evidence suggested that the fronto-central theta reflected the coordination and integration between stimuli and responses (Senoussi et al., 2022). Although we have described the cognitive processes linked to these different frequencies in the introduction and discussion sections, along with the potential patterns of results observed in Stroop-related studies, we did not specify the involved cortical areas. Therefore, we have specified these areas in the introduction to enhance the clarity of the revised version (in the fourth paragraph of the Introduction section).

      Regarding whether the selection of S-ROIs was influenced by the MVPA results, we would like to clarify here that we selected the S-ROIs based on prior research and then conducted the decoding analysis. Specifically, we first extracted the data representing different frequency indicators (three F-ROIs and three S-ROIs) as features, followed by decoding to obtain the MVPA results. Subsequently, the time-frequency analysis, combined with the specific time windows during which each frequency was decoded, provided detailed interaction patterns among the variables for each indicator. The specifics of feature selection are described in the revised version (in the first paragraph of the Multivariate Pattern Analysis section).

      Comment 2: In the Data Analysis section, line 424 states: “Only trials that were correct in both the memory task and the Stroop task were included in all subsequent analyses. In addition, trials in which response times (RTs) deviated by more than three standard deviations from the condition mean were excluded from behavioral analyses.” The percentage of excluded trials should be reported. Also, for the EEG-related analyses, were the same trials excluded, or were different criteria applied?

      We thank the reviewer for this suggestion. Beyond the behavioral exclusion criteria, trials with EEG artifacts were also excluded from the data for the EEG-related analyses. We have now reported the percentage of excluded trials for both behavioral and EEG data analyses in the revised version (in the second paragraph of the EEG Recording and Preprocessing section and the first paragraph of the Behavioral Analysis section).
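      As an illustration of the behavioral exclusion rule quoted in the comment, the following minimal Python sketch keeps trials that were correct in both tasks and whose RT lies within three standard deviations of their condition mean, and then reports the percentage of excluded trials. It is a sketch under stated assumptions rather than the analysis code used in the study: the trial fields (condition, rt, memory_correct, stroop_correct) and function names are hypothetical, the sample standard deviation is assumed, and EEG artifact rejection is not modeled.

```python
from collections import defaultdict
from statistics import mean, stdev


def exclude_rt_outliers(trials, n_sd=3.0):
    """Return correct trials whose RT is within n_sd SDs of their condition mean."""
    # Step 1: keep only trials answered correctly in both tasks.
    correct = [t for t in trials if t["memory_correct"] and t["stroop_correct"]]

    # Step 2: collect RTs per condition to compute condition-wise mean and SD.
    rts_by_condition = defaultdict(list)
    for t in correct:
        rts_by_condition[t["condition"]].append(t["rt"])

    # Step 3: drop trials deviating by more than n_sd SDs from the condition mean.
    kept = []
    for t in correct:
        rts = rts_by_condition[t["condition"]]
        if len(rts) < 2:  # SD is undefined for a single trial, so keep it
            kept.append(t)
            continue
        m, sd = mean(rts), stdev(rts)
        if abs(t["rt"] - m) <= n_sd * sd:
            kept.append(t)
    return kept


def percent_excluded(trials, kept):
    """Percentage of all recorded trials that were excluded."""
    return 100.0 * (len(trials) - len(kept)) / len(trials)


# Toy example: 20 typical trials, one very slow trial, one incorrect trial.
base = {"condition": "congruent", "memory_correct": True, "stroop_correct": True}
trials = [dict(base, rt=500.0) for _ in range(20)]
trials.append(dict(base, rt=5000.0))
trials.append(dict(base, rt=500.0, stroop_correct=False))

kept = exclude_rt_outliers(trials)
print(len(kept), round(percent_excluded(trials, kept), 2))
```

      Reporting the percentage relative to all recorded trials is one possible convention; the percentage could equally be reported relative to correct trials only.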

      Comment 3: In the Methods section, line 493 mentions: “A 400-200 ms pre-stimulus time window was selected as the baseline time window.” What is the justification in the literature for choosing the 400-200 ms pre-stimulus window as the baseline? Why was the 200-0 ms pre-stimulus period not considered?

      We thank the reviewer for this question and would like to provide the following justification. First, although a baseline ending at 0 ms is common in ERP analyses, it may not be suitable for time-frequency analysis. Due to the inherent temporal smoothing of wavelet convolution in time-frequency decomposition, early task-related activity can leak into the pre-stimulus period (before 0 ms) (Cohen, 2014). This means that extending the baseline to 0 ms would include some post-stimulus activity in the baseline window, thereby increasing baseline power and compromising the accuracy of the results. Second, the recommended baseline duration is around 10-20% of the entire trial of interest (Morales & Bowers, 2022). In our study, the epoch duration was 2000 ms, making 200-400 ms an appropriate baseline length. Third, given that the minimum duration of the fixation point before the stimulus in our experiment was 400 ms, we started the baseline no earlier than 400 ms before stimulus onset so that it contained only fixation-period activity. In summary, considering edge effects, duration requirements, and the need to exclude other influences, we selected a baseline correction window of -400 to -200 ms. To enhance the clarity of the revised version, we have provided the rationale for the selected time windows along with relevant references (in the first paragraph of the Time-frequency analysis section).
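      To make the baseline window concrete, the following minimal Python sketch normalizes a single-frequency power time course to the mean power in the -400 to -200 ms window. It is a sketch under stated assumptions, not the pipeline used in the study: decibel conversion is assumed as the normalization (the response does not state which normalization was applied), and the function name baseline_db and the toy values are hypothetical.

```python
import math


def baseline_db(power, times_ms, baseline=(-400, -200)):
    """Express power in dB relative to the mean power inside the baseline window.

    power    : power values for one frequency, one value per time point
    times_ms : time of each sample in ms relative to stimulus onset
    baseline : (start, end) of the baseline window in ms, inclusive
    """
    start, end = baseline
    base_values = [p for p, t in zip(power, times_ms) if start <= t <= end]
    if not base_values:
        raise ValueError("no samples fall inside the baseline window")
    base_mean = sum(base_values) / len(base_values)
    # Assumed normalization: 10 * log10(power / baseline mean).
    return [10.0 * math.log10(p / base_mean) for p in power]


# Toy time course sampled every 100 ms: power rises tenfold after stimulus onset.
times_ms = list(range(-500, 500, 100))
power = [2.0] * 5 + [20.0] * 5
db = baseline_db(power, times_ms)
print(db[0], db[-1])
```

      Because the window ends 200 ms before stimulus onset, post-stimulus activity smeared backwards by the wavelet is less likely to enter the baseline mean than with a window ending at 0 ms.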

      Comment 4: Is the primary innovation of this study limited to the methodology, such as employing MVPA and RSA to establish the relationship between late theta activity and behavior?

      We thank the reviewer for this insightful question and would like to clarify that our research extends beyond mere methodological innovation; rather, it used new methods to explore novel theoretical perspectives. Specifically, our research presents three levels of innovation: methodological, empirical, and theoretical. First, methodologically, MVPA overcame the drawbacks of traditional EEG analyses based on specific averaged voltage intensities, providing new perspectives on how the brain dynamically encoded particular neural representations over time. Furthermore, RSA aimed to identify which of the decoded indicators were directly related to behavioral representation patterns. Second, in terms of empirical results, using these two methods, we have identified for the first time three EEG markers that modulate the Stroop effect under verbal working memory load: SP, late theta, and beta, with late theta being directly linked to the elimination of the behavioral Stroop effect. Lastly, from a theoretical perspective, we proposed the novel idea that working memory played a crucial role in the late stages of conflict processing, specifically in the stimulus-response mapping stage (the specific theoretical contributions are detailed in the second-to-last paragraph of the Discussion section).

      Comment 5: On page 14, lines 280-287, the authors discuss a specific pattern observed in the alpha band. However, the manuscript does not provide the corresponding results to substantiate this discussion. It is recommended to include these results as supplementary material.

      We thank the reviewer for this suggestion. We added a new figure along with the corresponding statistical results that displayed the specific result patterns for the alpha band (Supplementary Figure 1).

      Comment 6: On page 16, lines 323-328, the authors provide a generalized explanation of the findings. According to load theory, stimuli compete for resources only when represented in the same form. Since the pre-memorized Chinese characters are represented semantically in working memory, this explanation lacks a critical premise: that semantic-response mapping is also represented semantically during processing.

      We thank the reviewer for this insightful suggestion. We fully agree with the reviewer’s perspective. As stated in our revised version, load theory suggests that cognitive resources are limited and dependent on a specific type (in the second paragraph of the Discussion section). The previously memorized Chinese characters are stored in working memory in the form of semantic representations; meanwhile the stimulus-response mapping should also be represented semantically, leading to resource occupancy. We have included this logical premise in the revised version (in the third-to-last paragraph of the Discussion section).

      Comment 7: The classic Stroop task includes both a manual and a vocal version. Since stimulus-response mapping in the vocal version is more automatic than in the manual version, it is unclear whether the findings of this study would generalize to the impact of working memory load on the Stroop effect in the vocal version.

      We fully agree with the reviewer’s point that the verbal version of the Stroop task differs from the manual version in terms of the degree of automation in the stimulus-response mapping. Specifically, the verbal version relies on mappings that are established through daily language use, while the manual version involves arbitrary mappings created in the laboratory. Therefore, the stimulus-response mapping in the verbal response version is more automated and less likely to be suppressed. However, our previous research indicated that the degree of automation in the stimulus-response mapping was influenced by practice (Chen et al., 2013). After approximately 128 practice trials, semantic conflict almost disappears, suggesting that the level of automation in stimulus-response mapping for the verbal Stroop task is comparable to that of the manual version (Chen et al., 2010). Given that participants in our study completed 144 practice trials (in the Procedure section), we believe these findings can be generalized to the verbal version.

      Comment 8: While the discussion section provides a comprehensive analysis of the study’s results, the authors could further elaborate on the theoretical and practical contributions of this work.

      We thank the reviewer for the constructive suggestions. We recognize that the theoretical and practical contributions of the study were not thoroughly elaborated in the original manuscript. Therefore, we have now provided a more detailed discussion. Specifically, the theoretical contributions focus on advancing load theory and highlighting the critical role of working memory in conflict processing. The practical contributions emphasize the application of load theory and the development of intervention strategies for enhancing inhibitory control. A more detailed discussion can be found in the revised version (in the second-to-last paragraph of the Discussion section).

      Reviewer #2 (Public review):

      Comment 1: As the researchers mentioned, a previous study reported a diminished Stroop effect with concurrent working memory tasks to memorize meaningless visual shapes rather than memorize Chinese characters as in the study. My main concern is that lower-level graphic processing when memorizing visual shapes also influences the Stroop effect. The stage of Stroop conflict processing affected by the working memory load may depend on the specific content of the concurrent working memory task. If that’s the case, I sense that the generalization of this finding may be limited.

      We thank the reviewer for this insightful concern. As mentioned in the manuscript, this may be attributed to the inherent characteristics of Chinese characters. In contrast to English words, the processing of Chinese characters relies more on graphemic encoding and memory (Chen, 1993). Therefore, the processing of line patterns essentially occupies some of the resources needed for character processing, which aligns with our study’s hypothesis based on dimensional overlap. Additionally, regarding the results, even though the previous study presented lower-level line patterns, its results still showed that the working memory load modulated the later theta band. We hypothesize that, regardless of the specific content of the pre-presented working memory load, once the stimulus disappears from view, these loads are maintained as representations in the working memory platform. Therefore, they do not influence early perceptual processing, and resource competition only occurs once the distractors reach the working memory platform. Lastly, a previous study has shown that spatial loads, which do not overlap with either the target or distractor dimensions, do not influence the conflict effect (Zhao et al., 2010). Taken together, we believe that regardless of the specific content of the concurrent working memory tasks, as long as they occupy resources related to irrelevant stimulus dimensions, they can influence the late-stage processing of the conflict effect. Perhaps our original manuscript did not convey this clearly, so we have rephrased it in a more straightforward manner (in the second paragraph of the Discussion section).

      Comment 2: The P1 and N450 components are sensitive to congruency in previous studies as mentioned by the researchers, but the results in the present study did not replicate them. This raised concerns about data quality and needs to be explained.

      We thank the reviewer for this insightful concern. For P1, we aimed to convey that the early perceptual processing represented by P1 is part of conflict processing. Therefore, we included it in our analysis. Additionally, as mentioned in the discussion, most studies find P1 to be insensitive to congruency. However, we inappropriately cited a study in the introduction that suggested P1 shows differences in congruency, which is among the few studies that hold this perspective. To prevent confusion for readers, we have removed this citation from the introduction.

      As for N450, most studies have indeed found it to be influenced by congruency. In our manuscript, we did not observe a congruency effect at our chosen electrodes and time window. However, significant congruency effects were detected at other central-parietal electrodes (CP3, CP4, P5, P6) during the 350-500 ms interval. The interaction between task type and consistency remained non-significant, consistent with previous results. Furthermore, with respect to the location of the electrodes chosen, existing studies on N450 vary widely, including central-parietal electrodes and frontal-central electrodes (for a review, see Heidlmayr et al., 2020). We speculate that this phenomenon may be related to the extent of practice. With fewer total trials, the task may involve more stimulus conflicts, engaging more frontal brain areas. On the other hand, with more total trials, the task may involve more response conflicts, engaging more central-parietal brain areas (Chen et al., 2013; van Veen & Carter, 2005). Due to the extensive practice required in our study, we identified a congruency N450 effect in the central-parietal region. We apologize for not thoroughly exploring other potential electrodes in the previous manuscript, and we have revised the results and interpretations regarding N450 accordingly in the revised version (in the N450 section of the ERP results and the third paragraph of the Discussion section).

      Reference

      Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https://doi.org/10.1016/j.tics.2014.04.012

      Chen, M. J. (1993). A Comparison of Chinese and English Language Processing. In Advances in Psychology (Vol. 103, pp. 97–117). North-Holland. https://doi.org/10.1016/S0166-4115(08)61659-3

      Chen, X. F., Jiang, J., Zhao, X., & Chen, A. (2010). Effects of practice on semantic conflict and response conflict in the Stroop task. Psychol. Sci., 33, 869–871.

      Chen, Z., Lei, X., Ding, C., Li, H., & Chen, A. (2013). The neural mechanisms of semantic and response conflicts: An fMRI study of practice-related effects in the Stroop task. NeuroImage, 66, 577–584. https://doi.org/10.1016/j.neuroimage.2012.10.028

      Cohen, M. X. (2014). Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001

      Duprez, J., Gulbinaite, R., & Cohen, M. X. (2020). Midfrontal theta phase coordinates behaviorally relevant brain computations during cognitive control. NeuroImage, 207, 116340. https://doi.org/10.1016/j.neuroimage.2019.116340

      Duque, J., Greenhouse, I., Labruna, L., & Ivry, R. B. (2017). Physiological Markers of Motor Inhibition during Human Behavior. Trends in Neurosciences, 40(4), 219–236. https://doi.org/10.1016/j.tins.2017.02.006

      Engel, A. K., & Fries, P. (2010). Beta-band oscillations—Signalling the status quo? Current Opinion in Neurobiology, 20(2), 156–165. https://doi.org/10.1016/j.conb.2010.02.015

      Heidlmayr, K., Kihlstedt, M., & Isel, F. (2020). A review on the electroencephalography markers of Stroop executive control processes. Brain and Cognition, 146, 105637. https://doi.org/10.1016/j.bandc.2020.105637

      Little, S., Bonaiuto, J., Barnes, G., & Bestmann, S. (2019). Human motor cortical beta bursts relate to movement planning and response errors. PLOS Biology, 17(10), e3000479. https://doi.org/10.1371/journal.pbio.3000479

      Morales, S., & Bowers, M. E. (2022). Time-frequency analysis methods and their application in developmental EEG data. Developmental Cognitive Neuroscience, 54, 101067. https://doi.org/10.1016/j.dcn.2022.101067

      Senoussi, M., Verbeke, P., Desender, K., De Loof, E., Talsma, D., & Verguts, T. (2022). Theta oscillations shift towards optimal frequency for cognitive control. Nature Human Behaviour, 6(7), Article 7. https://doi.org/10.1038/s41562-022-01335-5

      van Veen, V., & Carter, C. S. (2005). Separating semantic conflict and response conflict in the Stroop task: A functional MRI study. NeuroImage, 27(3), 497–504. https://doi.org/10.1016/j.neuroimage.2005.04.042

      Zhao, X., Chen, A., & West, R. (2010). The influence of working memory load on the Simon effect. Psychonomic Bulletin & Review, 17(5), 687–692. https://doi.org/10.3758/PBR.17.5.687

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the work: "Endosomal sorting protein SNX4 limits synaptic vesicle docking and release" Josse Poppinga and collaborators addressed the synaptic function of Sorting Nexin 4 (SNX4). Employing a newly developed in vitro KO model, with live imaging experiments, electrophysiological recordings, and ultrastructural analysis, the authors evaluate modifications in synaptic morphology and function upon loss of SNX4. The data demonstrate increased neurotransmitter release and alteration in synapse ultrastructure with a higher number of docked vesicles and shorter AZ. The evaluation of the presynaptic function of SNX4 is of relevance and tackles an open and yet unresolved question in the field of presynaptic physiology.

      Strengths:

      The sequential characterization of the cellular model is nicely conducted and the different techniques employed are appropriate for the morpho-functional analysis of the synaptic phenotype and the derived conclusions on SNX4 function at the presynaptic site. The authors succeeded in presenting a novel in vitro model that resulted in chronic deletion of SNX4 in neurons. A convincing sequence of experimental techniques is applied to the model to unravel the role of SNX4, whose functions in neuronal cells and at synapses are largely unknown. The understanding of the role of endosomal sorting at the presynaptic site is relevant and of high interest in the field of synaptic physiology and in the pathophysiology of the many described synaptopathies that broadly result in loss of synaptic fidelity and quality control at release sites.

      We thank the reviewer for their positive evaluation of our manuscript.

      Weaknesses:

      The flow of the data presentation is mostly descriptive with several consistent morphological and functional modifications upon SNX loss. The paper would benefit from a wider characterization that would allow us to address the physiological roles of SNX4 at the synaptic site and speculate on the underlying molecular mechanisms. In addition, due to the described role of SNX4 in autophagy and the high interest in the regulation of synaptic autophagy in the field of synaptic physiology, an initial evaluation of the autophagy phenotype in the neuronal SNX4KO model is important, and not to be only restricted to the discussion section.

      We thank the reviewer for their suggestions and agree that broader characterization would help us speculate on the underlying mechanism. To address this, we have conducted additional independent experiments investigating the role of SNX4 in neuronal autophagy, as suggested by this reviewer. These experiments are now included in the main figures and are no longer limited to the discussion section. Please see the detailed responses to this reviewer's recommendations below.

      Reviewer #2 (Public Review):

      Summary:

      SNX4 is thought to mediate recycling from endosomes back to the plasma membrane in cells. In this study, the authors demonstrate the increases in the amounts of transmitter release and the number of docked vesicles by combining genetics, electrophysiology, and EM. They failed to find evidence for its role in synaptic vesicle cycling and endocytosis, which may be intuitively closer to the endosome function.

      Strengths:

      The electrophysiological data and EM data are in principle, convincing, though there are several issues in the study.

      We thank the reviewer for their positive evaluation of our manuscript.

      Weaknesses:

      It is unclear why the increase in the amounts of transmitter release and docked vesicles happened in the SNX4 KO mice. In other words, it is unclear how the endosomal sorting proteins in the end regulate or are connected to presynaptic, particularly the active zone function.

      We thank the reviewer for their suggestions and agree that further characterization would help to understand how endosomal sorting proteins regulate presynaptic neurotransmission. We have now added extra data on electrophysiological recordings clarifying SNX4’s role in the synapse. Please see the detailed responses to this reviewer's recommendations below.

      Reviewer #3 (Public Review):

      Summary:

      The study aims to determine whether the endosomal protein SNX4 performs a role in neurotransmitter release and synaptic vesicle recycling. The authors exploited a newly generated conditional knockout mouse to allow them to interrogate the SNX4 function. A series of basic parameters were assessed, with an observed impact on neurotransmitter release and active zone morphology. The work is interesting, however as things currently stand, the work is descriptive with little mechanistic insight. There are a number of places where the data appear to be a little preliminary, and some of the conclusions require further validation.

      Strengths:

      The strengths of the work are the state-of-the-art methods to monitor presynaptic function.

      We thank the reviewer for their positive evaluation of our manuscript.

      Weaknesses:

      The weaknesses are the fact that the work is largely descriptive, with no mechanistic insight into the role of SNX4. Further weaknesses are the absence of controls in some experiments and the design of specific experiments.

      We thank the reviewer for their suggestions and agree that the addition of extra control groups and experiments would strengthen interpretation of the observed phenotype. To address this, we have now performed experiments to investigate the miniature excitatory postsynaptic currents and added extra control groups such as overexpression of SNX4 on a control background. In addition, we assessed SNX4-mediated neuronal autophagy as a potential molecular mechanism by which SNX4 affects synaptic output. Please see the detailed responses to this reviewer's recommendations below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The characterization of the neurite outgrowth presented in Figure 1 is a necessary starting point for the characterization of the model and the interpretation of the following data. Because the analysis was conducted at 21 DIV, a significant portion of the neurite tree is out of the analyzed field. Adding Sholl analysis will better indicate the complexity of the neurite tree, which appears to be influenced by SNX4 loss in the representative images shown in Figure 1f.

      We fully agree and have now performed a Sholl analysis of dendrite branches to investigate dendritic complexity (Figure 1(i), pages 2-3, lines 86-88). SNX4 depletion does not affect dendrite length or dendrite branching.

      (2) Analogously, the characterization of synapse number is of relevance for the interpretation of the data. For a better flow of the data, Figure 4 might be presented as Figure 2 (without the repetition of panel h in Figure 1). An explanation of how VAMP2 puncta are processed is necessary in the method section. A double labelling with a postsynaptic marker would allow trafficking organelles to be distinguished from mature synaptic contacts. Indeed, the analysis of VAMP2 intensity along neurite in mature 21DIV neurons should reveal peaks in the intensity profile that represent synaptic contacts. For unexplained reasons, the profile is rather flat in the two experimental groups. Focusing on axonal branches will surely result in a peaked profile for VAMP2 labelling.

      We fully agree that the characterization of synapses is relevant for the interpretation of the data. We have now added a section to our Materials and Methods describing how the VAMP2 puncta are processed (p14, lines 517-520). Instead of labeling mature synapses using double labeling of VAMP2 and PSD95, we analyzed the number of active synapses in live neurons using SypHy (Fig. 3g). The reviewer is correct that the VAMP2 data presented in Fig 1I and Fig 4 are part of the same dataset, and we have clarified this in the figure legend. In Fig 1I only the total number of VAMP2 puncta is plotted as a marker for synapse number, while in Fig 4 we assess VAMP2 as a potential SNX4 sorting cargo (Ma et al., 2017). Because of these different aims, we prefer to keep the figures separate. The analysis of VAMP2 intensity as a function of distance from the soma is a Sholl analysis (Fig. 4d) and represents the average VAMP2 intensity of 35-41 neurons per group. In contrast to a line scan of a single neurite, this average profile lacks the peaks of individual synapses.
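
      The point that an averaged profile loses the peaks of individual synapses can be illustrated with a minimal synthetic sketch. Everything in it is an assumption made for illustration only: the number of distance bins, the number and height of the peaks, and the choice of 38 neurons (a value within the 35-41 range quoted above). It does not use or reproduce the VAMP2 data.

```python
import random

def synthetic_profile(rng, n_bins=100, n_peaks=10, baseline=1.0, peak_height=5.0):
    """Hypothetical intensity profile: flat baseline plus peaks at random bins."""
    profile = [baseline] * n_bins
    for pos in rng.sample(range(n_bins), n_peaks):
        profile[pos] += peak_height
    return profile

def peak_to_mean(profile):
    """Ratio of the highest bin to the mean of the profile."""
    return max(profile) / (sum(profile) / len(profile))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
profiles = [synthetic_profile(rng) for _ in range(38)]  # 38 neurons: illustrative

# Average bin by bin across neurons, as in a group-level Sholl-type profile.
average_profile = [sum(column) / len(column) for column in zip(*profiles)]

single_ratio = peak_to_mean(profiles[0])
average_ratio = peak_to_mean(average_profile)
print(f"single neuron peak/mean: {single_ratio:.2f}")
print(f"group average peak/mean: {average_ratio:.2f}")
```

      Because the peaks fall at different positions in each simulated neuron, the group average is much flatter than any single profile, even though every individual profile is strongly peaked.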

      (3) Miniature excitatory postsynaptic currents recordings would strengthen the synaptic characterization and complement the electrophysiological recordings shown in Figure 2. Analyzing frequency and amplitude parameters would complement the data on the number of synaptic connections defined by the pre and postsynaptic colocalization puncta as suggested above and may support the data shown in Figure 3 g that suggests a decreased number of active synapses in SNX4-KO cells.

      We fully agree that the characterization of miniature excitatory postsynaptic currents would strengthen the synaptic characterization and complement the other electrophysiological data. Therefore, we have now added additional experiments showing the mEPSCs (Fig. 2k-m, page 4) in SNX4 cKO neurons versus control. These data show that the amplitude and frequency of spontaneous miniature EPSCs (mEPSCs) were not affected upon SNX4 depletion, consistent with a normal first evoked EPSC and RRP estimate. Furthermore, these data suggest that it is unlikely that the observed increase in neurotransmission is due to post-synaptic effects.

      (4) Recordings on the first evoked response shown in Figure 2 b and quantified in Figures c and d suggest that SNX4 overexpression per se exerts some effect on the Amplitude and the Charge of the first evoked response. This is also evident in the supplementary Figure 2 with lower frequency trains. An additional experimental group, namely control+SNX4 is needed for the correct interpretation of the observed phenotype. The possibility that SNX4 per se exerts an effect on evoked transmission could be discussed in terms of putative mechanisms and interactions.

      We thank the reviewer for their suggestion and agree that an additional experimental group (control + SNX4) would strengthen interpretation of the observed phenotype. We have now added a new experimental condition with overexpression of SNX4 on a control background (Supplementary Fig. 3, page 20). These data show that the amplitude and charge of the first evoked response were not affected in control + SNX4 neurons compared to control, and no differences were detected in the response to the 40 Hz stimulation train (Supplementary Fig. 3a-e). Together, these data suggest that SNX4 overexpression in itself does not affect the neurotransmission protocols studied in SNX4 cKO experiments.

      (5) To correctly interpret the SypHy experiments and exclude an effect of SNX silencing on SV recycling, it is suggested to repeat the experiments shown in Figure 3 in the absence and in the presence of bafilomycin. Indeed, the quantifications shown in Figure 3 d and f do not represent "release fraction" as stated (lines 139/140) but they rather refer to an average difference between release fraction and recovered fraction. With the use of bafilomycin, the comparison of the deltaFmax/deltaFNH4Cl with and without bafilomycin would enable the release fraction to be correctly evaluated and compared.

      We appreciate the reviewer’s suggestion and agree on the importance of considering the impact of SV recycling when evaluating the released fraction. We agree that the presence of bafilomycin is critical to isolate the released component during stimulation. We have now rephrased this conclusion. To assess synaptic recycling in these assays, bafilomycin is not critically required, and we show by multiple independent experiments, including SypHy and FM64 dye assays, that SV recycling is either not affected or the effect is too small to be detected by these methods.

      (6) In the ultrastructural analysis, additional quantifications are needed to exclude the accumulation of endosome-like structures. It is not clear if, in the evaluation of total SV number (Figure 5e), the authors counted all vesicles or vesicles < 50nm. This has to be explained and additional quantification of # of SV < 50nm and # SV > 50nm is informative, taking into account the endosomal nature of SNX4. Indeed, although the average size of SV is not changed (fig. 5 d), the density of "bigger vesicle" may result from endosomal-like structure accumulation. An additional suggested quantification is on vesicle # SV > 80nm as previously reported in the cited references dealing with endosomal proteins and presynaptic morphology.

      We fully agree that the characterization of vesicle size is important and that it was not clearly stated which vesicles were included in the total number of SV (Fig. 5e). We have now added this to the figure description. We have also added a histogram that contains the vesicle numbers of different bin sizes for SNX4 cKO synapses and control synapses (Supplementary Fig. 4, page 21) including # SVs > 80nm. (Whilst it seems that there are more “bigger” vesicles in the KO, further analysis revealed that this is mostly driven by one experiment and this effect is not consistent.)

      (7) Due to the high scientific interest in presynaptic autophagy for SV recycling and degradation, and the paucity of experimental work assessing the proteins involved, an initial evaluation of the neuronal autophagy process (by western blot analysis and immunocytochemistry) for the characterization of the model will better support the paragraph in the discussion (lines 314-322) and contribute to future work in the field. Although very rare, autophagosomes quantification at presynaptic sites can also be performed from the already acquired images. A double membrane structure with the material inside is evident in the representative control image presented!

      We appreciate the reviewer’s suggestion and agree that presynaptic autophagy is an interesting potential mechanism that would elaborate on our current working model. To address the reviewer’s suggestion, we added multiple independent experiments to investigate basal autophagy markers such as ATG5 using western blot analysis, characterized p62 levels using immunohistochemistry and performed additional morphometric analysis on the electron microscopy data (Supplementary Fig. 5). In SNX4 cKO neurons, there was no significant difference in p62 puncta numbers or p62 somatic intensity under basal conditions or after blocking autophagic p62 degradation by bafilomycin treatment, suggesting that autophagic flux remains normal. Also, no changes in total ATG5 protein levels were observed and ultrastructural analysis revealed no differences in the total number of autophagosomes. Collectively, these data indicate that SNX4 depletion does not impact the basal autophagic flux, ATG5 protein levels, or the number of autophagosomes.

      Minor points:

      (1) Dorrbaun et al. 2018 is missing from the reference list. In the legend to figure 1 there is an incorrect reference to Figure 6, rather than Figure 4.

      We have now adjusted the figure legend and added the reference (page 16, line 604).

      (2) Information on the construct employed for the rescue is missing. Is it a fluorescent tag construct? Representative images of the three autaptic neurons (control, KO, KO+SNX4) would nicely complement data presentation in Figure 2. 

      We have now elaborated on this in the Materials and Methods section (p12, lines 418-421). Unfortunately, we did not obtain pictures of the autaptic neurons used for the electrophysiology experiments.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure 2d and f are somewhat inconsistent. Total charges for the 1st EPSCs differ almost 2-fold in the same condition.

      We appreciate the reviewer’s concern. The average charge of the first evoked EPSC was 89, 122 and 57 pC for control, KO and rescued neurons, respectively. The average charge of the first pulse of the 40 Hz train was 41, 58 and 32 pC for control, KO and rescued neurons, respectively, which is roughly 50% of the naïve response of the same cells. These trains were recorded after 2 or 3 other stimulation paradigms, which may have affected the total charge released in the 40 Hz train. That said, the proportional difference between groups is highly comparable: SNX4 cKO neurons show a 37% increase in average charge released compared to control in the naïve response and a 41% increase in the first response of the 40 Hz train, and rescued cells show a 53% reduction in average released charge compared to SNX4 cKO in the naïve response and a 44% reduction in the first response of the 40 Hz train. Although the absolute values differ between these readouts, we conclude that the biological comparison between groups is consistent.
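
      The proportional comparison above can be checked directly from the quoted group means. The sketch below uses only the charges stated in this response (in pC); the dictionary layout and function name are illustrative choices. Note that the quoted reductions for rescued cells are reproduced when the rescue group is compared with the cKO group.

```python
# Mean charge (pC) per group, as quoted in the response above.
charge_pc = {
    "naive_first_epsc": {"control": 89, "ko": 122, "rescue": 57},
    "first_pulse_40hz": {"control": 41, "ko": 58, "rescue": 32},
}

def percent_change(value, reference):
    """Signed percentage change of value relative to reference."""
    return 100.0 * (value - reference) / reference

results = {}
for readout, groups in charge_pc.items():
    results[readout] = {
        # Increase of KO relative to control.
        "ko_vs_control": percent_change(groups["ko"], groups["control"]),
        # Reduction of rescue relative to KO (negative value = reduction).
        "rescue_vs_ko": percent_change(groups["rescue"], groups["ko"]),
        # Fraction of the naive control response, for the 'roughly 50%' remark.
        "control_fraction_of_naive": groups["control"]
        / charge_pc["naive_first_epsc"]["control"],
    }

for readout, values in results.items():
    print(readout, {name: round(v, 2) for name, v in values.items()})
```

      The KO-versus-control increase is about 37% for the naïve response and about 41% for the first pulse of the train, while the absolute charges in the train are close to half of the naïve values, which is the basis for calling the between-group comparison consistent.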

      (2) Figure 2h. This type of analysis has a drawback. See Neher (2015) for the problems associated with this analysis.

      We fully agree with the reviewer’s comment. As noted in our discussion (page 9, line 285), while this analysis has its limitations, it can still provide an indication of the readily releasable pool.

      (3) The EPSC phenotype may be due to postsynaptic effects. This should be excluded by additional experiments (mEPSC analysis) or further clarification.

      We fully agree that recording miniature excitatory postsynaptic currents would strengthen the synaptic characterization and complement the electrophysiological recordings. Therefore, we have now added additional experiments showing the mEPSCs (Fig. 2k-m) in SNX4 cKO neurons versus control. These data show that the amplitude and frequency of spontaneous miniature EPSCs (mEPSCs) were not affected upon SNX4 depletion, suggesting that it is unlikely that the observed increase in neurotransmission is due to post-synaptic effects.

      (4) The increased number of docked vesicles observed in EM and the increased slope (vesicle recruitment, Figure 2h) are not consistent with each other. Maybe the definition of docked vesicles is unclear in this version of the manuscript.

      As noted in our material & methods (page 15, line 547-548), SVs were defined as docked if there was no distance visible between the SV membrane and the active zone membrane. We have added the pixel size for clarification. Indeed, we do not observe an increase in release probability or first evoked response, which would correspond with an increased docked pool. However, we think that the increase in docked vesicles might contribute to an enhanced SV recruitment (see discussion).

      (5) Figure 3: Vesicle cycling was monitored in only a limited condition. It is known that there are multiple pathways of vesicle cycling. Ideally, these pathways should be dissected. At least, the authors mention the possibility that they have missed some "positive" conditions.

      We fully agree with the reviewer’s comment that vesicle recycling is complex with several parallel pathways involved. While we did not study individual endocytosis pathways, we used different assays covering various recycling pathways. The SypHy assay (Fig. 3c & f) combined with the 100 AP stimulation paradigm at room temperature predominantly addresses clathrin-mediated endocytosis. Additionally, the FM-64 dye assay at 37 degrees Celsius covers ultrafast endocytosis pathways as well as bulk endocytosis routes. Since neither assay showed major effects, we decided not to pursue further experiments focusing on different endocytosis pathways.

      Reviewer #3 (Recommendations For The Authors):

      Major points:

      (1) Since all of the work here is culture-focussed, the in vivo phenotype is not as relevant, however the in vitro properties are. The incomplete Cre-dependent removal of SNX4 is concerning (especially axonal SNX4 levels identified via immunofluorescence), however, the main concern is that there was no profiling of the other molecular changes within these cultures. This is important, since there may be considerable alterations in the expression of a number of presynaptic proteins which may explain the observed phenotypes. Ideally, these cultures could have been profiled in an unbiased manner via mass spectrometry to identify potential changes in the presynaptic proteome, or at the very least the levels of key fusion molecules would have been assessed via Western blotting.

      We thank the reviewer for their suggestion and agree that mass spectrometry would strengthen the interpretation of the observed phenotype. However, due to contractual constraints, we are unable to pursue a mass spectrometry follow-up experiment. We agree that characterizing key fusion molecules is of potential interest. Therefore, based on the literature, we selected a likely candidate, VAMP2, which did not show any alterations in expression levels when knocking out SNX4. Given the previously described role of SNX4 in the degradation pathway, one would expect increased degradation of key fusion molecules if they are recycled by SNX4. Other literature indicates that reduced levels of key fusion molecules, such as synaptotagmin or SNAP-25 (Broadie et al., 1994; Washbourne et al., 2001), do not mimic our phenotype.

      (2) The experiments reported in Figure 2, in particular those in 2c and 2d, suggest that overexpression of SNX4 has a dominant-negative effect on neurotransmitter release. This is strongly supported by the supplementary data during a stimulus train (particularly the start point of the 5 Hz train in Supplementary Figure 2). Therefore, the perceived rescue of EPSC charge in Figure 2f, 2g may be a result of SNX4 inhibiting neurotransmitter release. A determination of the impact of SNX4 overexpression (and level of overexpression) in WT neurons is essential to show that this is a bona fide rescue, rather than a direct inhibition by SNX4 overexpression.

      We thank the reviewer for their suggestion and agree that an additional experimental group (control + SNX4) would strengthen interpretation of the observed phenotype. We have now added an extra experimental condition with overexpression of SNX4 on a control background (Supplementary Fig. 3, page 21). These data show that the amplitude and charge of the first evoked response were not affected in control + SNX4 neurons compared to control, and no differences were detected in the response to the 40 Hz stimulation train (Supplementary Fig. 3a-e). Together, these data suggest that SNX4 overexpression in itself does not affect the neurotransmission protocols studied in SNX4 cKO experiments.

      (3) The experiments in Figure 3 clearly reveal a lack of effect of SNX4 depletion on synaptic vesicle endocytosis. However, the assumption that synaptic vesicle recycling is unaffected is a little premature. The fact that the second evoked SypHy peak is significantly larger than the first (Figures 3c-e) suggests that more vesicles may be recycling in KO neurons. Furthermore, the FM dye experiments do not aid interpretation, since there may be insufficient time (10 min) for new vesicles to be generated from endosomal intermediates experiments. Therefore, to confirm an absence of effect on recycling, the authors could either 1) perform the same experiment as 3c, but with 4 stimulation trains (to drive the system harder to reveal any phenotype) or 2) repeat the FM dye experiment but increase the time between loading and unloading to 30 min.

      We fully agree with the reviewer's comment that vesicle recycling is an important component to consider and is complex, with several parallel pathways involved. We conducted multiple independent experiments covering the most significant recycling pathways. The SypHy assay (Fig. 3c & f) combined with the 100 AP stimulation paradigm at room temperature predominantly addresses clathrin-mediated endocytosis. Additionally, the FM-64 dye assay at 37 degrees Celsius covers ultrafast endocytosis pathways as well as bulk endocytosis routes. To further challenge the system and reveal recycling phenotypes, we included a second 100 AP stimulation in our SypHy assay. While only the increase of the second SypHy peak is significant, the absolute numbers do not differ much from the first peak (0.17 for control and 0.21 for KO for the second peak, and 0.19 for control and 0.22 for KO for the first peak; Supplementary Table 1). We nevertheless do not see any effects on recycling after the second peak (mean decay time is 27 for control and 26 for KO; Supplementary Table 1). A single 100 AP 40 Hz train depletes all the synchronous release (not shown) and most of the evoked charge (see Fig 2f), hence two of these trains with one minute of recovery is already a very demanding protocol. Although increasing the time between loading and unloading to 30 minutes might uncover other recycling components, it has been shown that ultrafast endocytosis occurs within 30 seconds (Watanabe et al., 2013), suggesting that 10 minutes should provide enough time for synaptic vesicle recycling. This is also evident from the fact that we can significantly destain synapses loaded with FM dye by electrical stimulation (Fig 3j), indicating that synaptic vesicle recycling took place. Since neither assay showed major effects, we concluded that under these circumstances, synaptic recycling is not significantly affected. However, we cannot exclude the possibility that recycling deficits in SNX4 cKO neurons could be detected in other paradigms.

      (4) There is no obvious effect on VAMP2 levels or location in SNX4 KO neurons (Figure 4). However, when one considers that SNX4 is proposed to have a role in VAMP2 trafficking, it is surprising that an experiment examining the live trafficking of VAMP2-SypHy was not performed. This would have revealed activity-dependent alterations that would have been missed by simply measuring VAMP2 expression and localization, and potentially provided a molecular explanation for the enhanced neurotransmitter release during a stimulus train.

      We appreciate the reviewer’s suggestion and agree that it could be a valuable experiment. However, overexpressing a VAMP2-pHluorin construct might obscure potential phenotypes related to VAMP2 trafficking. SNX4 is expected to be involved in VAMP2 recycling, even with activity-dependent changes. Mis-sorted VAMP2 would accumulate in acidic vesicles, which could be masked by the VAMP2-pHluorin construct. Similarly, mis-sorting of other SNX4 cargo, such as the transferrin receptor, has been identified through lysosomal degradation, as shown by Western blot analysis of expression levels of the endogenous protein. We did not detect any differences in endogenous levels of VAMP2 within 21 days of SNX4 deletion (Fig 4), indicating that SNX4-dependent endosome sorting is not essential for VAMP2 recycling.

      (5) The morphological data in Figure 5 report a series of small changes in docked vesicles and active zone length. In many cases, significance is obtained due to synapses being used as the experimental n, and thus inflating the statistical power. When one considers that no significant effect was observed on evoked release (apart from during a stimulus train), it suggests that the number of docked vesicles does not alter release probability in this system (which the authors point out). Instead, they suggest that an increased supply of vesicles is responsible, via increased recruitment to RRP/releasable pool (but not via increased recycling). If this is the case, it should have been reflected as an increase in the evoked SypHy response in Fig 2c,d (which is borderline significant). What may help is to determine the morphological landscape immediately after a stimulus train, since this is the only condition where enhanced release is observed, and thus provide a morphological correlate to the physiological data.

      We fully agree with the reviewer’s suggestion that an ultrastructural characterization immediately after a stimulus train would be informative. Unfortunately, contract constraints prevent us from performing this experiment. For our ultrastructural morphological data, we treated synapses as the individual experimental n since it is not possible to determine whether synapses in a micronetwork on one sapphire originate from the same neuron. We used 18 independent sapphires from 3 independent pups to ensure the technical and biological replication of our data and the measurement of independent neurons. We fully agree with the reviewer's comment that care is needed to avoid ‘inflating the statistical power’ due to potential nesting effects when using synapses as the experimental n. To mitigate the potential nesting effect of analyzing multiple synapses per neuron, the intracluster correlation (ICC) was calculated per variable and per nesting effect. If the ICC was close to 0.1, indicating that a considerable portion of the total variance can be attributed to e.g. synapse or sapphire, multilevel analysis was performed to accommodate nested data (Aarts et al., 2014).
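
      As a rough illustration of what an intracluster correlation measures, the sketch below computes a one-way ANOVA estimate of the ICC for observations nested in clusters. This is an assumption about the estimator, chosen for simplicity; the response does not state how the ICC was computed, and a multilevel-model variance decomposition may give somewhat different values. The cluster names and values are hypothetical and are not taken from the study.

```python
def icc_oneway(groups):
    """One-way random-effects ANOVA estimate of the intracluster correlation.

    groups: list of lists, one inner list of observations per cluster.
    Handles unequal cluster sizes through the adjusted mean size k0.
    Raises ZeroDivisionError if every observation is identical.
    """
    n_groups = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    # Adjusted cluster size; equals the common size when clusters are balanced.
    k0 = (n_total - sum(n * n for n in sizes) / n_total) / (n_groups - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)

# Hypothetical docked-vesicle counts per synapse, grouped by sapphire.
example = {
    "sapphire_1": [4, 5, 6],
    "sapphire_2": [7, 8, 9, 8],
    "sapphire_3": [3, 4, 4],
}
print(f"ICC estimate: {icc_oneway(list(example.values())):.2f}")
```

      An ICC near zero means that cluster membership explains little of the variance, so treating synapses as independent is less problematic; a clearly positive ICC indicates that observations within a cluster resemble each other and that a multilevel analysis is the safer choice.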

      Minor points

      (1) When a new mouse model is generated, it is usually accompanied by a thorough characterization of its properties. However, in this case, there was no information provided about the conditional SNX4 knockout mouse. This is surprising and at a minimum, the following should be provided a) the background strain, b) method of generation, c) the number of animals used to establish the colony, d) breeding strategy, e) backcrossing strategy, f) genotyping protocol.

      We apologize that a thorough characterization of our novel mouse model was lacking and therefore added this to our material & methods section (page 11, line 377-391).

      (2) There is a noticeable difference between WT and KO neurons during train stimulation in Figure 2f, however, this appears to be due to the fact that there is a far higher EPSC charge to begin with in KO neurons. Why is there such a disparity when there is no difference in response to single pulses (Figures 2b-d) or presynaptic plasticity (Figure 2e)?

      We understand the reviewer’s concern. We excluded an outlier (3x SD) in the KO dataset that drove the far higher initial EPSC charge in the graph (it had already been excluded from the statistics, Supplementary table 1). The average charge of the first pulse of the 40 Hz train is 41 pC for control neurons and 58 pC for KO neurons, which did not differ significantly. The trains in Fig. 2f were recorded after 2 or 3 other stimulation paradigms, which may have affected the total charge released in the 40 Hz train. That said, the proportional difference between groups is highly comparable between Fig. 2b-d and 2f, with a 37% increase in average charge released in SNX4 cKO compared to control in the naïve response (Fig. 2d) and a 41% increase in the first response of the 40 Hz train (Fig. 2f); rescued cells show a 53% reduction in average released charge compared to control in the naïve response, compared to a 44% reduction in the first response of the 40 Hz train. Although the absolute values differ between these readouts, we conclude that the biological comparison between groups is consistent.

      (3) Line 343-344 - "(Supplementary Figure 1a)" should be "(Figure 1a)".

      We thank the reviewer for this comment and adjusted this in the manuscript.

    1. Author response:

      Reviewer 2:

      (1) It appears that the purified γ-secretase complex generates the same amount of Aβ40 and Aβ42, which is quite different in cellular and biochemical studies. Is there any explanation for this?

      Roughly equal production of Aβ40 and Aβ42 is a phenomenon seen with purified enzyme assays, and the reason for this has not been identified. However, we suggest that what is meaningful in our studies is the relative difference between the effects of FAD-mutant vs. WT PSEN1 on each proteolytic processing step. All FAD mutations are deficient in multiple cleavage steps in γ-secretase processing of APP substrate, and these deficiencies correlate with stabilization of E-S complexes.

      (2) It has been reported the Aβ production lines from Aβ49 and Aβ48 can be crossed with various combinations (PMID: 23291095 and PMID: 38843321). How does the production line crossing impact the interpretation of this work?

      In the cited reports, such crossover was observed when using synthetic Aβ intermediates as substrate. In PMID: 23291095 (Okochi M et al, Cell Rep, 2013), Aβ43 is primarily converted to Aβ40, but also to some extent to Aβ38. In PMID: 38843321 (Guo X et al, Science, 2024), Aβ48 is ultimately converted to Aβ42, but also to a minor degree to Aβ40. We have likewise reported such product line “crossover” with synthetic Aβ intermediates (PMID: 25239621; Fernandez MA et al, JBC, 2014). However, when using APP C99-based substrate, we did not detect any noncanonical tri- and tetrapeptide co-products of Aβ trimming events in the LC-MS/MS analyses (PMID: 33450230; Devkota S et al, JBC, 2021). In the original report on identification of the small peptide coproducts for C99 processing by γ-secretase using LC-MS/MS (PMID: 19828817; Takami M et al, J Neurosci, 2009), only very low levels of noncanonical peptides were observed. In the present study, we did not search for such noncanonical trimming coproducts, so we cannot rule out some degree of product line crossover.

      (3) In Figure 5, did the authors look at the protein levels of PS1 mutations and C99-720, as well as secreted Aβ species? Do the different amounts of PS1 full-length and PS1-NTF/CTF influence FLIM results?

      This is a good question. Our preliminary investigation by western blot shows no correlation between C99 and PSEN1 expression levels and FLIM results, but we will fully address the concern in our point-by-point responses submitted with a revised manuscript.

      (4) It is interesting that both Aβ40 and Aβ42 ELISA kits detect Aβ43. Have the authors tested other kits in the market? It might change the interpretation of some published work.

      We have not tested other ELISA kits. In light of our findings, it would be a good idea for other investigators to test whatever ELISAs they use for specificity vis-à-vis Aβ43.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      Weaknesses:

      The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. This raises the question as to whether differences are due to one method being more reliable than another or whether they are also identifying different underlying biological differences. It is not clear for example whether the agreement between the two methods is better or worse than between two human scorers, which generally serve as a gold standard to validate novel methods. The authors provide some insight into differences between the methods that could account for differences in results. However, given that the fractal method is automatic it would be important to clearly identify criteria for recordings in which it will produce similar results to the classical method.

      We thank the reviewer for the insightful suggestions. In the revised Manuscript, we have added a number of additional analyses that provide a quantitative comparison between the classical and fractal cycle approaches aiming to identify the source of the discrepancies between classical and fractal cycle durations. Likewise, we assessed the intra-fractal and intra-classical method reliability.

      Reviewer 2:

      One weakness of the study, from my perspective, was that the IRASA fits to the data (e.g. the PSD, such as in Figure 1B), were not illustrated. One cannot get a sense of whether or not the algorithm is based entirely on the fractal component or whether the oscillatory component of the PSD also influences the slope calculations. This should be better illustrated, but I assume the fits are quite good.

      Thank you for this suggestion. In the revised Manuscript, we have added a new figure (Fig.S1 E, Supplementary Material 2), illustrating the goodness of fit of the data as assessed by the IRASA method.

      The cycles detected using IRASA are called fractal cycles. I appreciate the use of a simple term for this, but I am also concerned whether it could be potentially misleading? The term suggests there is something fractal about the cycle, whereas it's really just that the fractal component of the PSD is used to detect the cycle. A more appropriate term could be "fractal-detected cycles" or "fractal-based cycle" perhaps?

      We agree that these cycles are not fractal per se. In the Introduction, when we mention them for the first time, we name them “fractal activity-based cycles of sleep” and immediately after that add “or fractal cycles for short”. In the revised version, we repeated this abbreviation at the start of each new major section and in the Abstract. Nevertheless, given that the term “fractal cycles” is used 88 times, after those “reminders” we used the short name again to facilitate readability. We hope that this will highlight that the cycles are not fractal per se and thus reduce the possible confusion while keeping the manuscript short.

      The study performs various comparisons of the durations of sleep cycles evaluated by the IRASA-based algorithm vs. conventional sleep scoring. One concern I had was that it appears cycles were simply identified by their order (first, second, etc.) but were not otherwise matched. This is problematic because, as evident from examples such as Figure 3B, sometimes one cycle conventionally scored is matched onto two fractal-based cycles. In the case of the Figure 3B example, it would be more appropriate to compare the duration of conventional cycle 5 vs. fractal cycle 7, rather than 5 vs. 5, as it appears is currently being performed.

      In cases where the number of fractal cycles differed from the number of classical cycles (from 34 to 55% in different datasets as in the case of Fig.3B), we did not perform one-to-one matching of cycles. Instead, we averaged the duration of the fractal and classical cycles over each participant and only then correlated between them (Fig.2C). For a subset of the participants (45 – 66% of the participants in different datasets) with a one-to-one match between the fractal and classical cycles, we performed an additional correlation without averaging, i.e., we correlated the durations of individual fractal and classical cycles (Fig.4S of Supplementary Material 2). This is stated in the Methods, section Statistical analysis, paragraph 2.
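
      The two-step procedure described above (average within each participant first, then correlate across participants) can be sketched as follows. This is an illustrative sketch only: the function names, the example durations and the choice of a Pearson coefficient are assumptions, since the response does not state which correlation coefficient was used.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlate_participant_means(fractal, classical):
    """Correlate per-participant mean cycle durations.

    fractal, classical: dict mapping participant id -> list of cycle
    durations in minutes. The lists may differ in length, because no
    one-to-one matching of cycles is required at this step.
    """
    ids = sorted(set(fractal) & set(classical))
    fractal_means = [mean(fractal[p]) for p in ids]
    classical_means = [mean(classical[p]) for p in ids]
    return pearson_r(fractal_means, classical_means)


# Made-up durations: participant "p2" has more fractal than classical cycles.
fractal = {"p1": [85.0, 95.0, 100.0], "p2": [60.0, 70.0, 80.0, 90.0], "p3": [110.0, 105.0]}
classical = {"p1": [90.0, 100.0, 95.0], "p2": [100.0, 110.0], "p3": [115.0, 100.0]}
print(round(correlate_participant_means(fractal, classical), 3))
```

      Averaging first sidesteps the matching problem raised by the reviewer, at the cost of discarding cycle-by-cycle information, which is why the additional cycle-level correlation is restricted to participants with a one-to-one match.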

      There are a few statements in the discussion that I felt were not well-supported. L629: about the "little biological foundation" of categorical definitions, e.g. for REM sleep or wake? I cannot agree with this statement as written. Also about "the gradual nature of typical biological processes". Surely the action potential is not gradual and there are many other examples of all-or-none biological events.

      In the revised Manuscript, we have removed these statements from both Introduction and Discussion.

      The authors appear to acknowledge a key point, which is that their methods do not discriminate between awake and REM periods. Thus their algorithm essentially detected cycles of slow-wave sleep alternating with wake/REM. Judging by the examples provided this appears to account for both the correspondence between fractal-based and conventional cycles, as well as their disagreements during the early part of the sleep cycle. While this point is acknowledged in the discussion section around L686, I am surprised that the authors then argue against this correspondence on L695. I did not find the "not-a-number" controls to be convincing. No examples were provided of such cycles, and it's hard to understand how positive z-values of the slopes are possible without the presence of some wake unless N1 stages are sufficient to provide a detected cycle (in which case, then the argument still holds except that it's alternations between slow-wave sleep and N1 that could be what drives the detection).

      In the revised Manuscript, we have removed the “NaN analysis” from both Results and Discussion. We have replaced it with the correlation between the difference between the durations of the classical and fractal cycles and proportion of wake after sleep onset. The finding is as follows:

      “A larger difference between the durations of the classical and fractal cycles was associated with a higher proportion of wake after sleep onset in 3/5 datasets as well as in the merged dataset (Supplementary Material 2, Table S10).” Results, section “Fractal cycles and wake after sleep onset”, last two sentences. This is also discussed in Discussion, section “Fractal cycles and age”, paragraph 1, last sentence. 

      To me, it seems important to make clear whether the paper is proposing a different definition of cycles that could be easily detected without considering fractals or spectral slopes, but simply adjusting what one calls the onset/offset of a cycle, or whether there is something fundamentally important about measuring the PSD slope. The paper seems to be suggesting the latter but my sense from the results is that it's rather the former.

      Thank you for this important comment. Overall, our paper suggests that the fractal approach might reflect the cycling nature of sleep in a more precise and sensitive way than classical hypnograms. Importantly, neither fractal nor classical methods can shed light on the mechanism underlying sleep cycle generation due to their correlational approach. Despite this, the advantages of fractal over classical methods mentioned in our Manuscript are as follows:

      (1) Fractal cycles are based on a real-valued metric with known neurophysiological functional significance, which introduces a biological foundation and a more gradual impression of nocturnal changes compared to the abrupt changes that are inherent to hypnograms that use a rather arbitrary assigned categorical value (e.g., wake=0, REM=-1, N1=-2, N2=-3 and SWS=-4, Fig.2 A).

      (2) Fractal cycle computation is automatic and thus objective, whereas classical sleep cycle detection is usually based on the visual inspection of hypnograms, which is time-consuming, subjective and error-prone. Few automatic algorithms are available for sleep cycle detection, and these correlate only moderately with classical cycles detected by human raters (r’s = 0.3 – 0.7 in different datasets here).

      (3) Defining the precise end of a classical sleep cycle with skipped REM sleep, which is common in children, adolescents and young adults, using a hypnogram is often difficult and arbitrary. The fractal cycle algorithm could detect such cycles in 93% of cases, while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was only 61%, i.e., 32 percentage points lower.

      (4) The fractal analysis showed a stronger effect size, higher F-value and R-squared than the classical analysis for the cycle duration comparison in children and adolescents vs young adults. The first and second fractal cycles were significantly shorter in the pediatric compared to the adult group, whereas the classical approach could not detect this difference.

      (5) Fractal – but not classical – cycle durations correlated with the age of adult participants.

      These bullets are now summarized in Table 5 that has been added to the Discussion of the revised manuscript.

      Reviewer #1 (Recommendations for the authors):

      The authors have added a lot of quantifications to provide a more complete comparison of classical and fractal cycles that address the points I raised.

      Regarding, the question of skipped REM cycles: I am not sure the comparison of skipped cycle accuracies between fractal and manual methods makes sense. To make a fair comparison fractal and 2nd scorer classifications should be compared to the same baseline dataset which doesn't seem to be the case since the number of skipped cycles is not the same. Moreover, it's not indicated whether the fractal method identifies any false positive skipped cycles.

      Thank you for this comment. In the revised Manuscript, we have reported the number of false positive skipped cycles identified by the fractal algorithm. Likewise, we have added the comparison between the fractal algorithm and the second scorer detection of cycles with skipped REM sleep (Results, the section “Skipped cycles”, last paragraph). The text has been revised as follows:

      “Visual inspection of the hypnograms from Datasets 1 – 6 was performed by two independent researchers. Scorer 1 and Scorer 2 detected that out of 226 first sleep cycles 58 (26%) and 64 (28%), respectively, lacked REM episodes. The agreement on the presence of skipped cycles between two human raters equaled 91% (58 cycles detected by both raters out of 64 cycles detected by either one or two scorers). The fractal cycle algorithm detected skipped cycles in 57 out of 58 (98%) cases detected by Scorer 1 with one false positive (which, however, was tagged as a skipped cycle by Scorer 2), and in 58 out of 64 (91%) cases detected by Scorer 2 with no false positives.”
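
      The percentages in the quoted passage follow directly from the reported counts, as this short worked check shows. The helper `percent` and the variable names are introduced here for illustration only; the counts themselves are taken from the passage.

```python
def percent(part, whole):
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / whole)


first_cycles = 226        # first sleep cycles inspected
skipped_scorer1 = 58      # skipped cycles found by Scorer 1
skipped_scorer2 = 64      # skipped cycles found by Scorer 2
skipped_both = 58         # cycles tagged by both scorers
skipped_either = 64       # cycles tagged by at least one scorer
fractal_hits_scorer1 = 57 # fractal detections among Scorer 1's cases
fractal_hits_scorer2 = 58 # fractal detections among Scorer 2's cases

print(percent(skipped_scorer1, first_cycles))          # 26
print(percent(skipped_scorer2, first_cycles))          # 28
print(percent(skipped_both, skipped_either))           # 91 (inter-scorer agreement)
print(percent(fractal_hits_scorer1, skipped_scorer1))  # 98
print(percent(fractal_hits_scorer2, skipped_scorer2))  # 91
```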

      Minor points

      I suggest reporting the values of inter-method / inter-scorer correlations with the classical method in the main text since otherwise interpreting the value for fractal vs classical is impossible.

      Thank you for this comment. In the revised Manuscript, we have moved this section to the main text (Table 3).

      Table 5 + text of discussion: cycle identification based on hypnograms is claimed to be "based on arbitrary assigned categorical values". The categories are not arbitrary, since they correspond to well-validated sleep states; only the number associated with each category is arbitrary, and this does not seem to be very important since it is only for visualization purposes.

      Thank you for this comment. In the revised Manuscript, we have removed the phrase “arbitrary assigned”.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Comment 1: In the Results section, the rationale behind selecting the beta band for the central (C3, CP3, Cz, CP4, C4) regions and the theta band for the fronto-central (Fz, FCz, Cz) regions is not clearly explained in the main text. This information is only mentioned in the figure captions. Additionally, why was the beta band chosen for the S-ROI central region and the theta band for the S-ROI fronto-central region? Was this choice influenced by the MVPA results?

      We thank the reviewer for the question regarding the rationale for the S-ROI selection in our study. The beta band was chosen for the central region due to its established relevance in motor control (Engel & Fries, 2010), movement planning (Little et al., 2019) and motor inhibition (Duque et al., 2017). The fronto-central theta band (or frontal midline theta) was a widely recognized indicator in cognitive control research (Cavanagh & Frank, 2014), associated with conflict detection and resolution processes. Moreover, recent empirical evidence suggested that the fronto-central theta reflected the coordination and integration between stimuli and responses (Senoussi et al., 2022). Although we have described the cognitive processes linked to these different frequencies in the introduction and discussion sections, along with the potential patterns of results observed in Stroop-related studies, we did not specify the involved cortical areas. Therefore, we have specified these areas in the introduction to enhance the clarity of the revised version (in the fourth paragraph of the Introduction section).

      Regarding whether the selection of S-ROIs was influenced by the MVPA results, we would like to clarify here that we selected the S-ROIs based on prior research and then conducted the decoding analysis. Specifically, we first extracted the data representing different frequency indicators (three F-ROIs and three S-ROIs) as features, followed by decoding to obtain the MVPA results. Subsequently, the time-frequency analysis, combined with the specific time windows during which each frequency was decoded, provided detailed interaction patterns among the variables for each indicator. The specifics of feature selection are described in the revised version (in the first paragraph of the Multivariate Pattern Analysis section).

      Comment 2: In the Data Analysis section, line 424 states: “Only trials that were correct in both the memory task and the Stroop task were included in all subsequent analyses. In addition, trials in which response times (RTs) deviated by more than three standard deviations from the condition mean were excluded from behavioral analyses.” The percentage of excluded trials should be reported. Also, for the EEG-related analyses, were the same trials excluded, or were different criteria applied?

      We thank the reviewer for this suggestion. Beyond the behavioral exclusion criteria, trials with EEG artifacts were also excluded from the data for the EEG-related analyses. We have now reported the percentage of excluded trials for both behavioral and EEG data analyses in the revised version (in the second paragraph of the EEG Recording and Preprocessing section and the first paragraph of the Behavioral Analysis section).
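
      A minimal sketch of the behavioral exclusion rule quoted in the comment (response times more than three standard deviations from the condition mean) and of the excluded-trial percentage the reviewer asked for. The function name, the tuple layout and the example values are hypothetical, and the sketch assumes that only correct trials are passed in and that the sample standard deviation is used.

```python
from statistics import mean, stdev


def exclude_rt_outliers(trials, n_sd=3.0):
    """Drop trials whose RT deviates more than n_sd SDs from its condition mean.

    trials: list of (condition, rt_ms) tuples for correct trials only.
    Returns (kept_trials, percent_excluded).
    """
    by_condition = {}
    for condition, rt in trials:
        by_condition.setdefault(condition, []).append(rt)

    # Lower and upper RT limits, computed separately for each condition.
    limits = {}
    for condition, rts in by_condition.items():
        m, sd = mean(rts), stdev(rts)
        limits[condition] = (m - n_sd * sd, m + n_sd * sd)

    kept = [(c, rt) for c, rt in trials if limits[c][0] <= rt <= limits[c][1]]
    percent_excluded = 100 * (len(trials) - len(kept)) / len(trials)
    return kept, percent_excluded


# Made-up data: one very slow congruent trial among otherwise uniform RTs.
trials = [("congruent", 500.0)] * 19 + [("congruent", 2000.0)]
trials += [("incongruent", 600.0), ("incongruent", 620.0)]
kept, pct = exclude_rt_outliers(trials)
print(f"{len(kept)} trials kept, {pct:.1f}% excluded")
```

      Because the limits are computed per condition, a slow but typical incongruent trial is not penalized for being slower than the congruent average.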

      Comment 3: In the Methods section, line 493 mentions: “A 400-200 ms pre-stimulus time window was selected as the baseline time window.” What is the justification in the literature for choosing the 400-200 ms pre-stimulus window as the baseline? Why was the 200-0 ms pre-stimulus period not considered?

      We thank the reviewer for this question and would like to provide the following justification. First, although a baseline ending at 0 ms is common in ERP analyses, it may not be suitable for time-frequency analysis. Due to the inherent temporal smoothing characteristic of wavelet convolution in time-frequency decomposition, task-related early activities can leak into the pre-stimulus period (before 0 ms) (Cohen, 2014). This means that extending the baseline to 0 ms will include some post-stimulus activity in the baseline window, thereby increasing baseline power and compromising the accuracy of the results. Second, an ideal baseline duration is recommended to be around 10-20% of the entire trial of interest (Morales & Bowers, 2022). In our study, the epoch duration was 2000 ms, making 200-400 ms an appropriate baseline length. Third, given that the minimum duration of the fixation point before the stimulus in our experiment was 400 ms, we chose the 400 ms before the stimulus as the baseline point to ensure its purity. In summary, considering edge effects, duration requirements, and the need to exclude other influences, we selected a baseline correction window of -400 to -200 ms. To enhance the clarity of the revised version, we have provided the rationale for the selected time windows along with relevant references (in the first paragraph of the Time-frequency analysis section).
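
      The baseline choice described above can be sketched as follows for a single frequency. This is an illustration under stated assumptions: the function name `baseline_db`, the example power values and the decibel normalization are not taken from the manuscript, which specifies only the -400 to -200 ms window.

```python
from math import log10
from statistics import mean


def baseline_db(times_ms, power, window=(-400, -200)):
    """Express power at each time point in dB relative to a pre-stimulus baseline.

    times_ms: time of each sample relative to stimulus onset (ms).
    power: power at one frequency for each sample (must be > 0).
    window: inclusive baseline window in ms; it ends before 0 ms so that
    post-stimulus activity smeared backwards by the wavelet is kept out.
    """
    start, end = window
    baseline = [p for t, p in zip(times_ms, power) if start <= t <= end]
    baseline_mean = mean(baseline)
    # Assumption: decibel conversion; other normalizations are equally possible.
    return [10 * log10(p / baseline_mean) for p in power]


# Made-up samples every 100 ms. The 200 ms window is 10% of a 2000 ms epoch,
# the lower end of the 10-20% guideline cited in the response.
times = [-500, -400, -300, -200, -100, 0, 100]
power = [2.0, 2.0, 2.0, 2.0, 2.5, 4.0, 20.0]
print([round(v, 2) for v in baseline_db(times, power)])
```

      In this toy series the sample at -100 ms is already slightly elevated, which is the kind of leakage that a baseline extending to 0 ms would absorb and thereby understate post-stimulus power.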

      Comment 4: Is the primary innovation of this study limited to the methodology, such as employing MVPA and RSA to establish the relationship between late theta activity and behavior?

      We thank the reviewer for this insightful question and would like to clarify that our research extends beyond mere methodological innovation; rather, it utilized new methods to explore novel theoretical perspectives. Specifically, our research presents three levels of innovation: methodological, empirical, and theoretical. First, methodologically, MVPA overcame the drawbacks of traditional EEG analyses based on specific averaged voltage intensities, providing new perspectives on how the brain dynamically encoded particular neural representations over time. Furthermore, RSA aimed to identify which indicators among the decoded were directly related to behavioral representation patterns. Second, in terms of empirical results, using these two methods, we have identified for the first time three EEG markers that modulate the Stroop effect under verbal working memory load: SP, late theta, and beta, with late theta being directly linked to the elimination of the behavioral Stroop effect. Lastly, from a theoretical perspective, we proposed the novel idea that working memory played a crucial role in the late stages of conflict processing, specifically in the stimulus-response mapping stage (the specific theoretical contributions are detailed in the second-to-last paragraph of the Discussion section).

      Comment 5: On page 14, lines 280-287, the authors discuss a specific pattern observed in the alpha band. However, the manuscript does not provide the corresponding results to substantiate this discussion. It is recommended to include these results as supplementary material.

      We thank the reviewer for this suggestion. We added a new figure along with the corresponding statistical results that displayed the specific result patterns for the alpha band (Supplementary Figure 1).

      Comment 6: On page 16, lines 323-328, the authors provide a generalized explanation of the findings. According to load theory, stimuli compete for resources only when represented in the same form. Since the pre-memorized Chinese characters are represented semantically in working memory, this explanation lacks a critical premise: that semantic-response mapping is also represented semantically during processing.

      We thank the reviewer for this insightful suggestion. We fully agree with the reviewer’s perspective. As stated in our revised version, load theory suggests that cognitive resources are limited and dependent on a specific type (in the second paragraph of the Discussion section). The previously memorized Chinese characters are stored in working memory in the form of semantic representations; meanwhile the stimulus-response mapping should also be represented semantically, leading to resource occupancy. We have included this logical premise in the revised version (in the third-to-last paragraph of the Discussion section).

      Comment 7: The classic Stroop task includes both a manual and a vocal version. Since stimulus-response mapping in the vocal version is more automatic than in the manual version, it is unclear whether the findings of this study would generalize to the impact of working memory load on the Stroop effect in the vocal version.

      We fully agree with the reviewer’s point that the verbal version of the Stroop task differs from the manual version in terms of the degree of automation in the stimulus-response mapping. Specifically, the verbal version relies on mappings that are established through daily language use, while the manual version involves arbitrary mappings created in the laboratory. Therefore, the stimulus-response mapping in the verbal response version is more automated and less likely to be suppressed. However, our previous research indicated that the degree of automation in the stimulus-response mapping was influenced by practice (Chen et al., 2013). After approximately 128 practice trials, semantic conflict almost disappears, suggesting that the level of automation in stimulus-response mapping for the verbal Stroop task is comparable to that of the manual version (Chen et al., 2010). Given that participants in our study completed 144 practice trials (in the Procedure section), we believe these findings can be generalized to the verbal version.

      Comment 8: While the discussion section provides a comprehensive analysis of the study’s results, the authors could further elaborate on the theoretical and practical contributions of this work.

      We thank the reviewer for the constructive suggestions. We recognize that the theoretical and practical contributions of the study were not thoroughly elaborated in the original manuscript. Therefore, we have now provided a more detailed discussion. Specifically, the theoretical contributions focus on advancing load theory and highlighting the critical role of working memory in conflict processing. The practical contributions emphasize the application of load theory and the development of intervention strategies for enhancing inhibitory control. A more detailed discussion can be found in the revised version (in the second-to-last paragraph of the Discussion section).

      Reviewer #2 (Public review):

      Comment 1: As the researchers mentioned, a previous study reported a diminished Stroop effect with concurrent working memory tasks to memorize meaningless visual shapes rather than memorize Chinese characters as in the study. My main concern is that lower-level graphic processing when memorizing visual shapes also influences the Stroop effect. The stage of Stroop conflict processing affected by the working memory load may depend on the specific content of the concurrent working memory task. If that’s the case, I sense that the generalization of this finding may be limited.

      We thank the reviewer for this insightful concern. As mentioned in the manuscript, this may be attributed to the inherent characteristics of Chinese characters. In contrast to English words, the processing of Chinese characters relies more on graphemic encoding and memory (Chen, 1993). Therefore, the processing of line patterns essentially occupies some of the resources needed for character processing, which aligns with our study’s hypothesis based on dimensional overlap. Additionally, regarding the results, even though the previous study presented lower-level line patterns, the results still showed that the working memory load modulated the later theta band. We hypothesize that, regardless of the specific content of the pre-presented working memory load, once the stimulus disappears from view, these loads are maintained as representations in the working memory platform. Therefore, they do not influence early perceptual processing, and resource competition only occurs once the distractors reach the working memory platform. Lastly, a previous study has shown that spatial loads, which do not overlap with either the target or distractor dimensions, do not influence the conflict effect (Zhao et al., 2010). Taken together, we believe that regardless of the specific content of the concurrent working memory tasks, as long as they occupy resources related to irrelevant stimulus dimensions, they can influence the late-stage processing of the conflict effect. Perhaps our original manuscript did not convey this clearly, so we have rephrased it in a more straightforward manner (in the second paragraph of the Discussion section).

      Comment 2: The P1 and N450 components are sensitive to congruency in previous studies as mentioned by the researchers, but the results in the present study did not replicate them. This raised concerns about data quality and needs to be explained.

      We thank the reviewer for this insightful concern. For P1, we aimed to convey that the early perceptual processing represented by P1 is part of conflict processing. Therefore, we included it in our analysis. Additionally, as mentioned in the discussion, most studies find P1 to be insensitive to congruency. However, we inappropriately cited a study in the introduction that suggested P1 shows differences in congruency, which is among the few studies that hold this perspective. To prevent confusion for readers, we have removed this citation from the introduction.

      As for N450, most studies have indeed found it to be influenced by congruency. In our manuscript, we did not observe a congruency effect at our chosen electrodes and time window. However, significant congruency effects were detected at other central-parietal electrodes (CP3, CP4, P5, P6) during the 350-500 ms interval. The interaction between task type and consistency remained non-significant, consistent with previous results. Furthermore, with respect to the location of the electrodes chosen, existing studies on N450 vary widely, including central-parietal electrodes and frontal-central electrodes (for a review, see Heidlmayr et al., 2020). We speculate that this phenomenon may be related to the extent of practice. With fewer total trials, the task may involve more stimulus conflicts, engaging more frontal brain areas. On the other hand, with more total trials, the task may involve more response conflicts, engaging more central-parietal brain areas (Chen et al., 2013; van Veen & Carter, 2005). Due to the extensive practice required in our study, we identified a congruency N450 effect in the central-parietal region. We apologize for not thoroughly exploring other potential electrodes in the previous manuscript, and we have revised the results and interpretations regarding N450 accordingly in the revised version (in the N450 section of the ERP results and the third paragraph of the Discussion section).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Comment 1: In the Introduction, line 108 states: “Second, alpha oscillations (8-13 Hz) can serve as a neural inverse index of mental activity or alertness, while a decrease in alpha power reflects increased alertness or enhanced attentional inhibition of distractors (Arakaki et al., 2022; Tafuro et al., 2019; Zhou et al., 2023; Zhu et al., 2023).” Please clarify which specific psychological process related to conflict processing is reflected by alpha oscillations.

      We appreciate your suggestion and we have clearly highlighted the role of alpha oscillations in attentional engagement during conflict processing in the revised version (in the third-to-last paragraph of the introduction).

      Comment 2: In Figures 3C and 3E, a space is needed between “amplitude” and the preceding parenthesis. Similar adjustments are required in Figures 4A, 4B, 4C, 5C, and 6C. Additionally, in Figures 3B and 3D, a space should be added between the numbers and “ms.” This issue also appears in Figure 8. Please review all figures for these formatting inconsistencies.

      We apologize for the formatting inconsistencies and have corrected them throughout the revised version.

      Comment 3: There are some clerical errors in the manuscript that need correction. For instance, on page 19, line 403: “Participants were asked to answer by pressing one of two response buttons (“S” with the left ring finger and “L” with the left ring finger).” This should be corrected to: “L” with the right ring finger. I recommend that the authors carefully proofread the manuscript to identify and correct such errors.

      We sincerely apologize for the errors present in the manuscript and have now carefully proofread it (in the Procedure section).

      Comment 4: On page 13, line 254, the elimination of the Stroop effect should not be interpreted as an improvement in processing.

      We greatly appreciate your suggestion. We agree that the elimination of the Stroop effect should not be confused with improvements in processing. We have corrected this in the revised version (the second paragraph of the Discussion section).

      Reviewer #3 (Recommendations for the authors):

      Comment 1: In the introduction section, the N450 was introduced as “a frontal-central negative deflection”, but in the methods part the N450 was computed using central-parietal electrodes. This inconsistency is confusing and needs to be clarified.

      We apologize for this confusion. We have provided a detailed explanation regarding the differences in electrodes and the rationale behind choosing central-parietal electrodes in our response to Reviewer 2’s second comment. To clarify, we have updated the introduction to consistently label them as central-parietal deflections (in the third paragraph of the Introduction section).

      Comment 2: I speculate the “beta” was mistakenly written as “theta” in line 212.

      We sincerely apologize for this mistake. We have corrected this error (in the RSA results section).

      Comment 3: The speculation that “changes in beta bands may be influenced by theta bands, thereby indirectly influencing the behavioral Stroop effect” needs to be rationalized.

      We appreciate your suggestion. What we intended to convey is that we found an interaction effect in the beta bands; however, the RSA results did not show a correlation with the behavioral interaction effect. We speculate that beta activity might be influenced by the theta bands. On the one hand, we realize that the idea of beta bands indirectly influencing the behavioral Stroop effect was inappropriate, and we have removed this point in the revised version. On the other hand, we have provided rational evidence for the idea that beta bands may be influenced by theta bands. This is based on the biological properties of theta oscillations, which support communication between different cortical neural signals, and their functional role in integrating and transmitting task-relevant information to response execution (in the third-to-last paragraph of the Discussion section).

      Comment 4: Typo in line 479: [10,10].

      We sincerely apologize for this mistake. We have corrected this error: [-10,10] (in the Multivariate pattern analysis section).

      Reference

      Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https://doi.org/10.1016/j.tics.2014.04.012

      Chen, M. J. (1993). A Comparison of Chinese and English Language Processing. In Advances in Psychology (Vol. 103, pp. 97–117). North-Holland. https://doi.org/10.1016/S0166-4115(08)61659-3

      Chen, X. F., Jiang, J., Zhao, X., & Chen, A. (2010). Effects of practice on semantic conflict and response conflict in the Stroop task. Psychol. Sci., 33, 869–871.

      Chen, Z., Lei, X., Ding, C., Li, H., & Chen, A. (2013). The neural mechanisms of semantic and response conflicts: An fMRI study of practice-related effects in the Stroop task. NeuroImage, 66, 577–584. https://doi.org/10.1016/j.neuroimage.2012.10.028

      Cohen, M. X. (2014). Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001

      Duprez, J., Gulbinaite, R., & Cohen, M. X. (2020). Midfrontal theta phase coordinates behaviorally relevant brain computations during cognitive control. NeuroImage, 207, 116340. https://doi.org/10.1016/j.neuroimage.2019.116340

      Duque, J., Greenhouse, I., Labruna, L., & Ivry, R. B. (2017). Physiological Markers of Motor Inhibition during Human Behavior. Trends in Neurosciences, 40(4), 219–236. https://doi.org/10.1016/j.tins.2017.02.006

      Engel, A. K., & Fries, P. (2010). Beta-band oscillations—Signalling the status quo? Current Opinion in Neurobiology, 20(2), 156–165. https://doi.org/10.1016/j.conb.2010.02.015

      Heidlmayr, K., Kihlstedt, M., & Isel, F. (2020). A review on the electroencephalography markers of Stroop executive control processes. Brain and Cognition, 146, 105637. https://doi.org/10.1016/j.bandc.2020.105637

      Little, S., Bonaiuto, J., Barnes, G., & Bestmann, S. (2019). Human motor cortical beta bursts relate to movement planning and response errors. PLOS Biology, 17(10), e3000479. https://doi.org/10.1371/journal.pbio.3000479

      Morales, S., & Bowers, M. E. (2022). Time-frequency analysis methods and their application in developmental EEG data. Developmental Cognitive Neuroscience, 54, 101067. https://doi.org/10.1016/j.dcn.2022.101067

      Senoussi, M., Verbeke, P., Desender, K., De Loof, E., Talsma, D., & Verguts, T. (2022). Theta oscillations shift towards optimal frequency for cognitive control. Nature Human Behaviour, 6(7), Article 7. https://doi.org/10.1038/s41562-022-01335-5

      van Veen, V., & Carter, C. S. (2005). Separating semantic conflict and response conflict in the Stroop task: A functional MRI study. NeuroImage, 27(3), 497–504. https://doi.org/10.1016/j.neuroimage.2005.04.042

      Zhao, X., Chen, A., & West, R. (2010). The influence of working memory load on the Simon effect. Psychonomic Bulletin & Review, 17(5), 687–692. https://doi.org/10.3758/PBR.17.5.687

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We are grateful to the editors and reviewers for their careful reading and constructive comments. We have now done our best to respond to them fully through additional analyses and text revisions. In the sections below, the original reviewer comments are in black, and our responses are in red.

      To summarize, the major changes in this round of review are as follows:

      (1) We have included a new introductory figure (Figure 1) to explain the distinction between feature-based tasks and property-based tasks.

      (2) We have included a section on “key predictions” and a section on “overview of this study” in the Introduction to clearly delineate our key predictions and provide an overview of our study.

      (3) We have included additional analyses to address the reviewers’ concerns about circularity in Experiments 1 & 2. We show that distance-to-center or visual homogeneity computations performed on object representations obtained from deep networks (instead of the perceptual dissimilarities from Experiment 1) also yield comparable predictions of target-present and target-absent responses in Experiment 2.

      (4) We have extensively reworked the manuscript wherever possible to address the specific concerns raised by the reviewers.

      We hope that the revised manuscript adequately addresses the concerns raised in this round of review, and we look forward to a positive assessment.

      eLife Assessment

      This study uses carefully designed experiments to generate a useful behavioural and neuroimaging dataset on visual cognition. The results provide solid evidence for the involvement of higher-order visual cortex in processing visual oddballs and asymmetry. However, the evidence provided for the very strong claims of homogeneity as a novel concept in vision science, separable from existing concepts such as target saliency, is inadequate.

      Thank you for your positive assessment. We agree that visual homogeneity is similar to existing concepts such as target saliency and memorability. We have proposed it as a separate concept because visual homogeneity has an independent empirical measure (the reciprocal of target-absent search time in oddball search, or the reciprocal of “same” response time in a same-different task, etc.) that may or may not be the same as other empirical measures such as saliency and memorability. Investigating these possibilities is beyond the scope of our study but would be interesting for future work. We have now clarified this in the revised manuscript (Discussion, p. 42).

      However, we’d like to emphasize that the question of whether visual homogeneity is novel or related to existing concepts misses entirely the key contribution of our study.

      Our key contribution is a quantitative, falsifiable model for how the brain could be solving property-based tasks like same-different, oddball or symmetry. Most theories of decision making consider feature-based tasks where there is a well-defined feature space and decision variable. Property-based tasks pose a significant challenge to standard theories since it is not clear how these tasks could be solved. In fact, oddball search, same-different and symmetry tasks have been considered so different that they are rarely even mentioned in the same study. Our study represents a unifying framework showing that all three tasks can be understood as solving the same underlying fundamental problem, and presents evidence in favor of this solution.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors define a new metric for visual displays, derived from psychophysical response times, called visual homogeneity (VH). They attempt to show that VH is explanatory of response times across multiple visual tasks. They use fMRI to find visual cortex regions with VH-correlated activity. On this basis, they declare a new visual region in human brain, area VH, whose purpose is to represent VH for the purpose of visual search and symmetry tasks.

      Thank you for your accurate and positive assessment.

      Strengths:

      The authors present carefully designed experiments, combining multiple types of visual judgments and multiple types of visual stimuli with concurrent fMRI measurements. This is a rich dataset with many possibilities for analysis and interpretation.

      Thank you for your accurate and positive assessment.

      Weaknesses:

      The datasets presented here should provide a rich basis for analysis. However, in this version of the manuscript, I believe that there are major problems with the logic underlying the authors' new theory of visual homogeneity (VH), with the specific methods they used to calculate VH, and with their interpretation of psychophysical results using these methods. These problems with the coherency of VH as a theoretical construct and metric value make it hard to interpret the fMRI results based on searchlight analysis of neural activity correlated with VH.

      We respectfully disagree with your concerns, and have done our best to respond to them fully below.

      In addition, the large regions of VH correlations identified in Experiments 1 and 2 vs. Experiments 3 and 4 are barely overlapping. This undermines the claim that VH is a universal quantity, represented in a newly discovered area of visual cortex, that underlies a wide variety of visual tasks and functions.

      We respectfully disagree with your assertion. First of all, there is partial overlap between the VH regions, for which there are several other obvious explanations that must be considered first before dismissing VH outright as a flawed construct. We acknowledge these alternatives in the Results (p. 27), and the relevant text is reproduced below.

      “We note that it is not straightforward to interpret the overlap between the VH regions identified in Experiments 2 & 4. The lack of overlap could be due to stimulus differences (natural images in Experiment 2 vs silhouettes in Experiment 4), visual field differences (items in the periphery in Experiment 2 vs items at the fovea in Experiment 4) and even due to different participants in the two experiments. There is evidence supporting all these possibilities: stimulus differences (Yue et al., 2014), visual field differences (Kravitz et al., 2013) as well as individual differences can all change the locus of neural activations in object-selective cortex (Weiner and Grill-Spector, 2012a; Glezer and Riesenhuber, 2013). We speculate that testing the same participants on search and symmetry tasks using similar stimuli and display properties would reveal even larger overlap in the VH regions that drive behavior.”

      Maybe I have missed something, or there is some flaw in my logic. But, absent that, I think the authors should radically reconsider their theory, analyses, and interpretations, in light of detailed comments below, in order to make the best use of their extensive and valuable datasets combining behavior and fMRI. I think doing so could lead to a much more coherent and convincing paper, albeit possibly supporting less novel conclusions.

      We respectfully disagree with your assessment, and we hope that our detailed responses below will convince you of the merit of our claims.

      THEORY AND ANALYSIS OF VH

      (1) VH is an unnecessary, complex proxy for response time and target-distractor similarity.

      VH is defined as a novel visual quality, calculable for both arrays of objects (as studied in Experiments 1-3) and individual objects (as studied in Experiment 4). It is derived from a distance-to-center calculation in a perceptual space. That space in turn is derived from multi-dimensional scaling of response times for target-distractor pairs in an oddball detection task (Experiments 1 and 2) or in a same-different task (Experiments 3 and 4). Proximity of objects in the space is inversely proportional to response times for arrays in which they were paired. These response times are higher for more similar objects. Hence, proximity is proportional to similarity. This is visible in Fig. 2B as the close clustering of complex, confusable animal shapes.

      VH, i.e. distance-to-center, for target-present arrays is calculated as shown in Fig. 1C, based on a point on the line connecting target and distractors. The authors justify this idea with previous findings that responses to multiple stimuli are an average of responses to the constituent individual stimuli. The distance of the connecting line to the center is inversely proportional to the distance between the two stimuli in the pair, as shown in Fig. 2D. As a result, VH is inversely proportional to distance between the stimuli and thus to stimulus similarity and response times. But this just makes VH a highly derived, unnecessarily complex proxy for target-distractor similarity and response time. The original response times on which the perceptual space is based are far more simple and direct measures of similarity for predicting response times.

      Thank you for carefully thinking through our logic. We agree that a distance-to-centre calculation is entirely unnecessary as an explanation for target-present visual search. The difficulty of target-present search is already known to be directly proportional to the similarity between target and distractor, so there is nothing new to explain here.

      However, this is a narrow and selective interpretation of our findings because you are focusing only on our results on target-present searches, which are only half of all our data. The other half is the target-absent responses which previously have had no clear explanation. You are also missing the fact that we are explaining same-different and symmetry tasks as well using the same visual homogeneity computation.

      We urge you to think more deeply about the problem of how to decide whether an oddball is present or not in the first place. How do we actually solve this task? There must be some underlying representation and decision process. Our study shows that a distance-to-centre computation can actually serve as a decision variable to solve disparate property-based visual tasks. These tasks pose a major challenge to standard models of decision making, because the underlying representation and decision variable have been unclear. Our study resolves this challenge by proposing a novel computation that can be used by the brain to solve all these disparate tasks, and bring these tasks into the ambit of standard theories of decision making.  

      Our results also explain several interesting puzzles in the literature. If oddball search were driven only by target-distractor similarity, the time taken to respond when a target is absent should not vary at all, and should actually be longer than for all target-present searches. But in fact, systematic variations in target-absent times have always been observed in the literature, yet have never been explained by any theoretical model. Our results explain why target-absent times vary systematically: it is due to visual homogeneity.

      Similarly, in same-different tasks, participants are known to take longer to make a “different” response when the two items differ only slightly. By this logic, they should take the longest to make a “same” response, but in fact, paradoxically, participants are actually faster to make “same” responses. This fast-same effect has been noted several times, but never explained using any models. Our results provide an explanation of why “same” responses to an image vary systematically – it is due to visual homogeneity. 

      Finally, in symmetry tasks, symmetric objects evoke fast responses, and this has always been taken as evidence for special symmetry computations in the brain. But we show that the same distance-to-center computation can explain both responses to symmetric and asymmetric objects. Thus there is no need for a special symmetry computation in the brain.

      (2) The use of VH derived from Experiment 1 to predict response times in Experiment 2 is circular and does not validate the VH theory.

      The use of VH, a response time proxy, to predict response times in other, similar tasks, using the same stimuli, is circular. In effect, response times are being used to predict response times across two similar experiments using the same stimuli. Experiment 1 and the target present condition of Experiment 2 involve the same essential task of oddball detection. The results of Experiment 1 are converted into VH values as described above, and these are used to predict response times in Experiment 2 (Fig. 2F). Since VH is a derived proxy for response values in Experiment 1, this prediction is circular, and the observed correlation shows only consistency between two oddball detection tasks in two experiments using the same stimuli.

      You are indeed correct in noting that both Experiment 1 & 2 involve oddball search, and so at the superficial level, it looks circular that the oddball search data of Experiment 1 is being used to explain the oddball search data of Experiment 2.

      However, closer scrutiny reveals more fundamental differences: Experiment 1 consisted of only oddball search with the target appearing on the left or right, whereas Experiment 2 consisted of oddball search with the target either present or completely absent. In fact, we were merely using the search dissimilarities from Experiment 1 to reconstruct the underlying object representation, because it is well known that neural dissimilarities are predicted well by search dissimilarities (Sripati & Olson, 2009; Zhivago et al., 2014).

      To thoroughly refute any lingering concern about circularity, we reasoned that the model predictions for Experiment 2 could have been obtained by a distance-to-center computation on any brain-like object representation. To this end, we used object representations from deep neural networks pretrained on object categorization, whose representations are known to match well with the brain, and asked if a distance-to-centre computation on these representations could predict the search data in Experiment 2. This was indeed the case, and these results are now included in an additional section in Supplementary Material (Section S1).

      (3) The negative correlation of target-absent response times with VH as it is defined for target-absent arrays, based on distance of a single stimulus from center, is uninterpretable without understanding the effects of center-fitting. Most likely, center-fitting and the different VH metric for target-absent trials produce an inverse correlation of VH with target-distractor similarity.

      Unfortunately, as we have mentioned above, target-distractor similarity cannot explain how target-absent searches behave, since there is no distractor in such searches.

      We do understand your broader concern about the center-fitting algorithm itself. We performed a number of additional analyses to confirm the generality of our results and reject alternate explanations – these are summarized in a new section titled “Confirming the generality of visual homogeneity” (p. 12), and the section is reproduced below for your convenience.   

      “Confirming the generality of visual homogeneity

      We performed several additional analyses to confirm the generality of our results, and to reject alternate explanations.

      First, it could be argued that our results are circular because they involve taking oddball search times from Experiment 1 and using them to explain search response times in Experiment 2. This is a superficial concern since we are using the search dissimilarities from Experiment 1 only as a proxy for the underlying neural representation, based on previous reports that neural dissimilarities closely match oddball search dissimilarities (Sripati and Olson, 2010; Zhivago and Arun, 2014). Nonetheless, to thoroughly refute this possibility, we reasoned that we would get similar predictions of the target present/absent responses in Experiment 2 using any other brain-like object representation. To confirm this, we replaced the object representations derived from Experiment 1 with object representations derived from deep neural networks pretrained for object categorization, and asked if distance-to-center computations could predict the target present/absent responses in Experiment 2. This was indeed the case (Section S1).

      Second, we wondered whether the nonlinear optimization process of finding the best-fitting center could be yielding disparate optimal centres each time. To investigate this, we repeated the optimization procedure with many randomly initialized starting points, and obtained the same best-fitting center each time (see Methods).

      Third, to confirm that the above model fits are not due to overfitting, we performed a leave-one-out cross-validation analysis. We left out all target-present and target-absent searches involving a particular image, and then predicted these searches by calculating visual homogeneity estimated from all other images. This too yielded similar positive and negative correlations (r = 0.63, p < 0.0001 for target-present, r = -0.63, p < 0.001 for target-absent).

      Fourth, if heterogeneous displays indeed elicit similar neural responses due to mixing, then their average distance to other objects must be related to their visual homogeneity. We confirmed that this was indeed the case, suggesting that the average distance of an object from all other objects in visual search can predict visual homogeneity (Section S1).

      Fifth, the above results are based on taking the neural response to oddball arrays to be the average of the target and distractor responses. To confirm that averaging was indeed the optimal choice, we repeated the above analysis by assuming a range of relative weights between the target and distractor. The best correlation was obtained for almost equal weights in the lateral occipital (LO) region, consistent with averaging and its role in the underlying perceptual representation (Section S1).

      Finally, we performed several additional experiments on a larger set of natural objects as well as on silhouette shapes. In all cases, present/absent responses were explained using visual homogeneity (Section S2).”
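      The relative-weight analysis described in the fifth check above can be illustrated with a minimal sketch. It assumes that each array response is a weighted mixture of the target and distractor responses, that visual homogeneity is the Euclidean distance of this mixture from a fixed center, and that the fit is scored with a Pearson correlation. The function name `weight_sweep` and its arguments are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def weight_sweep(targets, distractors, center, observed, weights):
    """Correlate model distance-to-center with observed values for each target weight.

    targets, distractors: (n_arrays, n_dims) responses to the two item types.
    center: (n_dims,) assumed center of the representation.
    observed: (n_arrays,) quantity to be predicted (e.g. a response-time measure).
    weights: iterable of target weights in [0, 1]; the distractor gets 1 - w.
    """
    targets = np.asarray(targets, dtype=float)
    distractors = np.asarray(distractors, dtype=float)
    center = np.asarray(center, dtype=float)
    observed = np.asarray(observed, dtype=float)
    correlations = []
    for w in weights:
        mixed = w * targets + (1.0 - w) * distractors    # weighted array response
        dist = np.linalg.norm(mixed - center, axis=1)    # distance to center
        correlations.append(np.corrcoef(dist, observed)[0, 1])
    return np.array(correlations)
```

      The weight at which the returned correlation peaks indicates which mixture best accounts for the observed values; a peak near 0.5 would correspond to simple averaging.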

      The construction of the VH perceptual space also involves fitting a "center" point such that distances to center predict response times as closely as possible. The effect of this fitting process on distance-to-center values for individual objects or clusters of objects is unknowable from what is presented here. These effects would depend on the residual errors after fitting response times with the connecting line distances. The center point location and its effects on distance-to-center of single objects and object clusters are not discussed or reported here.

      While it is true that the optimal center needs to be found by fitting to the data, there is no particular mystery to the algorithm: we are simply performing standard gradient descent to maximize the fit to the data. We have described the algorithm clearly and are making our code public. We find that the algorithm yields stable optimal centers despite many randomly initialized starting points. We find that the optimal center can predict responses to entirely novel images that were excluded during model training. We are making no assumption about the location of the centre with respect to individual points. Therefore, we see no cause for concern regarding the center-finding algorithm.
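      The kind of fitting procedure described above can be illustrated with a minimal sketch. It assumes a squared-error objective between distance-to-center and a target quantity, plain gradient descent with a fixed step size, and random starting points drawn from the bounding box of the data. The function name `fit_center`, its parameters and the objective itself are hypothetical; the authors' released code may use a different objective and optimiser.

```python
import numpy as np

def fit_center(points, targets, n_restarts=10, lr=0.2, n_steps=2000, seed=0):
    """Fit a center so that distances from it match `targets` (least squares).

    Runs plain gradient descent from several random starting points.
    Returns (best_center, all_centers), where all_centers has one row per restart.
    """
    points = np.asarray(points, dtype=float)
    targets = np.asarray(targets, dtype=float)
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0), points.max(axis=0)
    centers, losses = [], []
    for _ in range(n_restarts):
        c = rng.uniform(lo, hi)                      # random start inside the data's bounding box
        for _ in range(n_steps):
            diff = c - points                        # (n, d) offsets from each point
            dist = np.linalg.norm(diff, axis=1)      # current distances to the center
            dist = np.maximum(dist, 1e-12)           # guard against division by zero
            err = dist - targets                     # residuals of the distance model
            grad = 2.0 * np.mean((err / dist)[:, None] * diff, axis=0)
            c = c - lr * grad                        # gradient-descent update
        final = np.linalg.norm(c - points, axis=1) - targets
        centers.append(c)
        losses.append(np.mean(final ** 2))
    centers = np.array(centers)
    return centers[int(np.argmin(losses))], centers
```

      Comparing the rows of the second return value across restarts is one simple way to check whether the optimisation converges to the same center from different starting points.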

      Yet, this uninterpretable distance-to-center of single objects is chosen as the metric for VH of target-absent displays (VHabsent). This is justified by the idea that arrays of a single stimulus will produce an average response equal to one stimulus of the same kind. But it is not logically clear why response strength to a stimulus should be a metric for homogeneity of arrays constructed from that stimulus, or even what homogeneity could mean for a single stimulus from this set. And it is not clear how this VHabsent metric based on single stimuli can be equated to the connecting line VH metric for stimulus pairs, i.e. VHpresent, or how both could be plotted on a single continuum.

      Most visual tasks, such as finding an animal, are thought to involve building a decision boundary on some underlying neural representation. Even visual search has been portrayed as a signal-detection problem where a particular target is to be discriminated from a distractor. However, none of these formulations work in the case of property-based visual tasks, where there is no unique feature to look for.

      We are proposing that, when we view a search array, the neural response to the search array can be deduced from the neural responses to the individual elements using well-known rules, and that decisions about an oddball target being present or absent can be made by computing the distance of this neural response from some canonical mean firing rate of a population of neurons. This distance-to-center computation is what we denote as visual homogeneity. We have revised our manuscript throughout to make this clearer and we hope that this helps you understand the logic better.
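      The computation described above can be made concrete with a small sketch. It assumes that each item is a point in a low-dimensional representation, that the response to a target-present array is the equal-weight average of the target and distractor responses, and that the response to a target-absent array equals the response to the single repeated item. The function names, the toy coordinates and the center are hypothetical and chosen only for illustration.

```python
import numpy as np

def array_response(target, distractor, w_target=0.5):
    """Model the response to a search array as a weighted average of item responses."""
    target = np.asarray(target, dtype=float)
    distractor = np.asarray(distractor, dtype=float)
    return w_target * target + (1.0 - w_target) * distractor

def distance_to_center(response, center):
    """Euclidean distance of a population response from the center."""
    response = np.asarray(response, dtype=float)
    center = np.asarray(center, dtype=float)
    return float(np.linalg.norm(response - center))

# Toy 2-D representation (hypothetical coordinates, for illustration only).
center = np.array([0.0, 0.0])
item_a = np.array([3.0, 4.0])
item_b = np.array([-3.0, 4.0])

# Target-absent array: all items identical, so the response is that of the single item.
vh_absent = distance_to_center(array_response(item_a, item_a), center)

# Target-present array: the response is the average of target and distractor.
vh_present = distance_to_center(array_response(item_a, item_b), center)
```

      Because the norm is convex, the average of two points can never lie farther from the center than the farther of the two, so in this sketch a mixed array is never assigned a larger distance than the more distant of its two homogeneous counterparts.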

      It is clear, however, what *should* be correlated with difficulty and response time in the target-absent trials, and that is the complexity of the stimuli and the numerosity of similar distractors in the overall stimulus set. Complexity of the target, similarity with potential distractors, and number of such similar distractors all make ruling out distractor presence more difficult. The correlation seen in Fig. 2G must reflect these kinds of effects, with higher response times for complex animal shapes with lots of similar distractors and lower response times for simpler round shapes with fewer similar distractors.

      You are absolutely correct that the stimulus complexity should matter, but there are no good empirically derived measures for stimulus complexity, other than subjective ratings which are complex on their own and could be based on any number of other cognitive and semantic factors. But considering what factors are correlated with target-absent response times is entirely different from asking what decision variable or template is being used by participants to solve the task.

      The example points in Fig. 2G seem to bear this out, with higher response times for the deer stimulus (complex, many close distractors in the Fig. 2B perceptual space) and lower response times for the coffee cup (simple, few close distractors in the perceptual space). While the meaning of the VH scale in Fig. 2G, and its relationship to the scale in Fig. 2F, are unknown, it seems like the Fig. 2G scale has an inverse relationship to stimulus complexity, in contrast to the expected positive relationship for Fig. 2F. This is presumably what creates the observed negative correlation in Fig. 2G.

      Taken together, points 1-3 suggest that VHpresent and VHabsent are complex, unnecessary, and disconnected metrics for understanding target detection response times. The standard, simple explanation should stand. Task difficulty and response time in target detection tasks, in both present and absent trials, are positively correlated with target-distractor similarity.

      We strongly disagree. Your assessment seems to be based on only considering target-present searches, which are of course driven by target-distractor similarity. Your argument is flawed because systematic variations in target-absent trials cannot be linked to any target-distractor similarity since there are no targets in the first place in such trials.

      We have shown that target-absent response times are, in fact, independent of experimental context, which means that they index an image property that is independent of any reference target (Results, p. 15; Section S4). This property is what we define as visual homogeneity.

      I think my interpretations apply to Experiments 3 and 4 as well, although I find the analysis in Fig. 4 especially hard to understand. The VH space in this case is based on Experiment 3 oddball detection in a stimulus set that included both symmetric and asymmetric objects. But the response times for a very different task in Experiment 4, a symmetric/asymmetric judgment, are plotted against the axes derived from Experiment 3 (Fig. 4F and 4G). It is not clear to me why a measure based on oddball detection that requires no use of symmetry information should be predictive of within-stimulus symmetry detection response times. If it is, that requires a theoretical explanation not provided here.

      We were simply using an oddball detection task to construct the underlying object representation, on the basis of observations that search dissimilarities are strongly correlated with neural dissimilarities. In Section S1, we show that similar results could have been obtained using other object representations such as deep networks, as long as the representation is brain-like.

      (4) Contrary to the VH theory, same/different tasks are unlikely to depend on a decision boundary in the middle of a similarity or homogeneity continuum.

      We have provided empirical proof for our claims, by showing that target-present response times in a visual search task are correlated with “different” responses in the same-different task, and that target-absent response times in the visual search task are correlated with “same” responses in the same-different task (Section S4).

      The authors interpret the inverse relationship of response times with VHpresent and VHabsent, described above, as evidence for their theory. They hypothesize, in Fig. 1G, that VHpresent and VHabsent occupy a single scale, with maximum VHpresent falling at the same point as minimum VHabsent. This is not borne out by their analysis, since the VHpresent and VHabsent value scales are mainly overlapping, not only in Experiments 1 and 2 but also in Experiments 3 and 4. The authors dismiss this problem by saying that their analyses are a first pass that will require future refinement. Instead, the failure to conform to this basic part of the theory should be a red flag calling for revision of the theory.

      Again, the opposite correlations between target present/absent search times with VH are the crucial empirical validation of our claims that a distance-to-center calculation explains how we perform these property-based tasks. The VH predictions do not fully explain the data. We have explicitly acknowledged this shortcoming, so we are hardly dismissing it as a problem.

      The reason for this single scale is that the authors think of target detection as a boundary decision task, along a single scale, with a decision boundary somewhere in the middle, separating present and absent. This model makes sense for decision dimensions or spaces where there are two categories (right/left motion; cats vs. dogs), separated by an inherent boundary (equal left/right motion; training-defined cat/dog boundary). In these cases, there is less information near the boundary, leading to reduced speed/accuracy and producing a pattern like that shown in Fig. 1G.

      Finding an oddball, deciding if two items are same or different and symmetry tasks are disparate visual tasks that do not fit neatly into standard models of decision making. The key conceptual advance of our study is that we propose a plausible neural representation and decision variable that allow all three property-based visual tasks to be reconciled with standard models of decision making.

      This logic does not hold for target detection tasks. There is no inherent middle point boundary between target present and target absent. Instead, in both types of trial, maximum information is present when target and distractors are most dissimilar, and minimum information is present when target and distractors are most similar. The point of greatest similarity occurs at the limit of any metric for similarity. Correspondingly, there is no middle point dip in information that would produce greater difficulty and higher response times. Instead, task difficulty and response times increase monotonically with similarity between targets and distractors, for both target present and target absent decisions. Thus, in Figs. 2F and 2G, response times appear to be highest for animals, which share the largest numbers of closely similar distractors.

      Your alternative explanation rests on vague factors like “maximum information” which cannot be quantified. By contrast we are proposing a concrete, falsifiable model for three property-based tasks – same/different, oddball present/absent and object symmetry. Any argument based solely on item similarity to explain visual search or symmetry responses cannot explain systematic variations observed for target-absent arrays and for symmetric objects, for the reasons explained earlier.

      DEFINITION OF AREA VH USING fMRI

      (1) The area VH boundaries from different experiments are nearly completely non-overlapping.

      In line with their theory that VH is a single continuum with a decision boundary somewhere in the middle, the authors use fMRI searchlight to find an area whose responses positively correlate with homogeneity, as calculated across all of their target present and target absent arrays. They report VH-correlated activity in regions anterior to LO. However, the VH defined by symmetry Experiments 3 and 4 (VHsymmetry) is substantially anterior to LO, while the VH defined by target detection Experiments 1 and 2 (VHdetection) is almost immediately adjacent to LO. Fig. S13 shows that VHsymmetry and VHdetection are nearly non-overlapping. This is a fundamental problem with the claim of discovering a new area that represents a new quantity that explains response times across multiple visual tasks. In addition, it is hard to understand why VHsymmetry does not show up in a straightforward subtraction between symmetric and asymmetric objects, which should show a clear difference in homogeneity.

      We respectfully disagree. The partial overlap between the VH regions identified in Experiments 1 & 2 can hardly be taken as evidence against the quantity VH itself, because there are several other obvious alternate explanations for this partial overlap, as summarized earlier as well. The VH region does show up in a straightforward subtraction between symmetric and asymmetric objects (Section S7), so we are not sure what the Reviewer is referring to here.

      (2) It is hard to understand how neural responses can be correlated with both VHpresent and VHabsent.

      The main paper results for VHdetection are based on both target-present and target-absent trials, considered together. It is hard to interpret the observed correlations, since the VHpresent and VHabsent metrics are calculated in such different ways and have opposite correlations with target similarity, task difficulty, and response times (see above). It may be that one or the other dominates the observed correlations. It would be clarifying to analyze correlations for target-present and target-absent trials separately, to see if they are both positive and correlated with each other.

      Thanks for raising this point. We have now confirmed that the positive correlation between VH and neural response holds even when we do the analysis separately for target-present and target-absent searches (correlation between neural response in the VH region and visual homogeneity: n = 32, r = 0.66, p < 0.0005 for target-present searches; n = 32, r = 0.56, p < 0.005 for target-absent searches).

      (3) Definition of the boundaries and purpose of a new visual area in the brain requires circumspection, abundant and convergent evidence, and careful controls.

      Even if the VH metric, as defined and calculated by the authors here, is a meaningful quantity, it is a bold claim that a large cortical area just anterior to LO is devoted to calculating this metric as its major task. Vision involves much more than target detection and symmetry detection. Cortex anterior to LO is bound to perform a much wider range of visual functionalities. If the reported correlations can be clarified and supported, it would be more circumspect to treat them as one byproduct of unknown visual processing in cortex anterior to LO, rather than treating them as the defining purpose for a large area of visual cortex.

      We totally agree with you that reporting a new brain region would require careful interpretation and abundant and converging evidence. However, this requires many studies worth of work, and historically category-selective regions like the FFA have achieved consensus only after they were replicated and confirmed across many studies. We believe our proposal for the computation of a quantity like visual homogeneity is conceptually novel, and our study represents a first step that provides some converging evidence (through replicable results across different experiments) for such a region. We have reworked our manuscript to make this point clearer (Discussion, p 32).

      Reviewer #3 (Public Review):

      Summary:

      This study proposes visual homogeneity as a novel visual property that enables observers to perform several seemingly disparate visual tasks, such as finding an odd item, deciding if two items are same, or judging if an object is symmetric. In Exp 1, the reaction times on several objects were measured in human subjects. In Exp 2, visual homogeneity of each object was calculated based on the reaction time data. The visual homogeneity scores predicted reaction times. This value was also correlated with the BOLD signals in a specific region anterior to LO. Similar methods were used to analyze reaction time and fMRI data in a symmetry detection task. It is concluded that visual homogeneity is an important feature that enables observers to solve these two tasks.

      Thank you for your accurate and positive assessment.

      Strengths:

      (1) The writing is very clear. The presentation of the study is informative.

      (2) This study includes several behavioral and fMRI experiments. I appreciate the scientific rigor of the authors.

      We are grateful to you for your balanced assessment and constructive comments.

      Weaknesses:

      (1) My main concern with this paper is the way visual homogeneity is computed. On page 10, lines 188-192, it says: "we then asked if there is any point in this multidimensional representation such that distances from this point to the target-present and target-absent response vectors can accurately predict the target-present and target-absent response times with a positive and negative correlation respectively (see Methods)". This is also true for the symmetry detection task. If I understand correctly, the reference point in this perceptual space was found by deliberately satisfying the negative and positive correlations in response times. And then on page 10, lines 200-205, it shows that the positive and negative correlations actually exist. This logic is confusing. The positive and negative correlations emerge only because this method is optimized to do so. It seems more reasonable to identify the reference point of this perceptual space independently, without using the reaction time data. Otherwise, the inference process sounds circular. A simple way is to just use the mean point of all objects in Exp 1, without any optimization towards reaction time data.

      We disagree with you since the same logic applies to any curve-fitting procedure. When we fit data to a straight line, we are finding the slope and intercept that minimizes the error between the data and the straight line, but we would hardly consider the process circular when a good fit is achieved – in fact we take it as a confirmation that the data can be fit linearly. In the same vein, we would not have observed a good fit to the data, if there did not exist any good reference point relative to which the distances of the target-present and target-absent search arrays predicted these response times.

      In Section S2, we show that the visual homogeneity estimates for each object are strongly correlated with the average distance of each object to all other objects (r = 0.84, p<0.0005, Figure S1).

      We have performed several additional analyses to confirm the generality of our results and to reject alternate explanations (see Results, p. 12, Section titled “Confirming the generality of visual homogeneity”). In particular, to confirm that the results we obtained are not due to overfitting, we performed a cross-validation analysis, where we removed all searches involving a particular image and predicted these response times using visual homogeneity. This too revealed a significant model correlation confirming that our results are not due to overfitting.
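The fitting logic debated in this exchange can be made concrete with a small sketch. It is not the authors' procedure (their optimization is described in their Methods and is not reproduced here): the 2-D coordinates, the response times, and the coarse grid search are all hypothetical stand-ins. It only shows what it means to look for a reference point whose distances correlate positively with target-present response times and negatively with target-absent ones.

```python
import math
from itertools import product


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_center(present_pts, rt_present, absent_pts, rt_absent, candidates):
    """Return the candidate center whose distances correlate most positively
    with target-present RTs and most negatively with target-absent RTs."""
    def score(c):
        d_present = [math.dist(p, c) for p in present_pts]
        d_absent = [math.dist(p, c) for p in absent_pts]
        return pearson(d_present, rt_present) - pearson(d_absent, rt_absent)
    return max(candidates, key=score)


# Hypothetical array coordinates in a 2-D "perceptual space".
present_pts = [(1, 2), (3, 3), (-2, 0), (0, -4), (5, 1), (2, -1)]
absent_pts = [(0, 0), (2, 1), (-1, 3), (4, -2), (1, 5), (-3, -1)]

# Toy response times generated from a known center, so a good fit exists.
true_center = (1, 0)
rt_present = [0.5 + 0.1 * math.dist(p, true_center) for p in present_pts]
rt_absent = [2.0 - 0.1 * math.dist(p, true_center) for p in absent_pts]

# Coarse grid search standing in for a proper optimizer.
grid = list(product(range(-2, 3), repeat=2))
best = fit_center(present_pts, rt_present, absent_pts, rt_absent, grid)

r_present = pearson([math.dist(p, best) for p in present_pts], rt_present)
r_absent = pearson([math.dist(p, best) for p in absent_pts], rt_absent)
print(best, round(r_present, 3), round(r_absent, 3))
```

A held-out check in the spirit of the cross-validation described above would repeat the fit with the arrays containing one image excluded and then compute the correlations on the excluded arrays only.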

      (2) Visual homogeneity (at least given the current form) is an unnecessary term. It is similar to distractor heterogeneity/distractor variability/distractor statistics in literature. However, the authors attempt to claim it as a novel concept. The title is "visual homogeneity computations in the brain enable solving generic visual tasks". The last sentence of the abstract is "a NOVEL IMAGE PROPERTY, visual homogeneity, is encoded in a localized brain region, to solve generic visual tasks". In the significance, it is mentioned that "we show that these tasks can be solved using a simple property WE DEFINE as visual homogeneity". If the authors agree that visual homogeneity is not new, I suggest a complete rewrite of the title, abstract, significance, and introduction.

      We respectfully disagree that visual homogeneity is an unnecessary term. Please see our comments to Reviewer 1 above. Just like saliency and memorability can be measured empirically, we propose that visual homogeneity can be empirically measured as the reciprocal of the target-absent search time in a search task, or as the reciprocal of the “same” response time in a same-different task. Understanding how these three quantities interact will require measuring them empirically for an identical set of images, which is beyond the scope of this study but an interesting possibility for future work.

      (3) Also, "solving generic tasks" is another overstatement. The oddball search tasks, same-different tasks, and symmetric tasks are only a small subset of many visual tasks. Can this "quantitative model" solve motion direction judgment tasks, visual working memory tasks? Perhaps so, but at least this manuscript provides no such evidence. On line 291, it says "we have proposed that visual homogeneity can be used to solve any task that requires discriminating between homogeneous and heterogeneous displays". I think this is a good statement. A title that says "XXXX enable solving discrimination tasks with multi-component displays" is more acceptable. The phrase "generic tasks" is certainly an exaggeration.

      Thank you for your suggestion. We have now replaced the term “generic tasks” with the term property-based tasks, which we feel is more appropriate and reflects the fact that oddball search, same-different and symmetry tasks all involve looking for a specific image property.

      (4) If I understand it correctly, one of the key findings of this paper is "the response times for target-present searches were positively correlated with visual homogeneity. By contrast, the response times for target-absent searches were negatively correlated with visual homogeneity" (lines 204-207). I think the authors have already acknowledged that the positive correlation is not surprising at all because it reflects the classic target-distractor similarity effect. But the authors claim that the negative correlation in target-absent searches is the true novel finding.

      (5) I would like to make it clear that this negative correlation is not new either. The seminal paper by Duncan and Humphreys (1989) has clearly stated that "difficulty increases with increased similarity of targets to nontargets and decreased similarity between nontargets" (the sentence in their abstract). Here, "similarity between nontargets" is the same as the visual homogeneity defined here. Similar effects have been shown in Duncan (1989) and Nagy, Neriani, and Young (2005). See also the inconsistent results in Nagy & Thomas, 2003, Vincent, Baddeley, Troscianko & Gilchrist, 2009. More recently, Wei Ji Ma has systematically investigated the effects of heterogeneous distractors in visual search. I think the introduction part of Wei Ji Ma's paper (2020) provides a nice summary of this line of research. I am surprised that these references are not mentioned at all in this manuscript (except Duncan and Humphreys, 1989).

      You are right in noting that Duncan and Humphreys (1989) propose that searches are more difficult when nontargets are dissimilar. However, since our searches have identical distractors, the similarity between nontargets is always constant across target-absent searches, and therefore this cannot predict any systematic variation in target-absent search that is observed in our data. By contrast, our results explain both target-absent searches and target-present searches.

      Thank you for pointing us to previous work. These studies show that it is not just the average distractor similarity but the statistics of the distractor similarity that drive visual search. However, these studies do not explain why target-absent searches should vary systematically.

      (6) If the key contribution is the quantitative model, the study should be organized in a different way. Although the findings of positive and negative correlations are not novel, it is still good to propose new models to explain classic phenomena. I would like to mention the three studies by Wei Ji Ma (see below). In these studies, Bayesian observer models were established to account for trial-by-trial behavioral responses. These computational models can also account for the set-size effect, behavior in both localization and detection tasks. I see much more scientific rigor in their studies. Going back to the quantitative model in this paper, I am wondering whether the model can provide any qualitative prediction beyond the positive and negative correlations? Can the model make qualitative predictions that differ from those of Wei Ji's model? If not, can the authors show that the model can quantitatively better account for the data than existing Bayesian models? We should evaluate a model either qualitatively or quantitatively.

      Thank you for pointing us to prior work by Wei Ji Ma. These studies systematically examined visual search for a target among heterogeneous distractors using simple parametric stimuli and a Bayesian modeling framework. By contrast, our experiments involve searching for single oddball targets among multiple identical distractors, so it is not clear to us that the Wei Ji Ma models can be easily used to generate predictions about these searches used in our study. 

      We are not sure what you mean by offering quantitative predictions beyond positive and negative correlations. We have tried to explain systematic variation in target-present and target-absent response times using a model of how these decisions are being made. Our model explains a lot of systematic variation in the data for both types of decisions.

      (7) In my opinion, one of the advantages of this study is the fMRI dataset, which is valuable because previous studies did not collect fMRI data. The key contribution may be the novel brain region associated with display heterogeneity. If this is the case, I would suggest using a more parametric way to measure this region. For example, one can use Gabor stimuli and systematically manipulate the variations of multiple Gabor stimuli, the same logic also applies to motion direction. If this study uses static Gabor, random dot motion, object images that span from low-level to high-level visual stimuli, and consistently shows that the stimulus heterogeneity is encoded in one brain region, I would say this finding is valuable. But this sounds like another experiment. In other words, it is insufficient to claim a new brain region given the current form of the manuscript.

      We agree that parametric stimulus manipulations are important for studying early visual areas where stimulus dimensions are known (e.g. orientation, spatial frequency). Using parametric stimulus manipulations for more complex stimuli is fraught with issues because the underlying representation may not be encoding the dimensions being manipulated. This is the reason why we attempted to recover the underlying neural representation using dissimilarities measured using visual search, and then asked whether a decision making process operating on this underlying representation can explain how decisions are made. Therefore we disagree that parametric stimulus manipulations are the only way to obtain insight into such tasks.

      We have proposed a quantitative model that explains how decisions about target present and absent can be made through distance-to-center computations on an underlying object representation. We feel that the behavioural and the brain imaging results strongly point to a novel computation that is being performed in a localized region in the brain. These results represent an important first step in understanding how complex, property-based tasks are performed by the brain. We have revised our manuscript to make this point clearer.

      REFERENCES

      - Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96(3), 433-458. doi: 10.1037/0033-295x.96.3.433

      - Duncan, J. (1989). Boundary conditions on parallel processing in human vision. Perception, 18(4), 457-469. doi: 10.1068/p180457

      - Nagy, A. L., Neriani, K. E., & Young, T. L. (2005). Effects of target and distractor heterogeneity on search for a color target. Vision Research, 45(14), 1885-1899. doi: 10.1016/j.visres.2005.01.007

      - Nagy, A. L., & Thomas, G. (2003). Distractor heterogeneity, attention, and color in visual search. Vision Research, 43(14), 1541-1552. doi: 10.1016/s0042-6989(03)00234-7

      - Vincent, B., Baddeley, R., Troscianko, T., & Gilchrist, I. (2009). Optimal feature integration in visual search. Journal of Vision, 9(5), 15-15. doi: 10.1167/9.5.15

      - Singh, A., Mihali, A., Chou, W. C., & Ma, W. J. (2023). A Computational Approach to Search in Visual Working Memory.

      - Mihali, A., & Ma, W. J. (2020). The psychophysics of visual search with heterogeneous distractors. BioRxiv, 2020-08.

      - Calder-Travis, J., & Ma, W. J. (2020). Explaining the effects of distractor statistics in visual search. Journal of Vision, 20(13), 11-11.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors have not made substantive changes to address my major concerns. Instead, they have responded with arguments about why their original manuscript was good as written. I did not find these arguments persuasive. Given that, I've left my public review the same, since it still represents my opinions about the paper. Readers can judge which viewpoints are more persuasive.

      We respectfully disagree: we have tried our best to address your concerns with additional analysis wherever feasible, and by acknowledging any limitations.

      Reviewer #3 (Recommendations For The Authors):

      (1) As I mentioned above, please consider rewriting title, abstract, introduction, and significance. Please remove the word "visual homogeneity" and instead use distractor heterogeneity/distractor variability/distractor statistics as often used in literature.

      To clarify, visual homogeneity is NOT the same as distractor homogeneity. Visual homogeneity refers to a distance-to-center computation and represents an image-computable property that can vary systematically even when all distractors are identical. By contrast, distractor heterogeneity varies only when distractors are different from each other.
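The distinction drawn in this response can be illustrated with a toy calculation. Everything here is an assumption made for illustration: the 2-D feature coordinates, the center, the definition of distractor heterogeneity as mean pairwise distance, and the choice to treat an array's response as the mean of its item responses. Under those assumptions, two target-absent arrays made of identical items both have zero heterogeneity yet sit at different distances from the center.

```python
import math
from itertools import combinations


def distractor_heterogeneity(items):
    """Mean pairwise distance between the items of one array."""
    pairs = list(combinations(items, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def array_response(items):
    """Assumed array response: the mean of the item feature vectors."""
    n = len(items)
    dims = len(items[0])
    return tuple(sum(v[i] for v in items) / n for i in range(dims))


# Hypothetical feature coordinates for two objects and a reference center.
obj_a, obj_b = (0.5, 0.0), (3.0, 4.0)
center = (0.0, 0.0)

# Target-absent arrays: four copies of the same object.
array_a = [obj_a] * 4
array_b = [obj_b] * 4

het_a = distractor_heterogeneity(array_a)
het_b = distractor_heterogeneity(array_b)
dist_a = math.dist(array_response(array_a), center)
dist_b = math.dist(array_response(array_b), center)

print(het_a, het_b)    # both arrays are perfectly homogeneous
print(dist_a, dist_b)  # yet their distance-to-center values differ
```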

      (2) Better to remove the phrase "generic tasks".

      Thanks for your suggestions. We now refer to these tasks as property-based tasks. 

      (3) Better to explicitly specify the predictions made by the quantitative model beyond positive and negative correlations.

      The predictions of the quantitative model are to explain systematic variation in the response times. We are not sure what else is there to predict in the response times.

      (4) If the quantitative model is the key contribution, better to highlight the details and algorithmic contribution of the model, and show the advantage of this model either qualitatively or quantitatively.

      Please see our responses above. Our quantitative model explains behavior and brain imaging data on three disparate tasks – the same/different, oddball visual search and symmetry tasks. 

      (5) If the new brain region is the key contribution, better to downplay the quantitative model.

      Please see our responses above. Our quantitative model explains behavior and brain imaging data on three disparate tasks – the same/different, oddball visual search and symmetry tasks.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      The authors explain that an action potential that reaches an axon terminal emits a small electrical field as it “annihilates”. This happens even though there is no gap junction, at chemical synapses. The generated electrical field is simulated to show that it can affect a nearby, disconnected target membrane by tens of microvolts for tenths of a microsecond. Longer effects are simulated for target locations a few microns away.

      To simulate action potentials (APs), the paper does not use the standard Hodgkin-Huxley formalism because it fails to explain AP collision. Instead, it uses the Tasaki and Matsumoto (TM) model which is simplified to only model APs with three parameters and as a membrane transition between two states of resting versus excited. The authors expand the strictly binary, discrete TM method to a Relaxing Tasaki Model (RTM) that models the relaxation of the membrane potential after an AP. They find that the membrane leak can be neglected in determining AP propagation and that the capacitive currents dominate the process.

      The strength of the work is that the authors identified an important interaction between neurons that is neglected by the standard models. A weakness of the proposed approach is the assumptions that it makes. For instance, the external medium is modeled as a homogeneous conductive medium, which may be further explored to properly account for biological processes.

      The authors provide convincing evidence by performing experiments to record action potential propagation and collision properties and then developing a theoretical framework to simulate the effect of their annihilation on nearby membranes. They provide both experimental evidence and rigorous mathematical and computer simulation findings to support their claims. The work has the potential of explaining significant electrical interaction between nerve centers that are connected via a large number of parallel fibers.

      We thank the reviewer for the distinct analysis of our work and the assessment that we ‘identified an important interaction between neurons that is neglected by standard models’.

      Indeed, we modeled the external (extracellular) medium as a homogeneous conductive medium and, compared to real biological systems, this is a simplification. Our intention is to keep our formal model as general as possible; however, it can be extended to account for specific properties. Accessory structures at axon terminals (such as the pinceau at Purkinje cells) most likely evolved to shape ephaptic coupling. In addition, the extracellular medium is neither homogeneous nor isotropic, and to fully mimic a particular neural connection this has to be implemented in a model as well. We agree and look forward to seeing how specific modifications of the external medium in biological systems will affect ephaptic coupling. We hope to facilitate progress on this question by providing our source code for further exploration. Using the tools that have been developed by the BRIAN community, one can generate or import arbitrarily complex cell morphologies (e.g. NeuroML files). Our source code adds the TM and RTM models, which allows exploring the direct impact of extracellular properties on target neurons.

      Reviewer 2 (Public Review):

      In this study, the authors measured extracellular electrical features of colliding APs travelling in different directions down an isolated earthworm axon. They then used these features to build a model of the potential ephaptic effects of AP annihilation, i.e. the electrical signals produced by colliding/annihilating APs that may influence neighbouring tissue. The model was then applied to some different hypothetical scenarios involving synaptic connections. The conclusion was that an annihilating AP at a presynaptic terminal can ephaptically influence the voltage of a postsynaptic cell (this is, presumably, the ‘electrical coupling between neurons’ of the title), and that the nature of this influence depends on the physical configuration of the synapse.

      As an experimental neuroscientist who has never used computational approaches, I am unable to comment on the rigour of the analytical approaches that form the bulk of this paper. The experimental approaches appear very well carried out, and here I just have one query - an important assumption made is that the conduction velocity of anti- and orthodromically propagating APs is identical in every preparation, but this is never empirically/statistically demonstrated.

      My major concern is with the conclusions drawn from the synaptic modelling, which, disappointingly, is never benchmarked against any synaptic data. The authors state in their Introduction that a ‘quantitative physical description’ of ephaptic coupling is ‘missing’; however, they do not provide such a description in this manuscript. Instead, modelled predictions are presented of possible ephaptic interactions at different types of synapses, and these are then partially and qualitatively compared to previous published results in the Discussion. To support the authors’ assertion that AP annihilation induces electrical coupling between neurons, I think they need to show that their model of ephaptic effects can quantitatively explain key features of experimental data pertaining to synaptic function. Without this, the paper contains some useful high-precision quantitative measurements of axonal AP collisions, some (I assume) high-quality modelling of these collisions, and some interesting theoretical predictions pertaining to synaptic interactions, but it does not support the highly significant implications suggested for synaptic function.

      We thank the reviewer for highlighting the potential and the limitation of our model. We demonstrated with empirical data that measured conduction velocities of anti- and orthodromically propagating APs are indeed very similar and values are provided in Appendix 3 – table 1.

      In order to address how our model ‘of ephaptic effects can quantitatively explain key features of experimental data’, we used the measured modulation of AP rates in Purkinje fibers by Blot and Barbour (2014), and our results are now included in the manuscript. In our model, we implemented the ephaptic coupling of the Basket cell (with an annihilating AP) and predicted the modulation of AP rate in the Purkinje cell. Our model predictions are compared to the measured modulation of AP rates in Purkinje cells, and this comparison is added as Fig. 5 to the main manuscript (lines 264 to 284). With this example, we show that ephaptic coupling as described with our RTM model can quantitatively describe key features of experimental data. Both the rapid inhibition and the rebound activity are described by our model with implementation of non-excitable parts at the pinceau of the Basket cell. Future experimental research can use the provided formalism to investigate in more detail the ephaptic coupling in systems like the Mauthner cell and the Purkinje cell by exploring how accessory structures and concomitant physical parameters, e.g. the extracellular properties, impact ephaptic coupling.

      Reviewer 3 (Public Review):

      This manuscript aims to exploit experimental measurements of the extracellular voltages produced by colliding action potentials to adjust a simplified model of action potential propagation that is then used to predict the extracellular fields at axon terminals. The overall rationale is that when solving the cable equation (which forms the substrate for models of action potential propagation in axons), the solution for a cable with a closed end can be obtained by a technique of superposition: a spatially reflected solution is added to that for an infinite cable and this ensures by symmetry that no axial current flows at the closed boundary. By this method, the authors calculate the expected extracellular fields for axon terminals in different situations. These fields are of potential interest because, according to the authors, their magnitude can be larger than that of a propagating action potential and may be involved in ephaptic signalling. The authors perform direct measurements of colliding action potentials, in the earthworm giant axon, to parameterise and test their model.
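
      The symmetry argument behind the mirrored solution can be illustrated with a minimal sketch. It assumes a hypothetical Gaussian pulse shape and arbitrary units (neither is taken from the manuscript) and only shows that a profile plus its mirror image about x = 0 has zero spatial gradient, and hence zero axial current, at the boundary. Strict superposition holds for a linear cable; for an active axon it is the symmetry of the collision that plays this role.

```python
import math

def pulse(x, t, v=1.0, width=0.5):
    """Hypothetical right-travelling voltage pulse with a Gaussian profile."""
    return math.exp(-((x - v * t) / width) ** 2)

def mirrored_sum(x, t):
    """The pulse plus its mirror image about the boundary at x = 0."""
    return pulse(x, t) + pulse(-x, t)

def gradient_at_boundary(profile, t, h=1e-4):
    """Central-difference estimate of dV/dx at x = 0 (axial current is proportional to -dV/dx)."""
    return (profile(h, t) - profile(-h, t)) / (2 * h)

# Gradient of the mirrored solution at the boundary for several times
gradients = [gradient_at_boundary(mirrored_sum, t) for t in (-1.0, -0.3, 0.0, 0.4)]

# For comparison: a single pulse crossing x = 0 has a clearly non-zero gradient there
single = gradient_at_boundary(pulse, -0.3)
```

      The gradient of the mirrored sum vanishes at every time point because the sum is an even function of x, whereas the single travelling pulse carries axial current across x = 0.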

      Although simplified models can be useful and the trick of exploiting the collision condition is interesting, I believe there are several significant problems with the rationale, presentation, and application, such that the validity and potential utility of the approach is not established.

      Simplified model vs. Hodgkin and Huxley

      The authors employ a simplified model that incorporates a two-state membrane (in essence resting and excited states) and adds a recovery mechanism. This generates a propagating wave of excitation and key observables such as propagation speed and action potential width (in space) can be adjusted using a small number of parameters. However, even if a Hodgkin-Huxley model does contain a much larger number of parameters that may be less easy to adjust directly, the basic formalism is known to be accurate and typical modifications of the kinetic parameters are very well understood, even if no direct characterisations already exist or cannot be obtained. I am therefore unconvinced by the utility of abandoning the Hodgkin-Huxley version.

      In several places in the manuscript, the simplified model fits the data well whereas the Hodgkin-Huxley model deviates strongly (e.g. Fig. 3CD). This is unsatisfying because it seems unlikely that the phenomenon could not be modelled accurately using the HH formulation. If the authors really wish to assert that it is ”not suitable to predict the effects caused by AP [collision]” (p9) they need to provide a good deal more analysis to establish the mechanism of failure.

      We are not as convinced as the reviewer that, at the current state of parameter estimation, the HH model is suited for predicting ephaptic coupling after ’adjusting’ parameters. There are strong arguments against such an approach. A major function of a model is to make testable predictions rather than to just mimic a biological phenomenon. The predictive power of a model heavily depends on how reasonable model parameters can be estimated or measured. As the reviewer correctly points out in the specific comments (”... the parameters adjusted to fit the model are the membrane capacitance and intracellular resistance. These have a physical reality and could easily be measured or estimated quite accurately...”), our model contains only parameters that can be assessed experimentally, thus it has a better predictive power compared to the HH model with a multitude of parameters for which ”no direct characterisations already exist or cannot be obtained” (citing reviewer from above).

      Already the founders of the HH model were well aware of the limitations, as stated by Hodgkin and Huxley in 1952 (J Physiol 117:500–544):

      An equally satisfactory description of the voltage clamp data could no doubt have been achieved with equations of very different form ... The success of the equations is no evidence in favour of the mechanism of permeability change that we tentatively had in mind when formulating them.

      A catchy but sloppy description for the problem of overfitting with too many parameters is given by the quote of John von Neumann: With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.

      We do not rule out the possibility that the HH model eventually can be used to predict ephaptic coupling. However, at the moment, parameter estimation for the HH model prevents its usability for predicting ephaptic coupling.

      (In)applicability of the superposition principle

      The reflecting boundary at the terminal is implemented using the symmetry of the collision of action potentials. However, at a closed cable there is no reflecting boundary in the extracellular space and this implied assumption is particularly inappropriate where the extracellular field is one objective of the modelling, as here. I believe this assumption is not problematic for the calculation of the intracellular voltage, because extracellular voltage gradients can usually be neglected, but the authors need to explain how the issue was dealt with for the calculation of the extracellular fields of terminals. I assume they were calculated from the membrane currents of one-half of the collision solution, but this does not seem to be explained. It might be worth showing a spatial profile of the calculated field.

      We disagree with the reviewer’s statement ’...at a closed cable there is no reflecting boundary in the extracellular space and this implied assumption is particularly inappropriate...’. We do not imply this assumption in our model! We do not assume any symmetry or boundary condition in the extracellular space. Instead, the extracellular field is calculated for an infinite homogeneous volume conductor (Eq. 6).

      We conduct separate calculations for (1) source membrane current, (2) resulting extracellular field, and (3) impact upon a target neuron. The boundary condition used for our calculations only refers to the axial current being zero at the axon terminal. Consequently all the internal current that enters the last compartment must leave the last compartment as membrane current and contributes to the extracellular current and field.
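
      Step (2) of this procedure can be sketched as follows. Eq. 6 itself is not reproduced in this response, so the sketch assumes the standard point-source expression for an infinite homogeneous volume conductor, phi = rho_e / (4 pi) * sum(I_k / r_k). The compartment positions and membrane currents are hypothetical values chosen for illustration; the extracellular resistivity of 100 Ohm m is the value quoted later in this response.

```python
import math

RHO_E = 100.0  # extracellular resistivity in Ohm*m (value quoted in the response)

def extracellular_potential(point, sources, rho_e=RHO_E):
    """Potential (V) at `point` from point current sources in an infinite
    homogeneous volume conductor: phi = rho_e / (4*pi) * sum(I_k / r_k).

    `sources` is a list of ((x, y, z), current) pairs, in metres and amperes.
    """
    total = 0.0
    for position, current in sources:
        total += current / math.dist(point, position)
    return rho_e * total / (4.0 * math.pi)

# Hypothetical membrane currents of three compartments along the x-axis.
# They sum to zero, as required by charge conservation over the whole cable.
sources = [
    ((0.0, 0.0, 0.0), 1.0e-9),
    ((10e-6, 0.0, 0.0), -0.4e-9),
    ((20e-6, 0.0, 0.0), -0.6e-9),
]

# Potential 5 micrometres to the side of the first compartment
phi = extracellular_potential((0.0, 5e-6, 0.0), sources)
```

      No boundary or symmetry condition enters this calculation; the only inputs are the membrane currents, which is consistent with the point made above that the zero-axial-current condition applies inside the axon only.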

      The extracellular field around the axon terminal is not symmetric, as can be seen from its impact upon a target in Figure 4—figure supplement 1, which is also not symmetric. The symmetry of the extracellular field when APs are colliding (cf. symmetry in Fig 1C) is merely the result of the symmetric stimulation and counterpropagation of two APs. We now describe the boundary condition for colliding and terminating APs more specifically, already in the introduction: ’A suitable boundary condition (intracellular, axial current equals zero) can be generated experimentally by a collision of two counter-propagating APs ... Within any cable model, the very same boundary condition also exists within the axon at the synaptic terminal due to the broken translation symmetry for the current loops ...’ Later, in the Results section (Discharge of colliding APs), we continue with ’AP propagation is blocked when the axial current is shut down at a boundary condition, e.g. by reaching the axon terminal or by AP collision....’ and implement this condition in our calculations for the axon terminals.

      Missing demonstrations

      Central analytical results are stated rather brusquely, notably equations (3) and (4) and the relation between them. These merit an expanded explanation at the least. A better explanation of the need for the collision measurements in parameterising the models should also be provided.

      We thank the reviewer for pointing out the insufficient explanation of the equations 3 and 4. We rephrased the paragraph ’Discharge of colliding APs’ in order to clarify the origin and the function of the two equations (eq. 3: how much charge is expelled and eq. 4: the resulting extracellular potential that is used for model validation).

      Later, in the Discussion, we rephrased the paragraph where we describe the annihilation process and explain further that one term of eq. 4 is sometimes referred to as the ’activating function’ when using microelectrodes for stimulation.

      With respect to the ’explanation of the need for the collision measurement’, we think that the explanations we give at several locations in the manuscript are sufficient as is. We explain and elaborate in the introduction: ’We explore the behaviour of APs at boundaries ... In this study, we first focus on collisions of APs. Our experimental observation of colliding APs provides unique access to the spatial profile of the extracellular potential around APs that are blocked by collisions and thus annihilate..... Recording propagating APs allows to determine both the propagation velocity and the amplitude of the extracellular electric potentials. The collision experiment provides additional information ...’ In the Results we recall: ’The width of the collision is a measure of the characteristic length λ⋆ of the AP and is uniquely revealed by a collision sweep experiment.’

      Adjusted parameters

      I am uncomfortable that the parameters adjusted to fit the model are the membrane capacitance and intracellular resistance. These have a physical reality and could easily be measured or estimated quite accurately. With a variation of more than 20-fold reported between the different models in Appendix 2 we can be sure that some of the models are based upon quite unrealistic physical assumptions, which in turn undermines confidence in their generality.

      The fact that the parameters of our model have physical realities is clearly in favor of our models. We rephrased the legend of the table, now explaining the procedure for the model fitting and the rationale behind it. Although the values of g⋆ can differ by a factor of 15 and the resulting amplitude is very different, the relationship r_i c_m = v_p λ⋆ is very similar, independently of the model used, and this confirms our analytical framework.

      p8 - the values of both the extracellular (100 Ohm m) and intracellular resistivity (1 Ohm m) appear to be in error, especially the former.

      We have the following justification for the resistivity values we used. For the intracellular resistivity, literature values range from 0.4 to 1.5 Ohm m, and therefore we selected 1 Ohm m. See: Carpenter et al (1975) doi: 10.1085/jgp.66.2.139; Cole et al (1975) doi: 10.1085/jgp.66.2.133; Bekkers (2014) doi: 10.1007/978-1-4614-7320-6_35-2.

      Estimating extracellular resistivity is less straightforward, since it depends crucially on the structure around the synapse, which consists of conducting saline and insulating fatty tissue. Ranges from 3 to 600 Ohm m are reported (Linden et al (2011) doi: 10.1016/j.neuron.2011.11.006; Bakiri et al (2011) doi: 10.1113/jphysiol.2010.201376). Weiss et al (2008; doi: 10.1073/pnas.0806145105) report extracellular resistivities in the Mauthner cell axon cap between 50 and 600 Ohm m in their SI. Since the pinceau is structurally similar to the Mauthner cell’s axon cap, we argue that a value of 100 Ohm m is a reasonable choice for our calculations. Additionally, we derived a value from Blot and Barbour (doi: 10.1038/nn.3624), rephrased the paragraph in the main text and added our calculation to the supplementary material (Appendix 1).

      (In)applicability to axon terminals

      The rationale of the application of the collision formalism to axon terminals is somewhat undermined by the fact that they tend not to be excitable. There is experimental evidence for this in the Calyx of Held and the cerebellar pinceau.

      The solution found via collision is therefore not directly applicable in these cases.

      We do not agree with the reviewer’s statement that ’the solution found via collision is (therefore) not directly applicable...’. Our model is well suited for application to axon terminals that are not excitable, e.g. the pinceau of the basket cell, as the reviewer points out. We have included a calculation for this case and present the results in the new Fig. 5 (main text lines 264 to 284).

      Comparison with experimental data

      More effort should be made to compare the modelling with the extracellular terminal fields that have been reported in the literature.

      As outlined above (see: Response to reviewer 2), we now compare directly the predictions of our models with measured modulation of AP rates in Purkinje fibers (Blot and Barbour 2014) and our results are included in the manuscript (Fig. 5 and main text lines 264 to 284). See also our response to reviewer 2 in which we address how our model ’of ephaptic effects can quantitatively explain key features of experimental data’.

      Choice of term ”annihilation”

      The term annihilation does not seem wholly appropriate to me. The dictionary definitions are something along the lines of complete destruction by an external force or mutual destruction, for example of an electron and a positron. I don’t think either applies exactly here. I suggest retaining the notion of collision which is well understood in this context.

      Experimentally, we generated a collision of APs and showed that colliding APs disappear and do not pass each other. For this process the term annihilation is used in our and in other studies (see e.g. Berg et al (2017) doi: 10.1103/PhysRevX.7.028001; Johnson et al (2018) doi: 10.3389/fphys.2018.00779; Follmann (2015) doi: 10.1103/PhysRevE.92.032707; Shrivastava et al (2018) doi: 10.1098/rsif.2017.0803). The physical processes involved in the termination of an AP at a closed end are essentially identical to those of two colliding APs. This, we think, justifies using the term annihilation for those processes.

      Recommendations for the authors:

      We believe the work is of high quality and should motivate future experimental work. We are including the review comments here for your information. The main piece of feedback we are offering is that the broad claims need to be adjusted to the strength of evidence provided: as is, the manuscript provides compelling predictions but the claim that these predictions are in full agreement with data remains to be substantiated. A technical concern raised by the reviewers is that the reflecting boundary condition may need further justification. The authors may wish to respond to this issue in a rebuttal and/or adjust the manuscript as necessary.

      We substantiated our claim that our predictions are in full agreement with experimental data. We added to the manuscript a section in which we compare our models’ predictions to published experimental data. To this end, we extracted data from the publication of Blot and Barbour (2014), elaborated on the parameters used and ran our model accordingly. We added to the Results/Model of ephaptic coupling a paragraph on ’The modulation of activity in Purkinje cells...’ (line 264), where we describe our results, and we also included another figure in the main text for illustration (Fig. 5).

      We clarified the term ’boundary condition’ by rephrasing parts of the introduction, and we explain the rationale behind it in ’Discharge of colliding APs’ (...AP propagation is blocked when axial current is shut down...) and in ’Model of ephaptic coupling’ (Within any cable model, the same boundary...). See also our response to the general comments of reviewer 3 above.

      Reviewer 1 (Recommendations For The Authors):

      Major:

      Accessing data and code requires signing in, which should not be required. The link provided also seems to be not accessible yet - could be pending review.

      The repository is now publicly available. We did provide an access code within the letter to the editor; this code is no longer required.

      Line 74: how about morphology? Authors should clarify and emphasize in the introduction that the TM model is a spatially continuous model with partial differential equations as opposed to discrete morphological models to simulate HH equations.

      The reviewer is correct that the TM model is continuous. However, so is the HH model. The difference between HH and TM is only that the TM model can be solved analytically, which yields a spatially homogeneous analytical solution. It should be noted that this analytical solution can only be valid for a homogeneous (therefore infinite) nerve. Every numerical computation, be it HH or TM, requires a finite number of discrete compartments. In our calculations, we used identical compartment models for the HH, TM and RTM models. In each compartment, the differential equations are solved numerically. Since there is no fundamental difference between these models, we refrain from changing the text.

      Minor:

      Major typo: ventral nerve cord, not ”chord”. Repeated in several places.

      Thank you for indicating this typo to us.

      Line 25: inhibition, excitation, and modulation?

      We changed the line to: ... leads to modulation, e.g. excitation or inhibition

      Line 70: better term for ”length” of AP would be ”duration”. Also, the sentence could be simplified to use either ”its” or ”of the AP”

      Space and time are not interchangeable. Thus, the term length cannot be replaced by duration. We simplified the structure of the sentence as suggested.

      Fig 1A/B: it’s strange that panel B precedes panel A.

      Exchanged

      Fig 1C: don’t see the ”horizontal line”; also regarding ”The recording was at a medial position”, the caption is not clear until one reads the main text.

      We changed the legend to: ... The collision is captured in the recording line at y-position 0 mm, while orthodromic propagation is at the top and antidromic propagation is at the bottom. (D) The peak amplitude as a function of the distance to the collision. Examples of four sweeps at three positions along the nerve cord....

      Line 127: the per distance measures could be named as ”specific” conductivity, etc.

      We explicitly provide the units, thereby defining the quantities unambiguously.

      Line 176: typo ”ad-hoc”.

      Thank you.

      Fig 4B: should clarify that the circle in the schematic is not the soma but a synaptic bouton.

      We rephrased to ’...(B,C) when the AP is annihilating at a bouton of a neuron terminal (upper neuron in end-to-shaft geometry, similar to the Basket cell–Purkinje cell synapse)...’, and we added a label to Fig 4B.

      Reviewer 2 (Recommendations For The Authors):

      Can the authors’ model be quantitatively compared with experimental data of ephaptic interactions at synapses (e.g. the Blot & Barbour study described in the Discussion)?

      We did so as outlined in our response to the reviewer above.

      Can statistical evidence be provided that the velocities of anti- and orthodromic APs are indeed identical in the earthworm nerve recordings?

      These data and statistics are available in Appendix 2, now 3 – table 1

      Why not reorder ABCD in Fig1 so the subpanels run from left to right?

      We adjusted the labels accordingly.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      This paper contains what could be described as a "classic" approach towards evaluating a novel taste stimulus in an animal model, including standard behavioral tests (some with nerve transections), taste nerve physiology, and immunocytochemistry of the tongue. The stimulus being tested is ornithine, from a class of stimuli called "kokumi", which are stimuli that enhance other canonical tastes, essentially increasing the hedonic attributes of these other stimuli; the mechanism for ornithine detection is thought to be GPRC6A receptors expressed in taste cells. The authors showed evidence for this in an earlier paper with mice; this paper evaluates ornithine taste in a rat model.

      Strengths:

      The data show the effects of ornithine on taste: in two-bottle and briefer intake tests, adding ornithine results in a higher intake of most, but not all, stimuli tested. Bilateral nerve cuts or the addition of GPRC6A antagonists decrease this effect. Small effects of ornithine are shown in whole-nerve recordings.

      Weaknesses:

      The conclusion seems to be that the authors have found evidence for ornithine acting as a taste modifier through the GPRC6A receptor expressed on the anterior tongue. It is hard to separate their conclusions from the possibility that any effects are additive rather than modulatory. Animals did prefer ornithine to water when presented by itself. Additionally, the authors refer to evidence that ornithine is activating the T1R1-T1R3 amino acid taste receptor, possibly at higher concentrations than they use for most of the study, although this seems speculative. It is striking that the largest effects on taste are found with the other amino acid (umami) stimuli, leading to the possibility that these are largely synergistic effects taking place at the tas1r receptor heterodimer.

      We would like to thank Reviewer #1 for the valuable comments. Our basis for considering ornithine as a taste modifier stems from our observation that a low concentration of ornithine (1 mM), which does not elicit a preference on its own, enhances the preference for umami substances, sucrose, and soybean oil through the activation of the GPRC6A receptor. Notably, this receptor is not typically considered a taste receptor. The reviewer suggested that the enhancement of umami taste might be due to potentiation occurring at the TAS1R receptor heterodimer. However, we propose that a different mechanism may be at play, as an antagonist of GPRC6A almost completely abolished this enhancement. In the revised manuscript, we will endeavor to provide additional information on the role of ornithine as a taste modifier acting through the GPRC6A receptor.

      Reviewer #2 (Public review):

      Summary:

      The authors used rats to determine the receptor for a food-related perception (kokumi) that has been characterized in humans. They employ a combination of behavioral, electrophysiological, and immunohistochemical results to support their conclusion that ornithine-mediated kokumi effects are mediated by the GPRC6A receptor. They complemented the rat data with some human psychophysical data. I find the results intriguing, but believe that the authors overinterpret their data.

      Strengths:

      The authors examined a new and exciting taste enhancer (ornithine). They used a variety of experimental approaches in rats to document the impact of ornithine on taste preference and peripheral taste nerve recordings. Further, they provided evidence pointing to a potential receptor for ornithine.

      Weaknesses:

      The authors have not established that the rat is an appropriate model system for studying kokumi. Their measurements do not provide insight into any of the established effects of kokumi on human flavor perception. The small study on humans is difficult to compare to the rat study because the authors made completely different types of measurements. Thus, I think that the authors need to substantially scale back the scope of their interpretations. These weaknesses diminish the likely impact of the work on the field of flavor perception.

      We would like to thank Reviewer #2 for the valuable comments and suggestions. Regarding the question of whether the rat is an appropriate model system for studying kokumi, we have chosen this species for several reasons: it is readily available as a conventional experimental model for gustatory research; the calcium-sensing receptor (CaSR), known as the kokumi receptor, is expressed in taste bud cells; and prior research has demonstrated the use of rats in kokumi studies involving gamma Glu-Val-Gly (Yamamoto and Mizuta, Chem. Senses, 2022). We acknowledge that fundamentally different types of measurements were conducted in the human psychophysical study and the rat study. Kokumi can indeed be assessed and expressed in humans; however, we do not currently have the means to confirm that animals experience kokumi in the same way that humans do. Therefore, human studies are necessary to evaluate kokumi, a conceptual term denoting enhanced flavor, while animal studies are needed to explore the potential underlying mechanisms of kokumi. We believe that a combination of both human and animal studies is essential, as is the case with research on sugars. While sugars are known to elicit sweetness, it is unclear whether animals perceive sweetness identically to humans, even though they exhibit a strong preference for sugars. In the revised manuscript, we will incorporate additional information to address the comments raised by the reviewer. We will also carefully review and revise our previous statements to ensure accuracy and clarity.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to investigate whether GPRC6A mediates kokumi taste initiated by the amino acid L-ornithine. They used Wistar rats, a standard laboratory strain, as the primary model and also performed an informative taste test in humans, in which miso soup was supplemented with various concentrations of L-ornithine. The findings are valuable and overall the evidence is solid. L-Ornithine should be considered to be a useful test substance in future studies of kokumi taste and the class C G protein-coupled receptor known as GPRC6A (C6A) along with its homolog, the calcium-sensing receptor (CaSR) should be considered candidate mediators of kokumi taste.

      Strengths:

      The overall experimental design is solid based on two bottle preference tests in rats. After determining the optimal concentration for L-Ornithine (1 mM) in the presence of MSG, it was added to various tastants, including inosine 5'-monophosphate; monosodium glutamate (MSG); mono-potassium glutamate (MPG); intralipos (a soybean oil emulsion); sucrose; sodium chloride (NaCl); citric acid and quinine hydrochloride. Robust effects of ornithine were observed in the cases of IMP, MSG, MPG, and sucrose, and little or no effects were observed in the cases of sodium chloride, citric acid, and quinine HCl. The researchers then focused on the preference for Ornithine-containing MSG solutions. The inclusion of the C6A inhibitors Calindol (0.3 mM but not 0.06 mM) or the gallate derivative EGCG (0.1 mM but not 0.03 mM) eliminated the preference for solutions that contained Ornithine in addition to MSG. The researchers next performed transections of the chorda tympani nerves (with sham operation controls) in anesthetized rats to identify the role of the chorda tympani branches of the facial nerves (cranial nerve VII) in the preference for Ornithine-containing MSG solutions. This finding implicates the anterior half to two-thirds of the tongue in ornithine-induced kokumi taste. They then used electrical recordings from intact chorda tympani nerves in anesthetized rats to demonstrate that ornithine enhanced MSG-induced responses following the application of tastants to the anterior surface of the tongue. They went on to show that this enhanced response was insensitive to amiloride, selected to inhibit 'salt tastant' responses mediated by the epithelial Na+ channel, but eliminated by Calindol. Finally, they performed immunohistochemistry on sections of rat tongue demonstrating C6A-positive spindle-shaped cells in fungiform papillae that partially overlapped in their distribution with the IP3 type-3 receptor, used as a marker of Type-II cells, but not with (i) gustducin, the G protein partner of Tas1 receptors (T1Rs), used as a marker of a subset of type-II cells; or (ii) 5-HT (serotonin) and Synaptosome-associated protein 25 kDa (SNAP-25) used as markers of Type-III cells.

      Weaknesses:

      The researchers undertook what turned out to be largely confirmatory studies in rats with respect to their previously published work on Ornithine and C6A in mice (Mizuta et al Nutrients 2021).

      The authors point out that animal models pose some difficulties of interpretation in studies of taste and raise the possibility in the Discussion that umami substances may enhance the taste response to ornithine (Line 271, Page 9). They miss an opportunity to outline the experimental results from the study that favor their preferred interpretation that ornithine is a taste enhancer rather than a tastant.

      At least two other receptors in addition to C6A might mediate taste responses to ornithine: (i) the CaSR, which binds and responds to multiple L-amino acids (Conigrave et al, PNAS 2000), and which has been previously reported to mediate kokumi taste (Ohsu et al., JBC 2010) as well as responses to Ornithine (Shin et al., Cell Signaling 2020); and (ii) T1R1/T1R3 heterodimers which also respond to L-amino acids and exhibit enhanced responses to IMP (Nelson et al., Nature 2001). While the experimental results as a whole favor the authors' interpretation that C6A mediates the Ornithine responses, they do not make clear either the nature of the 'receptor identification problem' in the Introduction or the way in which they approached that problem in the Results and Discussion sections. It would be helpful to show that a specific inhibitor of the CaSR failed to block the ornithine response. In addition, while they showed that C6A-positive cells were clearly distinct from gustducin-positive, and thus T1R-positive cells, they missed an opportunity to clearly differentiate C6A-expressing taste cells and CaSR-expressing taste cells in the rat tongue sections.

      It would have been helpful to include a positive control kokumi substance in the two-bottle preference experiment (e.g., one of the known gamma-glutamyl peptides such as gamma-glu-Val-Gly or glutathione), to compare the relative potencies of the control kokumi compound and Ornithine, and to compare the sensitivities of the two responses to C6A and CaSR inhibitors.

      The results demonstrate that enhancement of the chorda tympani nerve response to MSG occurs at substantially greater Ornithine concentrations (10 and 30 mM) than were required to observe differences in the two bottle preference experiments (1.0 mM; Figure 2). The discrepancy requires careful discussion and if necessary further experiments using the two-bottle preference format.

      We would like to thank Reviewer #3 for the valuable comments and helpful suggestions. We propose that ornithine has two stimulatory actions: one acting on GPRC6A, particularly at lower concentrations, and another on amino acid receptors such as T1R1/T1R3 at higher concentrations. Consequently, ornithine is not preferable at lower concentrations but becomes preferable at higher concentrations. For our study on kokumi, we used a low concentration (1 mM) of ornithine. The possibility mentioned in the Discussion that 'the umami substances may enhance the taste response to ornithine' is entirely speculative. We will reconsider including this description in the revised version. As the reviewer suggested, in addition to GPRC6A, ornithine may bind to CaSR and/or T1R1/T1R3 heterodimers. However, we believe that ornithine mainly binds to GPRC6A, as a specific inhibitor of this receptor almost completely abolished the enhanced response to umami substances, and our immunohistochemical study indicated that GPRC6A-expressing taste cells are distinct from CaSR-expressing taste cells (see Supplemental Fig. 3). We conducted essentially the same experiments using gamma-Glu-Val-Gly in Wistar rats (Yamamoto and Mizuta, Chem. Senses, 2022) and compared the results in the Discussion. The reviewer may have misunderstood the chorda tympani results: we added the same concentration (1 mM) used in the two-bottle preference test to MSG (Fig. 5-B). Fig. 5-A shows nerve responses to five concentrations of plain ornithine. In the revised manuscript, we will strive to provide more precise information reflecting the reviewer’s comments.

    1. Author response:

      We thank both reviewers for their considerate reviews. In this provisional response we would like to make a few key points.

      Given that we introduced a bespoke likelihood model for the second dataset, Reviewer 1 asks whether "every unique dataset requires a tailored prior or likelihood to produce the best results". Our intention is to advocate for the horseshoe prior model as a 'standard' first analysis for any cell count dataset. If extra knowledge about the data is available, or if any data artefacts are detected, more elaborate likelihoods could be introduced as needed in a follow-up analysis. Our introduction of the zero-inflated Poisson likelihood for the second dataset was one such example, but many alternatives could exist. This iterative approach to model building, sometimes referred to as a 'Bayesian workflow', is seen as good practice in the Bayesian data analysis literature. In the revised version of the paper, we will try to explain the recommendations and modelling philosophy behind this method while emphasising that tailoring or bespoke modelling is not required for our 'standard analysis', which we would regard as the Bayesian replacement for a t-test on counts.
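      As a rough illustration of what a zero-inflated Poisson likelihood adds over a plain Poisson, the sketch below computes the log-probability of a single count. It is a minimal example under stated assumptions, not the model from the paper: the function name `zip_log_pmf` and the parameters `rate` and `zero_prob` are hypothetical, and the hierarchical structure and horseshoe prior are not shown.

```python
import math


def zip_log_pmf(count, rate, zero_prob):
    """Log-probability of `count` under a zero-inflated Poisson.

    rate      -- mean of the Poisson component (hypothetical name)
    zero_prob -- probability of a structural zero (hypothetical name)
    """
    if count < 0 or rate <= 0 or not 0 <= zero_prob < 1:
        raise ValueError("invalid count, rate, or zero_prob")
    if count == 0:
        # A zero can be structural or come from the Poisson component.
        return math.log(zero_prob + (1 - zero_prob) * math.exp(-rate))
    # Non-zero counts can only come from the Poisson component.
    poisson_log_pmf = count * math.log(rate) - rate - math.lgamma(count + 1)
    return math.log1p(-zero_prob) + poisson_log_pmf
```

      With `zero_prob` set to 0 the function reduces to the ordinary Poisson log-probability, which is one way to see that the zero-inflated form is an extension of the standard likelihood and not a replacement for it.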

      Reviewer 1 notes that "the differences between the results produced by the two Bayesian models in case study 2 are not discussed". We agree that this discrepancy, arising from the specific assumptions of each model, is an interesting issue which we should better explore in the paper. In Figure 6 we plotted the actual data values alongside posterior and confidence intervals to explain how the results from the ZIP likelihood and Horseshoe prior compare with those from a t-test. However, our example regions did not highlight cases where differences could be noted between the two Bayesian models. In the revised version of the paper, we will extend Figure 6 to include further brain regions, such as those mentioned by the referee, and will use that as an opportunity to discuss the broader issue of what to do when the Bayesian models give conflicting results.

      We agree with reviewer 2's point that the model description terminology could be made clearer for the target eLife audience. We tried to strike a balance between introducing the reader to the conventional technical terminology used in the Bayesian data analysis necessary for understanding the model while avoiding exhaustive statistical terminology. We erred too much on the side of the latter instead of providing clear links between the model construction and experimental data. In the revised version of the paper, we will augment any technical terms with more biological language and provide a Glossary for reader reference.

    1. Author response:

      Reviewer #1:

      We agree with Reviewer 1 that the flexibility of SPRAWL also makes it difficult to interpret its outputs. We consider SPRAWL to be a hypothesis-generation tool to answer simple questions of subcellular localization in a statistically robust manner. In this paper we include examples of how it can be incorporated with other tools and wetlab experimentation to build biological intuition. Our hope is that the SPRAWL software, or even the underlying simple statistical ideas, are of use to others in the field.

      Reviewer #2:

      We agree with Reviewer #2 that this manuscript does not demonstrate biological significance of the observed results of applying SPRAWL to massively multiplexed FISH datasets. We agree it would require additional wetlab experiments such as cell-type specific and isoform-resolved fluorescence in-situ hybridization, which we consider beyond the scope of this paper. We believe that the observed correlations of subcellular localization detected by SPRAWL and the differential 3’ UTR usage detected by ReadZS are compelling, although not conclusive, as are the Timp3 experimental studies.

      Our understanding is that Baysor is primarily a cell-segmentation algorithm, which is not what SPRAWL attempts to achieve. The Baysor authors state that “cells of a distinct type will give rise to small molecular neighborhoods with stereotypical transcriptional composition, making it possible to interpret such neighborhoods without performing explicit cell segmentation”, which we understand to mean that Baysor identifies spatial groupings of cells with “stereotypical transcriptional composition” rather than subcellular RNA localization. We do not think that SPRAWL and Baysor are comparable; instead, Baysor could be used as an upstream step to SPRAWL to potentially improve cell segmentation.

      Reviewer #3:

      We thank Reviewer #3 for identifying discrepancies in the paper which we addressed to the best of our abilities.

    1. Author response:

      Reviewer 1:

      Many thanks for your positive review and clear overview of our paper. We also agree with your interpretation of our results that ‘the information that is decodable and the information that is task-relevant may relate in very different ways’ and we could have emphasised this point more in the paper.

      With regards to the qualitative similarities between our models and our data, we agree that, because one can achieve any desired level of activity, decoding accuracy, performance, etc. in a model, we focussed on changes over learning of key metrics that are commonly used in the field. Although this can appear qualitative at times because the raw values can differ between the data and our models, our main results are ultimately strongly quantitative (e.g., Fig. 3c,d, and Fig. 5f). We note that we could have fine-tuned the models to have similar activity levels, decoding accuracies, etc. to our data, and on the face of it this may have made the results appear more convincing, but we felt that such trivial fine-tuning does not change any of our key results in any fundamental way and is not the aim of computational modelling. The model one chooses to analyse will always be abstracted from biology in some way, by definition.

      Reviewer 2:

      Thank you very much for your kind comments and clear overview of our paper. We also hope that our paper ‘provides a valuable analysis of the effect of two parameters on representations of irrelevant stimuli in trained RNNs.’

      With regards to our suggested mechanism of suppressing dynamically irrelevant stimuli, we are sorry that we did not provide a sufficient explanation of how color representations are suppressed when they are irrelevant. We provide a longer explanation here. Our mechanism of suppression of dynamically irrelevant stimuli does not suggest that the stimulus becomes un-suppressed later; only the behaviourally relevant variable should be decodable when it is needed (i.e., XOR). Although color decodability did increase slightly in the data and some of the models from the color period to the shape period, it was typically not significant and was therefore not a result that we emphasise in the paper (although this could be analysed further to see if additional mechanisms might explain it). We emphasise throughout that color decoding is typically similar between color and shape periods (either high or low) and either decreases or increases over time in both periods. We also focus on whether color decodability increases or decreases over learning during the color period when it is irrelevant (which we call ‘early color decoding’). Importantly, decoding of color or shape is not needed to perform the task; only decoding of XOR is needed. For example, in our two-neuron networks, we observe perfect XOR decoding and only 75% decoding of color and shape, and decoding during the shape period is the same as the network at initialisation before any training. The mechanism we suggest of suppressing dynamically irrelevant stimuli does not predict that that stimulus should be un-suppressed later; only the behaviourally relevant variable should be decodable (i.e., XOR).
      Instead, what we try to explain is that color inputs can generate 0 firing rate during the color period, when that input does not need to be used and is therefore irrelevant (and color decoding decreases during the color period over learning), but these inputs can be combined with shape inputs later to create a perfectly decodable XOR response.
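      The idea that an input can produce zero firing rate on its own and still contribute to a later XOR response can be sketched with two hand-wired rectified-linear units. This is a static toy example under our own assumptions, not the trained recurrent networks from the paper: the weights, the bias of 1, the ±1 input coding, and the function names are all hypothetical.

```python
def relu(x):
    """Rectified-linear firing-rate nonlinearity."""
    return max(0.0, x)


def unit_rates(color, shape):
    """Firing rates of two hand-wired units (hypothetical weights).

    color and shape take values in {-1, +1}; 0 means the input is absent,
    as for shape during the color period.
    """
    return (relu(color - shape - 1.0), relu(shape - color - 1.0))


def decode_xor(rates):
    """XOR is read out as 'any unit is active'."""
    return sum(rates) > 0.0


# Color period: shape is absent, and both colors give zero firing rate.
color_period = [unit_rates(color, 0) for color in (-1, 1)]

# Shape period: the same color input now combines with shape to give XOR.
xor_readout = {
    (color, shape): decode_xor(unit_rates(color, shape))
    for color in (-1, 1)
    for shape in (-1, 1)
}
```

      In this toy, the color input alone never pushes either unit above threshold, so color cannot be read out from firing rates during the color period, yet the XOR readout is correct for all four color and shape combinations once the shape input arrives.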

      With regards to interpretation of our results based on metabolic cost constraints, we feel that this is an unnecessarily strong criticism to say that it ‘is not backed up by the presented data/analyses.’ All of our models were trained with only a metabolic cost constraint, a noise strength, and a task performance term. Therefore, the results of the models are directly attributable to the strength of metabolic cost that we use. Additionally, although one could in principle pick any of infinitely many different parameters to change and measure the response in an optimized network, varying metabolic cost and noise are two of the most fundamental phenomena that neural circuits must contend with, and many studies have analysed the impact they have on neural circuit dynamics. Furthermore, in line with previous studies (Yang et al., 2019, Whittington et al., 2022, Sussillo et al., 2015, Orhan et al., 2019, Kao et al., 2021, Cueva et al., 2020, Driscoll et al., 2022, Song et al., 2016, Masse et al., 2019, Schimel et al., 2023), we operationalized metabolic cost in our models through L2 firing rate regularization. This cost penalizes high overall firing rates. (Such an operationalization of metabolic cost also makes sense for our models because network performance is based on firing rates rather than subthreshold activities.) There are however alternative conceivable ways to operationalize a metabolic cost; for example L1 firing rate regularization has been used previously when optimizing neural networks and promotes more sparse neural firing. Interestingly, although our L2 is generally conceived to be weaker than L1 regularization, we still found that it encouraged the network to use purely sub-threshold activity in our task. The regularization of synaptic weights may also be biologically relevant because synaptic transmission uses the most energy in the brain compared to other processes (Faria-Pereira et al., 2022, Harris et al., 2012). 
      Additionally, even subthreshold activity could be regularized as it also consumes energy (although orders of magnitude less than spiking (Zhu et al., 2019)). Therefore, future work will be needed to examine how different metabolic costs affect the dynamics of task-optimized networks.
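      The difference between the L2 and L1 operationalizations of metabolic cost discussed above can be made concrete with a short sketch. The function names, the `weight` parameter, and the averaging over units are assumptions made for illustration; the exact form of the regularizer used in the models is not reproduced here.

```python
def l2_rate_cost(rates, weight):
    """Mean squared firing rate, scaled by a cost weight (hypothetical form)."""
    return weight * sum(r * r for r in rates) / len(rates)


def l1_rate_cost(rates, weight):
    """Mean absolute firing rate, scaled by a cost weight (hypothetical form)."""
    return weight * sum(abs(r) for r in rates) / len(rates)


# Two populations with the same total firing: one spread out, one sparse.
dense_rates = [1.0, 1.0, 1.0, 1.0]
sparse_rates = [4.0, 0.0, 0.0, 0.0]

l2_dense = l2_rate_cost(dense_rates, weight=1.0)
l2_sparse = l2_rate_cost(sparse_rates, weight=1.0)
l1_dense = l1_rate_cost(dense_rates, weight=1.0)
l1_sparse = l1_rate_cost(sparse_rates, weight=1.0)
```

      For these two example populations the L1 cost is identical, whereas the L2 cost is four times larger for the sparse population, because squaring penalizes high individual firing rates disproportionately. In either case the cost is zero when all firing rates are zero.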

      With regards to color representations in PFC only qualitatively matching those in our models, in line with the comment from Reviewer 1, we agree that, because one can achieve any desired level of activity, decoding accuracy, performance, etc. in a model, we focussed on changes over learning of key metrics that are commonly used in the field. Although this can appear qualitative at times because the raw values can differ between the data and our models, our main results are ultimately strongly quantitative (e.g., Fig. 3c,d, and Fig. 5f). We note that we could have fine-tuned the models to have similar activity levels, decoding accuracies, etc. to our data, and on the face of it this may have made the results appear more convincing, but we felt that such trivial fine-tuning does not change any of our key results in any fundamental way and is not the aim of computational modelling. The model one chooses to analyse will always be abstracted from biology in some way, by definition. Finally, of course we note that changes in color decoding could result from other causes, but we focussed on two key phenomena that neural circuits must contend with: noise and metabolic costs. Therefore, it is likely that these two variables play a strong role in stimulus representations in neural circuits.

      Reviewer 3:

      Thank you very much for your thorough and clear overview of our paper and we agree that it is important to investigate phenomena and manipulations in computational models that are almost impossible to do in vivo and we are pleased you found our mathematical analyses rigorous and nicely documented.

      Although we agree that it can be useful to study the responses of individual neurons, we focussed on population analyses of all available neurons without omitting or specifically selecting neurons based on their dynamics. We are also not suggesting that the activities of individual ‘neurons’ in the models and data should be similar, since our models are highly abstract firing rate models. Rather, the overall computational strategy, which one can access through population decoding and cross-generalised decoding, was what we were interested in comparing between the models and the data, and is arguably the correct level of analysis of such models (and data) given our key questions (Vyas et al., 2020, Churchland et al., 2012, Mante et al., 2013, Ebitz et al., 2021).

      We also certainly agree and are more than open to the fact that suppression of irrelevant stimuli may already be happening on the inputs arriving in PFC. Indeed, we actually suggest this as the mechanism in Fig. 5 (together with recurrent circuit dynamics that make use of these inputs).

      With regards to the dynamics of the two-neuron networks not being ‘informative of what happens in brain networks’, we agree that these models are very simplified and may only contain very fundamental similarities with biological neurons. However, we only used them to illustrate the fundamental mechanism of generating 0 firing rate during the color epoch so that it is more easily understandable for readers as they can see the entire 2-dimensional state space and the entire computational strategy can be seen (Fig. 5a-d). We also note that we did this for both rectified linear and tanh networks, thus showing that such a mechanism is preserved across fundamentally different firing rate nonlinearities. Additionally, after illustrating this fundamental mechanism of networks receiving color information but generating 0 firing rate, we show that the exact same mechanism is at play in the large networks we use throughout the paper (Fig. 5e). We also only compare the large networks to our neural recordings. We do agree though that it would be interesting to further compare fundamental similarities and differences between our models and our neural recordings (always at the right level of analysis that makes sense for our chosen models) to show that the mechanisms we uncover in our models are also strongly relevant for our data.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors have used full-length single-cell sequencing on a sorted population of human fetal retina to delineate expression patterns associated with the progression of progenitors to rod and cone photoreceptors. They find that rod and cone precursors contain a mix of rod/cone determinants, with a bias in both amounts and isoform balance likely deciding the ultimate cell fate. Markers of early rod/cone hybrids are clarified, and a gradient of lncRNAs is uncovered in maturing cones. Comparison of early rods and cones exposes an enriched MYCN regulon, as well as expression of SYK, which may contribute to tumor initiation in RB1 deficient cone precursors.

      Strengths:

      (1) The insight into how cone and rod transcripts are mixed together at first is important and clarifies a long-standing notion in the field.

      (2) The discovery of distinct active vs inactive mRNA isoforms for rod and cone determinants is crucial to understanding how cells make the decision to form one or the other cell type. This is only really possible with full-length scRNAseq analysis.

      (3) New markers of subpopulations are also uncovered, such as CHRNA1 in rod/cone hybrids that seem to give rise to either rods or cones.

      (4) Regulon analyses provide insight into key transcription factor programs linked to rod or cone fates.

      (5) The gradient of lncRNAs in maturing cones is novel, and while the functional significance is unclear, it opens up a new line of questioning around photoreceptor maturation.

      (6) The finding that SYK mRNA is naturally expressed in cone precursors is novel, as previously it was assumed that SYK expression required epigenetic rewiring in tumors.

      Weaknesses:

      (1) The writing is very difficult to follow. The nomenclature is confusing and there are contradictory statements that need to be clarified.

      (2) The drug data is not enough to conclude that SYK inhibition is sufficient to prevent the division of RB1 null cone precursors. Drugs are never completely specific so validation is critical to make the conclusion drawn in the paper.

      We thank the reviewer for describing the study’s strengths and weaknesses.  In the upcoming revision, we will:

      (1) improve the writing and clarify the nomenclature and contradictory statements, particularly those noted in the Reviewer’s Recommendations for Authors; and

      (2) scale back the claims related to the role of SYK in the cone precursor response to RB1 loss; we agree that genetic perturbation of SYK is required to prove its role and will perform such analyses in a separate study.

      Reviewer #2 (Public review):

      Summary:

      The authors used deep full-length single-cell sequencing to study human photoreceptor development, with a particular emphasis on the characteristics of photoreceptors that may contribute to retinoblastoma.

      Strengths:

      This single-cell study captures gene regulation in photoreceptors across different developmental stages, defining post-mitotic cone and rod populations by highlighting their unique gene expression profiles through analyses such as RNA velocity and SCENIC. By leveraging full-length sequencing data, the study identifies differentially expressed isoforms of NRL and THRB in L/M cone and rod precursors, illustrating the dynamic gene regulation involved in photoreceptor fate commitment. Additionally, the authors performed high-resolution clustering to explore markers defining developing photoreceptors across the fovea and peripheral retina, particularly characterizing SYK's role in the proliferative response of cones in the RB loss background. The study provides an in-depth analysis of developing human photoreceptors, with the authors conducting thorough analyses using full-length single-cell RNA sequencing. The strength of the study lies in its design, which integrates single-cell full-length RNA-seq, long-read RNA-seq, and follow-up histological and functional experiments to provide compelling evidence supporting their conclusions. The model of cell type-dependent splicing for NRL and THRB is particularly intriguing. Moreover, the potential involvement of the SYK and MYC pathways with RB in cone progenitor cells aligns with previous literature, offering additional insights into RB development.

      Weaknesses:

      The manuscript feels somewhat unfocused, with a lack of a strong connection between the analysis of developing photoreceptors, which constitutes the bulk of the manuscript, and the discussion on retinoblastoma. Additionally, given the recent publication of several single-cell studies on the developing human retina, it is important for the authors to cross-validate their findings and adjust their statements where appropriate.

      We thank the reviewer for summarizing the main findings and for noting the compelling support for the conclusions, the intriguing cell type-dependent splicing of rod and cone lineage factors, and the insights into retinoblastoma development. 

      We concur that some studies of developing photoreceptors were not well connected to retinoblastoma, which diminished the focus.  However, we suggest that it was valuable to highlight how deep, long read sequencing provided new insights into retinoblastoma. For example, our demonstration of similar rod- and cone-related gene expression in early cones and RB cells addressed concerns with the proposed cone cell-of-origin, adding disease relevance.

      We will address the Reviewer’s request to cross-validate our findings with those of other single-cell studies of developing human retina and to adjust the related statements in our upcoming revision.

      Reviewer #3 (Public review):

      Summary:

      The authors use high-depth, full-length scRNA-Seq analysis of fetal human retina to identify novel regulators of photoreceptor specification and retinoblastoma progression.

      Strengths:

      The use of high-depth, full-length scRNA-Seq to identify functionally important alternatively spliced variants of transcription factors controlling photoreceptor subtype specification, and identification of SYK as a potential mediator of RB1-dependent cell cycle reentry in immature cone photoreceptors.

      Human developing fetal retinal tissue samples were collected between 13-19 gestational weeks and this provides a substantially higher depth of sequencing coverage, thereby identifying both rare transcripts and alternative splice forms, and thereby representing an important advance over previous droplet-based scRNA-Seq studies of human retinal development.

      Weaknesses:

      The weaknesses identified are relatively minor. This is a technically strong and thorough study, that is broadly useful to investigators studying retinal development and retinoblastoma.

      We thank the reviewer for describing the strengths of the study. Our upcoming revision will address the minor concerns that were raised separately in the Reviewer’s Recommendations for Authors.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Minor Concern (Original Comment 1):

      “We think that this is sufficient to address our concern. Some citations may be in order to underpin the new text.”

      We appreciate the reviewer’s assessment that the revised text clarifies the complexity of the upstream circuitry beyond the retina, including inputs from the thalamus. As recommended, we have now included additional citations in the revised manuscript to support these points.

      Major Concern (Original Comment 5):

      “We do not feel that this important concern has been addressed. The stats are definitively negative. There is no statistical evidence from these data that multisensory integration is occurring in this assay. The anesthesia, paralysis, and low n may provide explanations for this negative result, but it is still a negative result (p=0.5269). To show two examples of multisensory integration for subthreshold stimuli fits the narrative, but this result is not supported. Examples where individual stimuli caused APs (and combined stimuli did not) also occurred, presumably, and at a rate that is statistically indistinguishable to the examples shown in Figure 5. As such, if results from this assay are going to be in the manuscript, acoustic-only and tectum-only examples should be shown as well, although they would not fit the narrative. To be meaningful, this experiment would have to show that multisensory integration is happening in this circuit. Frustrating though it must be, the experiment has given a negative result to that question.”

      We understand the reviewer’s concern regarding Figure 5C and the firing of action potentials (APs) in response to multisensory stimuli. We acknowledge that our assay is not suited to answer this question definitively and that our results do not provide statistical support for this hypothesis. In response, we have removed the examples previously shown in Figure 5C, along with the related description in the Results section (lines 420–426), to avoid implying unsupported integration in suprathreshold conditions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors describe the construction of an extremely large-scale anatomical model of juvenile rat somatosensory cortex (excluding the barrel region), which extends earlier iterations of these models by expanding across multiple interconnected cortical areas. The models are constructed in such a way as to maintain biological detail from a granular scale - for example, individual cell morphologies are maintained, and synaptic connectivity is founded on anatomical contacts. The authors use this model to investigate a variety of properties, from cell-type specific targeting (where the model results are compared to findings from recent large-scale electron microscopy studies) to network metrics. The model is also intended to serve as a platform and resource for the community by being a foundation for simulations of neuronal circuit activity and for additional anatomical studies that rely on the detailed knowledge of cellular identity and connectivity.

      Strengths:

      As the authors point out, the combination of scale and granularity of their model is what makes this study valuable and unique. The comparisons with recent electron microscopy findings are some of the most compelling results presented in the study, showing that certain connectivity patterns can arise directly from the anatomical configuration, while other discrepancies highlight where more selective targeting rules (perhaps based on molecular cues) are likely employed. They also describe intriguing effects of cortical thickness and curvature on circuit connectivity and characterize the magnitude of those effects on different cortical layers.

      The detailed construction of the model is drawn on a wide range of data sources (cellular and synaptic density measures, neuronal morphologies, cellular composition measures, brain geometry, etc.) that are integrated together; other data sources are used for comparison and validation. This consolidation and comparison also represent a valuable contribution to the overall understanding of the modeled system.

      We thank the reviewer for the kind comments.

      Weaknesses:

      The scale of the model, which is a primary strength, also can carry some drawbacks. In order to integrate all the diverse data sources together, many specific decisions must be made about, for example, translating findings from different species or regions to the modeled system, or deciding which aspects of the system can be assumed to be the same and which should vary. All these decisions will have effects on the predicted results from the model, which could limit the types of conclusions that can be made (both by the authors and by others in the community who may wish to use the model for their own work).

      We agree that this is a downside of the principle of biophysically detailed modeling that is best addressed by continuous refinement in collaboration with the community. We would like to once again invite any interested party to participate in this process.

      As an example, while it is interesting that broad brain geometry has effects on network structure (Figure 7), it is not clear how those effects are actually manifested. I am not sure if some of the effects could be due to the way the model is constructed - perhaps there may be limited sets of morphologies that fit into columns of particular thicknesses, and those morphologies may have certain idiosyncrasies that could produce different statistics of connectivities where they are heavily used. That may be true to biology, but it may also be somewhat artifactual if, for example, the only neurons in the library that fit into that particular part of the cortex differ from the typical neurons that are actually found in that region (but may not have been part of the morphological sampling).

      We agree that the limited pool of morphological reconstructions can lead to artifactual results in the way the reviewer pointed out. To investigate that hypothesis, we added a supplementary figure (S14) where we characterize (1): to what degree the morphological composition of a columnar subvolume reflects the overall composition of the model; and (2): The level of morphological diversity in each columnar subvolume. We discuss the results at the end of section 2.6. Briefly, while we cannot fully rule out the possibility of an artificial result, we found a high and virtually uniform level of morphological diversity in all columns and layers. This makes it unlikely that individual idiosyncratic morphologies strongly affect the local connectivity. However, we acknowledge that the minimum level of morphological diversity required is unknown. We believe that at this stage all we can do is characterize this and leave final interpretation to the reader.

      I also wonder how much the assumption that the layers have the same relative thicknesses everywhere in the cortex affects these findings, since layer thicknesses do in fact vary across the cortex.

      We agree that layer thickness variation would affect circuit properties. Variability of layer thickness can be split into two components: variability stemming from differences in total thickness, which our model covers, and variability of relative, i.e., normalized layer thickness, which we miss. In this region of cortex, though, data on the relative thickness of cortical layers is sparse. The Waxholm Atlas does not distinguish somatosensory cortical layers in its labels [Kleven et al, 2023]. Yusufoğulları (2015) compares layer thicknesses of rat hindlimb and barrel field regions. After normalization against total thickness, the relative difference increased towards the superficial layers from 0 in L6 to 33% in L1. Normalized thicknesses within developed rat barrel cortex, based on layer boundaries reported in Narayanan et al. (2017), vary by 2% to 5% over approximately 2 mm. One major effect of such variability would be to scale the number of neurons in a given layer locally by the corresponding factors. For comparison, the resulting variability in neuron counts due to differences in conicality (Fig. 7D1) was around ±25%. A further effect of variable relative layer thickness would be its impact on the selection of suitable morphologies to be placed in the volume.

      In summary, adjustment of layer thickness is a refinement which should be done in future versions of the model, once more data is available. The discussion section has been updated to acknowledge this limitation. However, as outlined at the beginning of this point-by-point reply, we will not conduct such updates to the model in the context of this manuscript, as it describes the version of the model used for a number of follow-up studies.

      In addition, the complexity of the model means that some complicated analyses and decisions are only presented in this manuscript with perhaps a single panel and not much textual explanation. I find, for example, that the panels of Figure S2 seem to abstract or simplify many details to the point where I am not clear about what they are actually illustrating - how does Figure S2D represent the results of "the process illustrated in B"? Why are there abrupt changes in connectivity at region borders (shown as discontinuous colors), when dendrites and axons span those borders and so would imply interconnectivity across the borders? What do the histograms in E1 and E2 portray, and how are they related to each other?

      We apologize for the confusion. We have updated the figure caption of Figure S2 to better explain its contents.

      Overall, the model presented in this study represents an enormous amount of work and stands as a unique resource for the community, but also is made somewhat unwieldy for the community to employ due to the weight of its manifold specific construction decisions, size, and complexity.

      Reviewer #2 (Public Review):

      Summary:

      The authors build a colossal anatomical model of juvenile rat non-barrel primary somatosensory cortex, including inputs from the thalamus. This enhances past models by incorporating information on the shape of the cortex and estimated densities of various types of excitatory and inhibitory neurons across layers. This is intended to enable an analysis of the micro- and mesoscopic organisation of cortical connectivity and to be a base anatomical model for large-scale simulations of physiology.

      Strengths:

      • The authors incorporate many diverse data sources on morphology and connectivity.

      • This paper takes on the challenging task of linking micro- and mesoscale connectivity.

      • By building in the shape of the cortex, the authors were able to link cortical geometry to connectivity. In particular, they make an unexpected prediction that cortical conicality affects the modularity of local connectivity, which should be testable.

      • The authors' analysis of the model led to the interesting prediction that layer 5 neurons connect local modules, which may be testable in the future, and provides a basis to link from detailed anatomy to functional computations.

      • The visualisation of the anatomy in various forms is excellent.

      • A subnetwork of the model is openly shared (but see question below).

      We thank the reviewer for their kind comments.

      Weaknesses:

      • Why was non-barrel S1 of the juvenile rat cortex selected as the target for this huge modelling effort? This is not explained.

      We have added an explanation of this decision to the third paragraph of the introduction.

      • There is no effort to determine how specific or generalisable the findings here are to other parts of the cortex. Although there is a link to physiological modelling in another paper, there is no clear pathway to go from this type of model to understand how the specific function of the modelled areas may emerge here (and not in other cortical areas).

      With respect to general versus specific findings, our philosophy is as follows: Despite the fact that most of our source data comes from juvenile rat somatosensory cortex, we also had to generalize many data sources across organisms, ages or regions. Hence, in this iteration we focused on investigating the general features of the (multi-region) mammalian cortex, e.g., high-order motifs connected by L5 neurons across subregions, or the effect of curvature on connectivity. In the future, more specific data sources can be used to build diverging versions of the model, e.g. one for adult vs. juvenile rat. They can then be used to contrast the ages and focus on more specific findings. We already defined a number of structural metrics that can be used to contrast more specific versions of the model quantitatively.

      We now clarify this pathway to understanding more specific function in the last paragraph of the discussion.

      • In a few places the manuscript could be improved by being more specific in the language, for example:

      - "our anatomy-based approach has been shown to be powerful", I would prefer instead to read about specific contributions of past papers to the field, and how this builds on them.

      - similarly: "ensuring that the total number of synapses in a region-to-region pathway matches biology." Biology here is a loose term and implies too much confidence in the matching to some ground truth. Please instead describe the source of the data, including the type of experiment.

      We have removed or rewritten the mentioned parts. We now clarify that we work based on biological estimates from experiments and cite the experiment sources. We also provide brief descriptions of the types of data and how they were derived.

      • Some of the decisions seem a little ad-hoc, and the means to assess those decisions are not always available to the reader e.g.

      - pg. 10. "Based on these results, we decided that the local connectome sufficed to model connectivity within a region.". What is the basis for this decision? Can it be formalised?

      - "In the remaining layers the results of the objective classification were used to validate the class assignments of individual pyramidal cells. We found the objective classification to match the expert classification closely (i.e., for 80-90% of the morphologies). Consequently, we considered the expert classification to be sufficiently accurate to build the model." The description of the validation is a little informal. How many experts were there? What are their initials? Was inter-rater or intra-rater reliability assessed? What are these numbers? The match with Kanari's classification accuracy should be reported exactly. There are clearly experts among the author list, but we are all fallible without good controls in place, and they should be more explicit about those controls here, in my opinion.

      - "Morphology selection was then performed as previously (Markram et al., 2015), that is, a morphology was selected randomly from the top 10% scorers for a given position." A lot of the decisions seem a little ad-hoc, without justification other than this group had previously done the same thing. For example, why 10% here? Shouldn't this be based on selecting from all of the reasonable morphologies?

      We have clarified that the density of local connectivity is verified against the validation datasets by comparing the diagonals in Figure 4B, in addition to the quantification of Figure 4C.

      For the classification, we have now published a detailed preprint describing the objective confirmation of expert classification by a variety of methods (see Kanari et al. 2024 https://www.biorxiv.org/content/10.1101/2024.09.13.612635v1). We cannot include the full methodology in the current paper, due to its large extent. For the benefit of the reader, we have included the appropriate citation and extended the short description of the methodology. As described in that preprint, the classification accuracy varies per layer, cell type, etc. We now describe these results in more detail; the full results can be found in our preprint.

      • I would like to know if one of the key results relating to modularity and cortical geometry can be further explored. In particular, there seem to be sharp changes in the data at the end of the modelled cortical regions, which need to be explored or explained further.

      We now explore these results further in supplementary figure S15, which we discuss in the results Section 2.6.

      • The shape of the juvenile cortex - a key novelty of this work - was based on merely a scalar reduction of the adult cortex. This is very surprising, and surely an oversimplification. Huge efforts have gone into modelling the complex nonlinear development of the cortex, by teams including the developing Human Connectome Project. For such a fundamental aspect of this work, why isn't it possible to reconstruct the shape of this relatively small part of the juvenile rat cortex?

      We agree that a more complex approach should be used in the future. However, as outlined at the beginning of this point-by-point reply, we will not conduct such updates to the model in the context of this manuscript, as it describes the version of the model used for a number of follow-up studies.

      • The same relative laminar depths are used for all subregions. This will have a large impact on the model. However, relative laminar depths can change drastically across the cortex (see e.g. many papers by Palomero-Gallagher, Zilles, and colleagues). The authors should incorporate the real laminar depths, or, failing that, show evidence to show that the laminar depth differences across the subregions included in the model are negligible.

      This point has also been raised by reviewer #1 above. For convenience, we repeat our reply below.

      We agree that layer thickness variation would affect circuit properties. Variability of layer thickness can be split into two components: variability stemming from differences in total thickness, which our model covers, and variability of relative, i.e., normalized layer thickness, which we miss. In this region of cortex, though, data on the relative thickness of cortical layers is sparse. The Waxholm Atlas does not distinguish somatosensory cortical layers in its labels [Kleven et al., 2023]. Yusufoğulları (2015) compares layer thicknesses of rat hindlimb and barrel field regions. After normalization against total thickness, the relative difference increased towards the superficial layers from 0 in L6 to 33% in L1. Normalized thicknesses within developed rat barrel cortex, based on layer boundaries reported in Narayanan et al. (2017), vary by 2% to 5% over approximately 2 mm. One major effect of such variability would be to scale the number of neurons in a given layer locally by the corresponding factors. For comparison, the resulting variability in neuron counts due to differences in conicality (Fig. 7D1) was around ±25%. A further effect of variable relative layer thickness would be its impact on the selection of suitable morphologies to be placed in the volume.

      In summary, adjustment of layer thickness is a refinement which should be done in future versions of the model, once more data is available. The discussion section has been updated to acknowledge this limitation. However, as outlined at the beginning of this point-by-point reply, we will not conduct such updates to the model in the context of this manuscript, as it describes the version of the model used for a number of follow-up studies.

      • The authors perform an affine mapping between mouse and rat cortex. This is again surprising. In human imaging, affine mappings are insufficient to map between two individual brains of the same species and nonlinear transformations are instead used. That an affine transformation should be considered sufficient to map between two different species is then very surprising. For some models, this may be fine, but there is a supposed emphasis here on biological precision in terms of anatomical location.

      We agree that this is a weakness that we will address in future revisions of the model.

      • One of the most interesting conclusions, that the connectivity pattern observed is in part due to cooperative synapse formation, is based on analyses that are unfortunately not shown.

      We originally decided not to show this part as we underestimated the interest in this particular result. We have now included the result in supplementary figure S10 and discuss the figure in the results.

      • Open code:

      - Why is only a subvolume available to the community?

      We have now made the entire model available under doi.org/10.7910/DVN/HISHXN. The Data and Code availability section has been updated to clarify this.

      - Live nature of the model. This is such a colossal model, and effort, that I worry that it may be quite difficult to update in light of new data. For example, how much person and computer time would it take to update the model to account for different layer sizes across subregions? Or to more precisely account for the shape of the juvenile rat cortex?

      To provide more information to people interested in participating in model refinements, we have added a new Figure 9. We discuss potential opportunities for refinement at the end of the discussion section.

      Reviewer #3 (Public Review):

      This manuscript reports a detailed model of the rat non-barrel somatosensory cortex, consisting of 4.2 million morphologically and biophysically detailed neuron models, arranged in space and connected according to highly sophisticated rules informed by diverse experimental data. Due to its breadth and sophistication, the model will undoubtedly be of interest to the community, and the reporting of anatomical details of modeling in this paper is important for understanding all the assumptions and procedures involved in constructing the model. While a useful contribution to this field, the model and the manuscript could be improved by employing data more directly and comparing simple features of the model's connectivity - in particular, connection probabilities - with relevant experimental data.

      The manuscript is well-written overall but contains a substantial number of confusing or unclear statements, and some important information is not provided.

      Below, major concerns are listed, followed by more specific but still important issues.

      Major issues

      (1) Cortical connectivity.

      Section 2.3, "Local, mid-range and extrinsic connectivity modeled separately", and Figure 4: I am confused about what is done here and why. The authors have target data for connectivity (Figure 4B1). But then they use an apposition-based algorithm that results in connectivity that is quite different from the data (Figure 4B2, C). They then use a correction based on the data (Figure 4E) to arrive at a more realistic connectivity. Why not set the connectivity based on the data right away then? That would seem like a more straightforward approach.

      We have completely re-written our description and discussion of connectivity in the model. We now more explicitly motivate our connectivity modeling choices in the first paragraph of section 2.3 of the results and in the second paragraph of the discussion.

      The same comment applies to Section 2.4., "Specificity of axonal targeting": the distributions of synapses on different types of target cell compartments were not well captured by the original model based on axon-dendrite overlap and pruning, so the authors introduced further pruning to match data specificity. While details of this process and what worked and what didn't may be interesting to some, overall it is not surprising, as it has been well known that cell types exhibit connectivity that is much more specific than "Peters rule" or its simple variations. The question is, since one has the data, why not use the data in the first place to set up the connectivity, instead of using the convoluted process of employing axon-dendrite overlap followed by multiple corrections?

      We would like to point out that we are not employing “Peters rule”; we now make this explicit in the first paragraph of section 2.3 of the results in the revision. Furthermore, we would argue that the match to the Motta et al. data indicates that our approach is more than just a “simple variation”. Finally, we believe that there is important insight in: 1. The specific ways in which the algorithm had to be changed to match the Schneider-Mizell data, e.g. that the connectivity of SST positive neurons did not have to be adapted at all. 2. That the specificity of the other two types could still be matched by a selection of a subset of axonal appositions (i.e., of potential synapses).

      Most importantly, what is missing from the whole paper is the characterization of connection probabilities, at least for the local circuit within one area. Such connection probabilities can be obtained from the data that the authors already use here, such as the MICRONS dataset. Another good source of such data is Campagnola et al., Science, 2022. Both datasets are for mouse V1, but they provide a comprehensive characterization across all cortical layers, thus offering a good benchmark for comparison of the model with the data. It would be important for the authors to show how connection probabilities realized in their model for different cell types compared to these data.

      We now report connection probabilities in the reworked figure 4 and compare them to reported connection probabilities from many different sources and labs in supplementary figure S8. We prefer a comparison to a wide range of sources to relying on a single report.
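
      For readers unfamiliar with the measure, the following is a minimal sketch of how a pathway-specific connection probability could be computed from a connectome, here taken as the fraction of ordered (pre, post) neuron pairs between two groups that are connected. The adjacency matrix, the group labels, and the function name are hypothetical and for illustration only. Experimental estimates are typically sampled within a limited inter-soma distance, which this sketch ignores.

```python
def connection_probability(adjacency, pre_ids, post_ids):
    """Fraction of ordered (pre, post) pairs, with pre != post, that are connected.

    adjacency[i][j] is truthy if neuron i forms at least one synapse onto neuron j.
    Returns NaN if there are no valid pairs.
    """
    pairs = 0
    connected = 0
    for i in pre_ids:
        for j in post_ids:
            if i == j:
                continue  # autapses are excluded from the pair count
            pairs += 1
            if adjacency[i][j]:
                connected += 1
    return connected / pairs if pairs else float("nan")


# Hypothetical toy connectome with four neurons (illustration only).
adjacency = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
# Hypothetical grouping of neuron indices by cell type.
groups = {"PC": [0, 1], "IN": [2, 3]}

# Connection probability for every ordered pair of groups.
probabilities = {
    (pre, post): connection_probability(adjacency, groups[pre], groups[post])
    for pre in groups
    for post in groups
}

for (pre, post), p in probabilities.items():
    print(f"{pre} -> {post}: {p:.2f}")
```

      Computing the measure per ordered pair of groups keeps the two directions of a pathway separate, which matters because connectivity between cell types is generally not symmetric.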

      (2) Section 2.5, "Structure of thalamic inputs" and Figure 6.

      The text in section 2.5 should provide more details on what was done - namely, that the thalamic axons were generated based on the axon density profiles and then synapses were established based on their overlap with cortical dendrites. Figure S10 where the target axon densities from data and the model axon densities are compared is not even mentioned here. Now, Figure S10 only shows that the axon densities were generated in a way that matches the data reasonably well. However, how can we know that it results in connectivity that agrees with data? Are there data sources that can be used for that purpose? For example, the authors show that in their model "the peaks of the mean number of thalamic inputs per neuron occur at lower depths than the peaks of the synaptic density". Is this prediction of the model consistent with any available data?

      Most importantly, the authors should show how the different cell types in their model are targeted by the thalamic inputs in each layer. Experimental studies have been done suggesting specificity in targeting of interneuron types by thalamic axons, such as PV cells being targeted strongly whereas SST and VIP cells being targeted less.

      We have updated the Results section to provide context for the thalamic axon placement, and referred the reader to the methods for more detail. A reference to Figure S10 has now been added to this section as well.

      As for validations of the structure of the thalamo-cortical inputs: We found that the existing literature on the topic, such as Cruikshank et al., 2007, 2010 and more recently Sermet et al., 2019, is predominantly on the physiological strengths of the pathways. We acknowledge that the authors provide compelling arguments that their findings are likely partially due to differences in the anatomical innervation strengths. On the other hand, Sporns, 2013 cautioned against mixing up structural and functional connectivity. Overall, we believe that it is simply cleaner to perform this validation in the accompanying manuscript (“Part II: Physiology and Experimentation”), using the full physiological model. Note that we have actually performed that validation in that manuscript (see preprint under the following doi: 10.1101/2023.05.17.541168, Figure 3H1).

      Note that a higher physiological strength onto PV+ neurons is observed.

      (3) "We have therefore made not only the model but also most of our tool chain openly available to the public (Figure 1; step 7)."

      In fact it is not the whole model that is made publicly available, but only about 5% of it (211,000 out of 4,200,000 neurons). Also, why is "most" of the tool chain made openly available, and not the whole tool chain?

      We have now made the entire model available under doi.org/10.7910/DVN/HISHXN. This has also been added to the Key resource table.

      With regard to the tool chain, everything is on our public GitHub (https://github.com/BlueBrain/) except for the algorithm for detecting axonal appositions. For that tool there are currently unresolved potential copyright issues with former collaboration partners. We are working to resolve them.

      Other issues

      "At each soma location, a reconstruction of the corresponding m-type was chosen based on the size and shape of its dendritic and axonal trees (Figure S6). Additionally, it was rotated to according to the orientation towards the cortical surface at that point."

      After this procedure, were cells additionally rotated around the white matter-pia axis? If yes, then how much and randomly or not? If not, then why not? Such rotations would seem important because otherwise additional order potentially not present in the real cortex is introduced in the model affecting connectivity and possibly also in vivo physiology (such as the dynamics of the extracellular electric field).

      They are indeed additionally randomly rotated. We have clarified this in the revision.

      The term "new in vivo reconstructions" for the 58 neurons used in this paper in addition to "in vitro reconstructions" is a misnomer. It is not straightforward to see where the procedure is described, but then one finds that the part of Methods that describes experimental manipulations is mostly about that (so, a clearer pointer to that part of Methods could be useful). However, the description in Methods makes it clear that it is only labeling that is done in vivo; the microscopy and reconstruction are done subsequently in vitro. I would recommend changing the terminology here, as it is confusing. Also, can the authors show reconstructions of these neurons in the supplementary figures? Is the reconstruction shown in Figure 4A representative?

      The term is used because the staining is done in vivo. To the best of our knowledge, the reconstruction process cannot be performed in vivo. However, to avoid any confusion, we modified the text to use the term “in vivo stained”, which makes this distinction clear.

      With respect to the reconstruction in Figure 4: The intent of the panel is to demonstrate the concept of targeted long-range axons that our morphologies are missing, necessitating the use of a second algorithm for longer-range connectivity. As such, it is not one of the reconstructions we used, but one from Janelia MouseLight. While we mentioned MouseLight in the figure caption, we formulated it in a way that could be misunderstood to mean that we merely used the MouseLight browser to render one of our morphologies. We apologize for the confusion, and we have fixed the figure caption.

      In this revision we have added exemplars of representative morphology reconstructions (in slice stained and in vivo stained) in a new supplementary figure, as requested (Figure S5). It is referenced in the last paragraph of section 2.1.

      In the Discussion, "This was taken into account during the modeling of the anatomical composition, e.g. by using three-dimensional, layer-specific neuron density profiles that match biological measurements, and by ensuring the biologically correct orientation of model neurons with respect to the orientation towards the cortical surface. As local connectivity was derived from axo-dendritic appositions in the anatomical model, it was strongly affected by these aspects.

      However, this approach alone was insufficient at the large spatial scale of the model, as it was limited to connections at distances below 1000μm."

      As mentioned above, it is not clear that this approach was sufficient for local connectivity either. It would be great if the authors showed a systematic comparison of local connection probabilities between different cell types in their model with experimental data and commented here in the Discussion about how well the model agrees with the data.

      As mentioned in the reply to a previous comment, we now report connection probabilities.

      In the Discussion: "The combined connectome therefore captures important correlations at that level, such as slender-tufted layer 5 PCs sending strong non-local cortico-cortical connections, but thick-tufted layer 5 PCs not." (Also the corresponding findings in Results.)

      If I understand this statement correctly, it may not agree with biological data. See analysis from MICRONS dataset in Bodor et al., https://www.biorxiv.org/content/10.1101/2023.10.18.562531v1.

      Our statement was indeed misleading and formulated too strongly. While thick-tufted pyramidal cells do form long-range intra-cortical connections, the structural strength of these pathways is weaker than for slender-tufted PCs, which are associated with the IT (intra-telencephalic) projection type. We have made this clear in the revision.

      Table 2 is confusing. What do pluses and minuses mean? What does it mean that some entries have two pluses? This table is not mentioned anywhere else in the text. If pluses mean some meaningful predictions of the model, then their distribution in the table seems quite liberal and arbitrary. It is not clear to me that the model makes that many predictions, especially for type-specificity and plasticity. Also, why is the hippocampus mentioned in this table? I don't see anything about the hippocampus anywhere else in the paper.

      We have clarified the description of the table in its caption and removed references to hippocampus, which were left from an earlier draft of the paper.

      In the Discussion, "Thus, we made the tools to improve our model also openly available (see Data and Code availability section)."

      As mentioned before, the authors themselves write that they made "most of our tool chain openly available to the public", but not all of it.

      With regard to the tool chain, everything is on our public GitHub (https://github.com/BlueBrain/) except for the algorithm for detecting axonal appositions. For that tool there are currently unresolved potential copyright issues with former collaboration partners. We are working to resolve them.

      Table S2 has multiple question marks. It is not clear whether the "predictions" listed in that table are truly well-thought-out and/or whether experimental confirmations are real.

      Some of the citations in that table were broken due to technical difficulties with the citation manager used. We apologize and have fixed this in the revision.

      Introduction: It would be quite appropriate to cite here Einevoll et al., Neuron, 2019 ("The Scientific Case for Brain Simulations").

      We now reference this important work.

      Recommendations for the authors:

      Reviewing Editor's note:

      Consultation with the reviewers highlighted three main issues: the integration of connection probability profiles, non-uniform cortical thickness, and the overall organization of the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Apart from the points discussed in the public review, my main concern is that the manuscript itself is not as tightly constructed as it should be, to the detriment of the reader's ability to understand the model itself and the conclusions from the presented analyses.

      There are places where the text references seemingly incorrect figure panels or refers to panels that don't exist:

      - Section 2.2, first paragraph - refers to Figure 2D, E but those panels do not exist in Figure 2.

      - Section 2.2, second paragraph - refers to Figure 3D3 - perhaps it should be 3B3?

      - Section 2.8, first paragraph - has no figure references but seems like it should be referring to parts of Figure 8 (perhaps Figure 8B1 specifically?)

      - Is the reference to Figure S11A on page 16 supposed to be to S12A?

      In other places, figure labels and descriptions are not clear, and terminology is not always well-defined or explained.

      - Figure 8 and the associated section 2.8 are very difficult to draw conclusions from as presented - several of the terms used are opaque and not clearly defined in the text or legends. I could not easily infer how the normalization works for the "normalized node participation per layer", or what "position in simplex" means for "unique neurons in core", and what their "relative counts" are relative to.

      - Are "targets" in Figure S12A the same as "sinks"? If so, it would be better to use a single term consistently throughout.

      - Figure S12 - figures in part B do not have enough labels to interpret - what is the y-axis of the "rich-club analysis" graph? Also, the figures in part B bottom are labeled "long-range" rather than "mid-range" connections.

      In general, I found the use of both letters and numbers for figure panels (e.g. Figure 7E1) more confusing than helpful - it didn't seem like panels with the same letter were visually grouped consistently, and it sometimes made it more difficult to follow the flow of a figure. I would recommend using only letters in nearly every case here.

      We thank the reviewer for directing our attention to these issues. We have fixed them in the revision. However, we have decided to keep our original panel numbering scheme. Panels with the same letter are meant to be conceptually grouped as they address related or similar measures.

      Other minor points:

      - Section 2.4 - paragraph 2 - sentence 5 "inhbititory" -> "inhibitory".

      - Figure 5B figure legend - references Schneider-Mizell et al. 2023 but probably should be Motta et al. 2019?

      - Figure 5C - figure key "expcected" -> "expected".

      - The lower part of Figure 7C looks like it belongs to panel D2 instead of panel C due to relative spacing.

      We once again thank the reviewer, and we have fixed the listed issues.

      Reviewer #2 (Recommendations For The Authors):

      (1) Abstract:

      - Is it really 'integrating whole brain-scale data'? This seems a bit misleading.

      - "We delineated the limits of determining connectivity from anatomy" - here I think you mean determining connectivity from morphology, or dendrite/axon appositions. Electron microscopy is still anatomy and presumably would be much closer to function.

      We originally used the term “anatomy” as connectivity depends on the correct placement of neurons in addition to their morphology. However, as the reviewer points out, this term is misleading as it would encompass electron microscopy, which can go beyond what we do with the model. We have updated the text to read “morphology and placement”.

      (2) Introduction:

      "Investigating the multi-scale interactions that shape perception requires a model of multiple cortical subregions with inter-region connectivity, but it also requires the subcellular resolution provided by a morphologically detailed model." - This statement, as written, is not true in my opinion. You can argue for the value of morphologically-detailed neuron models to the study of perception, but they are not required for the investigation of perception.

      We have updated the text to be clearer: subcellular resolution is only required for certain aspects that are related to perception.

      (3) Results:

      - Pg. 9/10. There are three sentences in a row that are of the style: "ensuring that the total number of synapses in a region-to-region pathway matches biology." Biology here is a loose term and implies too much confidence in the matching to some ground truth. Please instead describe the source of the data, including the type of experiment here already.

      - Pg. 10. On the first read, I found it quite hard to follow what exactly was done in Figure 4. What are the target values adapted from Reimann et al., 2019, for example?

      - Pg. 10. "Based on these results, we decided that the local connectome sufficed to model connectivity within a region.". What is the basis for this decision? Can it be formalised?

      - Pg. 16, Figure 7 B-C. The apparent effect of geometry on modularity is potentially very interesting. However, are the sharp drop-offs in values for modularity (but also conicality and height) true, or are some artefacts due to columns at the edges of the sampled area?

      We have discussed these points above in the general comments and strengths and weaknesses.

      - Pg. 18. Simplicial cores define central subnetworks, tied together by mid-range connections. This work, in particular leading to the conclusion of the layer 5 highway hubs, stands out as being a successful attempt to simplify the highly detailed model to a degree that it generates useable new understanding.

      We thank the reviewer for the kind comment.

      (4) Figures:

      - Figure 2: The caption doesn't seem to match the Figure (e.g. there are no brain regions depicted in A).

      - Figure 4f. This is a key panel, but is squished into a small corner of Figure 4, and therefore hard-to-read.

      We have fixed this in the revision.

      Reviewer #3 (Recommendations For The Authors):

      In Major comments, point (1) discusses the issue of connectivity known from data. For all the aspects of connectivity mentioned there, I would recommend the authors re-build their model using the connectivity data directly. It would be interesting to test whether a model constructed in such a way would have any difference in simulated neural activity relative to the model they have constructed.

      This is indeed a very interesting avenue of research. However, we believe that it is best conducted in separate manuscripts. First, in Pokorny et al., 2024 (https://doi.org/10.1101/2024.05.24.593860) we conduct this investigation, comparing the emerging activity in the model to the one for simpler connectivity models. Additionally, in Egas-Santander et al., 2024 (https://www.biorxiv.org/content/10.1101/2024.03.15.585196v3) we found that simpler connectomes lead to less reliable spiking activity globally. Finally, in the accompanying manuscript (https://www.biorxiv.org/content/10.1101/2023.05.17.541168v5) we compare activity with and without the targeting specificity of Schneider-Mizell et al.

      In Major comments, point (2) discusses thalamic inputs. I would recommend the authors to address the issues mentioned there.

      We have replied to those comments above.

      In addition, panels F and G of Figure 6 are mentioned in the caption but are not shown in the figure. In panel B, the choice of visualization is strange. It would make sense to show box plots for all the data instead of bars for mean values and points for randomly selected 50 cells. Panels E1 and E2 lack units.

      We have removed mentions of panels F and G and changed the style of plot. Units for E1 and E2 are now explained in the figure caption.

      In Major comments, point (3) touches upon model and tool sharing. I would recommend making such statements more accurate and reflecting what exactly is provided to the community since not everything is shared.

      We have now made the entire model available under doi.org/10.7910/DVN/HISHXN.

      With regard to the tool chain, everything is on our public github (https://github.com/BlueBrain/) except for the algorithm for detecting axonal appositions. For that tool there are currently unresolved potential copyright issues with former collaboration partners. We are working to resolve them.

      I would recommend the authors address all the other points mentioned in the public review as well. In addition, below are some smaller issues that should be fixed.

      Figure 2: the caption appears to be partially wrong and partially misassigned to the figure panels.

      We fixed the issue.

      Also, note that in L6 the types L6_TPC:A and L6_TPC:C are listed in the figure, but L6_TPC:B is not mentioned.

      There is indeed no TPC:B type in layer 6. The distinction between TPC:A and TPC:B is based on early or late bifurcations of the apical dendrite and is only observed in layer 5.

      Figure 3, panel B2: the caption refers to colors in panel (C), but the authors probably meant to refer to panel (A).

      We fixed the issue.

      "The placement of morphological reconstructions matched expectation, showing an appropriately layered structure with only small parts of neurites leaving the modeled volume (Figure 2D, E)."

      Figure 2 does not have panels D and E.

      "The volume was clearly dominated by dendrites, filling between 23% and 47% of the space, compared to 2% to 11% for axons (Figure 3D3)." There is no panel D or D3 in Figure 3.

      "Recently, the MICrONS dataset (MICrONS-Consortium et al., 2021) has been analyzed with respect to the axonal targeting of inhibitory subtypes in a 100 x 100 μm subvolume spanning all layers (Schneider-Mizell et al., 2023)."

      100 x 100 μm is an area (and should be 100 x 100 μm^2), not a volume.

      Figure S11B requires a legend for the color map.

      We fixed the issues.

      Table S1: What is the difference between L6_BP and L6_BPC? They both are referred to as L6 bipolar cells.

      We have changed the description of L6_BPC to “Layer 6 bitufted pyramidal cell”.

    1. Author response:

      Reviewer #1:

      We sincerely thank you for your thoughtful review and constructive comments on our work and we appreciate your positive assessment of our study’s innovative design, which allows for improved observation of 3D cell spheroids from an additional lateral view. Your comments underscore the importance of our approach in advancing methods for investigating cell behaviors in tumor organoid studies.

      In response to your suggestions, we will first add a detailed image of the ‘First surface mirror’ in Fig. 1 to provide a reference for readers and other researchers, thereby facilitating broader use of this method in similar observations. Regarding the suitable sample sizes for this device, as the spheroid sizes are relatively small compared to the mirror and culture dish, we have been able to image samples up to 5 mm in height, which provides ample capacity for most spheroids under 1 mm. We will include additional experiments and explanations in the manuscript to clarify this further.

      Concerning the ring-shaped seeding pattern of spheroids, we have conducted extensive culture experiments to optimize this method. The agarose microwells-based method has proven to be highly tolerant of variations. Within these microwells, cells have a propensity to self-aggregate, leading to the formation of spheroid structures. We will add a discussion in the revised manuscript to address this issue.

      Lastly, this device can accommodate the fluorescence imaging of 3D spheroid samples. We will supplement the discussion with a schematic illustrating the principles of fluorescence imaging using this device, providing a foundation for future work in this area. We will also make language improvements to enhance the overall quality of the manuscript.

      Thank you once again for your valuable insights, which have greatly contributed to the strengthening of our manuscript.

      Reviewer #2:

      We sincerely thank you for your detailed and supportive review of our manuscript. Your recognition of our system’s capabilities for in situ observation of 3D structures along multiple axes, as well as its potential applications in studying therapeutic effects, is highly encouraging. Your comments on the advantages of this system for analyzing cell migration, morphological changes, and responses to therapeutic agents are especially appreciated.

      Thank you again for your thoughtful feedback and for highlighting the contributions of our work. Your insights have been invaluable in refining the focus and clarity of our study, and we hope that our revisions meet your expectations.

    1. Author response:

      Public reviews:

      Reviewer #1:

      Epigenetic regulation complex (PRC2) is essential for neural crest specification, and its misregulation has been shown to cause severe craniofacial defects. This study shows that Eed, a core PRC2 component, is critical for craniofacial osteoblast differentiation and mesenchymal proliferation after neural crest induction. Using mouse genetics and single-cell RNA sequencing, the researcher found that conditional knockout of Eed leads to significant craniofacial hypoplasia, impaired osteogenesis, and reduced proliferation of mesenchymal cells in post-migratory neural crest populations.

      Overall, the study is superficial and descriptive. No in-depth mechanism was analyzed and the phenotype analysis is not comprehensive.

      We thank the reviewer for sharing their expertise and for taking the time to provide a helpful suggestion to improve our study. We are gratified that the striking phenotypes we report from Eed loss in post-migratory neural crest craniofacial tissues were appreciated. The breadth and depth of our phenotyping techniques, including skeletal staining, micro-CT, echocardiogram, immunofluorescence, histology, and unbiased single-cell gene expression analysis, provide comprehensive data in support of our conclusion that PRC2 is required for craniofacial osteoblast differentiation. We hypothesize that epigenetic regulation of chromatin accessibility downstream of PRC2 activity is the molecular mechanism that underlies these phenotypes. To test this hypothesis in our revision, we are using CUT&Tag to profile H3K27me3 epigenetic modifications genome-wide and at the loci encoding the differentially expressed genes revealed by our single-cell transcriptomics in developing craniofacial structures. We anticipate that these experiments will reveal an epigenetic mechanism underlying the phenotypes we report from Eed loss in post-migratory neural crest craniofacial tissues.

      Reviewer #2:

      Summary: The role of PRC2 in post-neural crest induction was not well understood. This work developed an elegant mouse genetic system to conditionally deplete EED upon SOX10 activation. Substantial developmental defects were identified for craniofacial and bone development. The authors also performed extensive single-cell RNA sequencing to analyze differentiation gene expression changes upon conditional EED disruption.

      Strengths:

      (1) Elegant genetic system to ablate EED post neural crest induction.

      (2) Single-cell RNA-seq analysis is extremely suitable for studying the cell type-specific gene expression changes in developmental systems.

      We thank the reviewer for their generous and helpful comments on our study. We are pleased that our mouse genetic and single-cell RNA sequencing approaches were appropriate in pairing the craniofacial phenotypes we report with distinct gene expression changes in post-migratory neural crest tissues upon Eed deletion.

      Weaknesses:

      (1) Although this study is well designed and contains state-of-the-art single-cell RNA-seq analysis, it lacks the mechanistic depth in the EED/PRC2-mediated epigenetic repression. This is largely because no epigenomic data was shown.

      Thank you for this suggestion. As described in response to Reviewer #1, we will include H3K27me3 CUT&Tag data in craniofacial tissue harvested from E12.5 and E16.5 Sox10-Cretg+ Eedfl/fl and Sox10-Cretg+ Eedfl/wt embryos in our revision. Our analyses will include genome-wide and targeted metaplot visualizations across genotypes and developmental timepoints and will assess how H3K27me3 occupancy relates to gene expression changes in our single-cell RNA sequencing data.

      (2) The mouse model of conditional loss of EZH2 in neural crest has been previously reported, as the authors pointed out in the discussion. What is novel in this study to disrupt EED? Perhaps a more detailed comparison of the two mouse models would be beneficial.

      We acknowledge the study the reviewer has indicated (Schwarz et al. Development 2014). This elegant investigation uses Wnt1-Cre to delete Ezh2 and found a similar phenotype to ours in the form of catastrophic craniofacial hypoplasia. We sought to add depth to the study of PRC2’s vital role in neural crest development by ablating Eed, which has a unique function in the PRC2 complex by binding to H3K27me3 and allosterically activating Ezh2. In this sense, we sought to test if phenotypes arising from deletion of Eed, the PRC2 “reader”, differ from phenotypes arising from deletion of Ezh2, the PRC2 “writer”, in neural crest derived tissues. Due to limitations associated with the Wnt1-Cre transgene (Lewis et al. Developmental Biology 2013), we used the Sox10-Cre allele which targets the migratory neural crest and is completely recombined by E10.5, instead of Wnt1-Cre which targets pre-migratory neural crest cells. A more detailed comparison of these mouse models will be included in the Discussion section of our revised manuscript, and we thank the reviewer for this thoughtful suggestion.

      (3) The presentation of the single-cell RNA-seq data may need improvement. The complexity of the many cell types blurs the importance of which cell types are affected the most by EED disruption.

      We agree with the reviewer’s critique of the scRNA-seq data presentation. Because Sox10+ cells were not sorted (via FACS, for example) from craniofacial tissues before single-cell RNA sequencing, we identified a breadth of cell types in UMAP space unrelated to epigenetic disruption of neural crest derived tissues. We will include subcluster visualization plots in the figures of our revised manuscript to highlight specific changes in clusters, such as osteoblasts and mesenchymal stem cells, that arise from Eed loss in post-migratory neural crest craniofacial tissues.

      (4) While it's easy to identify PRC2/EED target genes using published epigenomic data, it would be nice to tease out the direct versus indirect effects in the gene expression changes (e.g Figure 4e).

      We agree with the reviewer that our single-cell RNA sequencing data do not provide insight into direct versus indirect changes in gene expression downstream of PRC2. We hope that the aforementioned CUT&Tag experiment will provide the necessary mechanistic insight into H3K27me3 occupancy and direct effects on gene expression resulting from PRC2 inactivation in our mouse model.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      One of the roadblocks in PfEMP1 research has been the challenges in manipulating var genes to incorporate markers to allow the transport of this protein to be tracked and to investigate the interactions taking place within the infected erythrocyte. In addition, the ability of Plasmodium falciparum to switch to different PfEMP1 variants during in vitro culture has complicated studies due to parasite populations drifting from the original (manipulated) var gene expression. Cronshagen et al have provided a useful system with which they demonstrate the ability to integrate a selectable drug marker into several different var genes that allows the PfEMP1 variant expression to be 'fixed'. This on its own represents a useful addition to the molecular toolbox and the range of var genes that have been modified suggests that the system will have broad application. As well as incorporating a selectable marker, the authors have also used selective linked integration (SLI) to introduce markers to track the transport of PfEMP1, investigate the route of transport, and probe interactions with PfEMP1 proteins in the infected host cell.

      What I particularly like about this paper is that the authors have not only put together what appears to be a largely robust system for further functional studies, but they have used it to produce a range of interesting findings including:

      - Co-activation of rif and var genes when in a head-to-head orientation.

      - The reduced control of expression of var genes in the 3D7-MEED parasite line.

      - More support for the PTEX transport route for PfEMP1.

      - Identification of new proteins involved in PfEMP1 interactions in the infected erythrocyte, including some required for cytoadherence.

      In most cases the experimental evidence is straightforward, and the data support the conclusions strongly. The authors have been very careful in the depth of their investigation, and where unexpected results have been obtained, they have looked carefully at why these have occurred.

      (1) In terms of incorporating a drug marker to drive mono-variant expression, the authors show that they can manipulate a range of var genes in two parasite lines (3D7 and IT4), producing around 90% expression of the targeted PfEMP1. Removal of drug selection produces the expected 'drift' in variant types being expressed. The exceptions to this are the 3D7-MEED line, which looks to be an interesting starting point to understand why this variant appears to have impaired mutually exclusive var gene expression and the EPCR-binding IT4var19 line. This latter finding was unexpected and the modified construct required several rounds of panning to produce parasites expressing the targeted PfEMP1 and bind to EPCR. The authors identified a PTP3 deficiency as the cause of the lack of PfEMP1 expression, which is an interesting finding in itself but potentially worrying for future studies. What was not clear was whether the selected IT4var19 line retained specific PfEMP1 expression once receptor panning was removed.

      This is a very interesting point. We do not have systematic long-term data for the Var19 line but medium-term data. After panning the Var19 line, the binding assays were done within 3 months without additional panning. The first binding assay was 2 months after the panning and the last binding assays three weeks later. While there is inherent variation in these assays that precludes detection of smaller changes, the last assay showed the highest level of binding, giving no indication for rapid loss of the binding phenotype. Hence, we can say that the binding phenotype appears to be stable for many weeks without panning the cells again and there was no indication for a rapid loss of binding in these parasites.

      Systematic long-term experiments to assess how long the Var19 parasites retain binding would be interesting, but given that the binding phenotype appears to remain stable over many weeks, this would only make sense if done for a much longer time (6 months or more). Due to the time needed to carry out such an experiment, it would not be practical to include it in the present study. But this might be advisable if the Var19 line is used in future experiments that run over extended periods of time. We intend to include a statement in the discussion of the revised manuscript to highlight that if long-term work with this line is planned, monitoring the binding phenotype and potentially re-panning might be advisable.

      (2) The transport studies using the mDHFR constructs were quite complicated to understand but were explained very clearly in the text with good logical reasoning.

      We are aware of this being a complex issue and are glad this was nevertheless understandable.

      (3) By introducing a second SLI system, the authors have been able to alter other genes thought to be involved in PfEMP1 biology, particularly transport. An example of this is the inactivation of PTP1, which causes a loss of binding to CD36 and ICAM-1. It would have been helpful to have more insight into the interpretation of the IFAs as the anti-SBP1 staining in Figure 5D (PTP-TGD) looks similar to that shown in Figure 1C, which has PTP intact. The anti-EXP2 results are clearly different.

      We realize the description of the PTP1-TGD IFA data and that of the other TGDs was rather cursory. We intend to amend this in the revision.

      (4) It is good to see the validation of PfEMP1 expression includes binding to several relevant receptors. The data presented use CHO-GFP as a negative control, which is relevant, but it would have been good to also see the use of receptor mAbs to indicate specific adhesion patterns. The CHO system is fine for expression validation studies, but due to the high levels of receptor expression on these cells, moving to the use of microvascular endothelial cells would be advisable. This may explain the unexpected ICAM-1 binding seen with the panned IT4var19 line.

      We agree with the reviewer that it is desirable to have better binding systems for studying individual binding interactions. As the main purpose of this paper was to introduce the system and show binding, we did not move to more complicated binding systems. However, we would like to point out that the CSA binding was done on receptor alone in addition to the CSA-expressing HBEC-5i cells and was competed successfully with soluble CSA. In addition, apart from the additional ICAM1-binding of the Var19 line, all binding phenotypes conformed to expectations. We therefore hope the tools used for binding studies are acceptable at this stage of introducing the system, while future work interested in specific PfEMP1 receptor interactions is advised to use better systems, ideally also including endothelial organoid models, inhibitory antibodies and possibly domain competition. We intend to add a sentence to the discussion highlighting that future work using this system to study individual receptor-interactions could benefit from using optimized binding systems.

      (5) The proxiome work is very interesting and has identified new leads for proteins interacting with PfEMP1, as well as suggesting that KAHRP is not one of these. The reduced expression seen with BirA* in position 3 is a little concerning but there appears to be sufficient expression to allow interactions to be identified with this construct. The quantitative impact of reduced expression for proxiome experiments will clearly require further work to define it.

      This is a valid point. Clearly there seems to be some impact on binding when BirA* is placed in the extracellular domain (either through reduced presentation or direct reduction of binding efficiency of the modified PfEMP1). The exact impact on the proxiome is indeed difficult to assess. However, we hope that the general coverage of proteins proximal to PfEMP1 with the 3 PfEMP1-BirA* constructs will aid in the identification of proteins involved in PfEMP1 transport and surface display as illustrated with two of the hits targeted here.

      (6) The reduced receptor binding results from the TryThrA and EMPIC3 knockouts were very interesting, particularly as both still display PfEMP1 on the surface of the infected erythrocyte. While care needs to be taken in cross-referencing adhesion work in P. berghei and whether the machinery truly is functionally orthologous, it is a fair point to make in the discussion. The suggestion that interacting proteins may influence the "correct presentation of PfEMP1" is intriguing and I look forward to further work on this.

      We hope future work will be able to shed light on this.

      Overall, the authors have produced a useful and reasonably robust system to support functional studies on PfEMP1, which may provide a platform for future studies manipulating the domain content in the exon 1 portion of var genes. They have used this system to produce a range of interesting findings and to support its use by the research community.

      Finally, a small concern. Being able to select specific var gene switches using drug markers could provide some useful starting points to understand how switching happens in P. falciparum. However, our trypanosome colleagues might remind us that forcing switches may show us some mechanisms but perhaps not all.

      Point noted! From non-systematic data with the Var01 line that has been cultured for extended periods of time (several years), it seems other non-targeted vars remain silent in our SLI “activation” lines, but how much SLI-based var-expression “fixing” tampers with the integrity of natural switching mechanisms is indeed very difficult to gauge at this stage. We intend to add a statement to the manuscript that even if mutually exclusive expression is maintained, it is not certain the mechanisms controlling var expression all remain intact.

      Reviewer #2 (Public review):

      Summary

      Cronshagen et al develop a range of tools based on selection-linked integration (SLI) to study PfEMP1 function in P. falciparum. PfEMP1 is encoded by a family of ~60 var genes subject to mutually exclusive expression. Switching expression between different family members can modify the binding properties of the infected erythrocyte while avoiding the adaptive immune response. Although critical to parasite survival and malaria disease pathology, PfEMP1 proteins are difficult to study owing to their large size and variable expression between parasites within the same population. The SLI approach previously developed by this group for genetic modification of P. falciparum is employed here to selectively and stably activate the expression of target var genes at the population level. Using this strategy, the binding properties of specific PfEMP1 variants were measured for several distinct var genes with a novel semi-automated pipeline to increase throughput and reduce bias. Activation of similar var genes in both the common lab strain 3D7 and the cytoadhesion competent FCR3/IT4 strain revealed higher binding for several PfEMP1 IT4 variants with distinct receptors, indicating this strain provides a superior background for studying PfEMP1 binding. SLI also enables modifications to target var gene products to study PfEMP1 trafficking and identify interacting partners by proximity-labeling proteomics, revealing two novel exported proteins required for cytoadherence. Overall, the data demonstrate a range of SLI-based approaches for studying PfEMP1 that will be broadly useful for understanding the basis for cytoadhesion and parasite virulence.

      Comments

      (1) While the capability of SLI to actively select var gene expression was initially reported by Omelianczyk et al., the present study greatly expands the utility of this approach. Several distinct var genes are activated in two different P. falciparum strains and shown to modify the binding properties of infected RBCs to distinct endothelial receptors; development of SLI2 enables multiple SLI modifications in the same parasite line; SLI is used to modify target var genes to study PfEMP1 trafficking and determine PfEMP1 interactomes with BioID. Curiously, Omelianczyk et al activated a single var (Pf3D7_0421300) and observed elevated expression of an adjacent var arranged in a head-to-tail manner, possibly resulting from local chromatin modifications enabling expression of the neighboring gene. In contrast, the present study observed activation of neighboring genes with head-to-head but not head-to-tail arrangement, which may be the result of shared promoter regions. The reason for these differing results is unclear although it should be noted that the two studies examined different var loci.

      The point that we are looking at different loci is very valid and we realize this is not mentioned in the discussion. In the revision we intend to add this as a possible reason for this discrepancy. As stated in the discussion, the head-to-head scenario was observed before in lines obtained with panning. However, given the rather few examples where this was analyzed, it is quite possible that this varies with gene locus, and we will make sure that the revised version of the manuscript highlights that it is not clear how far this observation in our work can be generalized.

      (2) The IT4var19 panned line that became binding-competent showed increased expression of both paralogs of ptp3 (as well as a phista and gbp), suggesting that overexpression of PTP3 may improve PfEMP1 display and binding. Interestingly, IT4 appears to be the only known P. falciparum strain (only available in PlasmoDB) that encodes more than one ptp3 gene (PfIT_140083100 and PfIT_140084700). PfIT_140084700 is almost identical to the 3D7 PTP3 (except for a ~120 residue insertion in 3D7 beginning at residue 400). In contrast, while the C-terminal region of PfIT_140083100 shows near-perfect conservation with 3D7 PTP3 beginning at residue 450, the N-terminal regions between the PEXEL and residue 450 are quite different. This may indicate the generally stronger receptor binding observed in IT4 relative to 3D7 results from increased PTP3 activity due to multiple isoforms or that specialized trafficking machinery exists for some PfEMP1 proteins.

      We thank the reviewer for pointing this out, it is an interesting idea that the PTP3 duplication could be a reason for the superior binding of IT4. We intend to add this point to the discussion of the revision.

      So far it seems the PTP3 issue occurred only with Var19. The thought of an extra layer of control, particularly for PfEMP1 variants that might be associated with virulence such as Var19, is very attractive. At present, the manuscript alludes to the possibility of an extra layer of control in the discussion. As var-type specificity and existence of such mechanisms in vivo are so far not known we decided not to speculate on this.

      Reviewer #3 (Public review):

      Summary:

      The submission from Cronshagen and colleagues describes the application of a previously described method (selection linked integration) to the systematic study of PfEMP1 trafficking in the human malaria parasite Plasmodium falciparum. PfEMP1 is the primary virulence factor and surface antigen of infected red blood cells and is therefore a major focus of research into malaria pathogenesis. Since the discovery of the var gene family that encodes PfEMP1 in the late 1990s, there have been multiple hypotheses for how the protein is trafficked to the infected cell surface, crossing multiple membranes along the way. One difficulty in studying this process is the large size of the var gene family and the propensity of the parasites to switch which var gene is expressed, thus preventing straightforward gene modification-based strategies for tagging the expressed PfEMP1. Here the authors solve this problem by forcing the expression of a targeted var gene by fusing the PfEMP1 coding region with a drug-selectable marker separated by a skip peptide. This enabled them to generate relatively homogenous populations of parasites all expressing tagged (or otherwise modified) forms of PfEMP1 suitable for study. They then applied this method to study various aspects of PfEMP1 trafficking.

      Strengths:

      The study is very thorough, and the data are well presented. The authors used SLI to target multiple var genes, thus demonstrating the robustness of their strategy. They then perform experiments to investigate possible trafficking through PTEX, they knock out proteins thought to be involved in PfEMP1 trafficking and observe defects in cytoadherence, and they perform proximity labeling to further identify proteins potentially involved in PfEMP1 export. These are independent and complementary approaches that together tell a very compelling story.

      Weaknesses:

      (1) When the authors targeted IT4var19, they were successful in transcriptionally activating the gene, however, they did not initially obtain cytoadherent parasites. To observe binding to ICAM-1 and EPCR, they had to perform selection using panning. This is an interesting observation and potentially provides insights into PfEMP1 surface display, folding, etc. However, it also raises questions about other instances in which cytoadherence was not observed. Would panning of these other lines have been successfully selected for cytoadherent infected cells? Did the authors attempt panning of their 3D7 lines? Given that these parasites do export PfEMP1 to the infected cell surface (Figure 1D), it is possible that panning would similarly rescue binding. Likewise, the authors knocked out PTP1, TryThrA, and EMPIC3 and detected a loss of cytoadhesion, but they did not attempt panning to see if this could rescue binding. To ensure that the lack of cytoadhesion in these cases is not serendipitous (as it was when they activated IT4var19), they should demonstrate that panning cannot rescue binding.

      These are very important points. Indeed, we had repeatedly attempted to pan 3D7 when we failed to get the SLI-generated 3D7 PfEMP1 expressor lines to bind, but this had not been successful. After the move to IT4 which readily bound we made no further efforts to understand why 3D7 does not bind but the fact that PfEMP1 is on the surface indicates this is not a PTP3 issue. Also, as the parent 3D7 could not be panned, we assumed it is not easily fixed.

      Panning the TGD lines: we see the reasoning for conducting panning experiments with the TGD lines, but on second thought we are unsure this should be attempted. The outcome might not be easily interpretable if panning leads to increased binding, and considerable follow-up analyses would be needed to define what has happened. The reason for this is that at least two forces will contribute to the selection in panning experiments with TGD lines that lost binding. Firstly, panning would work against the SLI of the TGD, resulting in a tug of war between the TGD-SLI and binding: a very low frequency of parasites can be expected to loop out the TGD plasmid and would normally be eliminated during standard culturing due to the SLI drug used for the TGD. These revertant cells would bind and the panning would enrich them (hence, panning and SLI are opposed in the case of a TGD abolishing binding). It is unclear how strong such an effect can be, but this might lead to mixed populations that complicate interpretations. The second selecting force is possible compensatory changes to restore binding. These can come in two flavors: reversal of potential independent changes that may have occurred in the TGD parasites and that are in reality causing the binding loss (the concern of the reviewer) or new changes to compensate the loss of the TGD target (in case the TGD is the cause of the binding loss). As both of the TGDs in the paper show some residual binding and have VAR01 on the surface to at least some extent, it is possible that new compensatory changes might indeed occur that indirectly increase binding again. In summary, even if more binding after panning of the lines occurs, it is not clear whether this is due to a compensatory change ameliorating the TGD or reversal of an unrelated change. The impact of repeated panning against SLI is also unknown. To determine the cause, the panned TGD lines would need to be subjected to a complex and time-consuming analysis (WGS, RNASeq, possibly Maurer’s clefts IFA phenotype) to find out whether they had an unrelated chance change that was reverted or a new compensatory change that helps binding.

      The detection of VAR01 on the surface of these TGDs speaks against a PTP3 effect. While we can’t fully exclude other changes in the TGDs that might affect binding, we conducted WGS which did not show any obvious alterations that could be responsible. To fully exclude loss of ptp3 expression as the reason as seen with Var19 (something we would not have seen in the WGS if it is only due to a transcriptional change), we intend to carry out RNASeq with the two TGD lines. The third TGD mentioned by the reviewer (targeting ptp1) was a positive control of a known PfEMP1 trafficking protein, so we assume this does not need to be further validated.

      (2) The authors perform a series of trafficking experiments to help discern whether PfEMP1 is trafficked through PTEX. While the results were not entirely definitive, they make a strong case for PTEX in PfEMP1 export. The authors then used BioID to obtain a proxiome for PfEMP1 and identified proteins they suggest are involved in PfEMP1 trafficking. However, it seemed that components of PTEX were missing from the list of interacting proteins. Is this surprising and does this observation shed any additional light on the possibility of PfEMP1 trafficking through PTEX? This warrants a comment or discussion.

      This is an interesting comment and we agree we should have discussed this. A likely reason why PTEX components are not picked up as interactors is that BirA* is expected to become unfolded when it passes through the channel and in that state can’t biotinylate. Labelling would likely only be possible if PfEMP1 lingered at the PTEX translocation step before BirA* became unfolded to go through the channel, which we would not expect under physiological conditions. We intend to add a sentence to the discussion explaining why we think PTEX components would not be detected in our BioIDs even if PfEMP1 passes through PTEX, while noting that their absence might also be an argument against PfEMP1 passing through PTEX.

    1. Author response:

      Reviewer #1 (Public review):

      The results of this manuscript look at the interplay between pleiotropy, standing genetic variation, and parallelism (i.e. predictability of evolution) in gene expression. Ultimately, their results suggest that (a) pleiotropic genes typically have a smaller range in variation/expression, and (b) adaptation to similar environments tends to favor changes in pleiotropic genes, which leads to parallelism in mechanisms (though not dramatically). However, it is still uncertain how much parallelism is directly due to pleiotropy, instead of a complex interplay between them and ancestral variation.

      I have a few things that I was uncertain about. It may be these things are easily answered but require more discussion or clarity in the manuscript.

      (1) The variation being talked about in this manuscript is expression levels, and not SNPs within coding regions (or elsewhere). The cause of any specific gene having a change in expression can obviously be varied - transcription factors, repressors, promoter region variation, etc. Is this taken into account within the "network connectivity" measurement? I understand the network connectivity is a proxy for pleiotropy - what I'm asking is, conceptually, what can be said about how/why those highly pleiotropic genes have a change (or not) in expression. This might be a question for another project/paper, but it feels like a next step worth mentioning somewhere.

      In the current study, we are able to detect significant and repeatable expression changes but are unable to identify the underlying causal variants. An eQTL study in the founder population in combination with genomic resequencing for both evolved and ancestral populations would be required to address this question.

      (2) The authors do have a passing statement in line 361 about cis-regulatory regions. Is the assumption that genetic variation in promoter regions is the ultimate "mechanism" driving any change in expression? In the same vein, the authors bring up a potential confounding factor, though they dismiss it based on a specific citation (lines 476-481; citation 65). I'm of the mindset that in order to more confidently disregard this "issue" based on previous evidence, it requires more than one citation. Especially since the one citation is a plant. That specific point jumps out to me as needing a more careful rebuttal.

      It was not our intention to claim that the expression changes in our experiment are caused by cis-regulatory variation only. We believe that the observed expression variation has both cis- and trans-genetic components, whereas some studies tend to estimate much higher cis-variation for gene expression in Drosophila populations (e.g. [1, 2]). We mentioned the positive correlation between cis-regulatory polymorphism and expression variation to (1) highlight the genetic control of gene expression and (2) make the connection between polygenic adaptation and gene expression evolutionary parallelism.

      (3) I feel like there isn't enough exploration of tissue specificity versus network connectivity. Tissue specificity was best explained by a model in which pleiotropy had both direct and indirect effects on parallelism; while network connectivity was best explained (by a small margin) via the model which was mostly pleiotropy having a direct effect on ancestral variation, that then had a direct effect on parallelism. When the strengths of either direct/indirect effects were quantified, tissue specificity showed a stronger direct effect, while network connectivity had none (i.e. not significant). My confusion is with the last point - if network connectivity is explained by a direct effect in the best-supported model, how does this work, since the direct effect isn't significant? Perhaps I am misunderstanding something.

      To clarify, for network connectivity, there is a significant “indirect” effect on parallelism (i.e. network connectivity affects ancestral gene expression variation, which in turn affects parallelism). Hence, in Table 2, the direct effect of network connectivity on parallelism is weak and not significant, while the indirect effect via ancestral variation is significant.
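
      The distinction between a direct and an indirect effect can be illustrated with a small sketch. This is not the causal analysis used in the manuscript: it is a generic product-of-coefficients decomposition on synthetic data, and the variable names, effect sizes, and sample size are all assumptions chosen so that the whole effect runs through the mediator.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def direct_and_indirect(x, m, y):
    """Split the effect of x on y into a direct path and a path through mediator m."""
    sxx, smm, sxm = covariance(x, x), covariance(m, m), covariance(x, m)
    sxy, smy = covariance(x, y), covariance(m, y)
    a = sxm / sxx                             # slope of m on x
    det = sxx * smm - sxm ** 2
    direct = (sxy * smm - smy * sxm) / det    # slope of y on x, holding m fixed
    b = (smy * sxx - sxy * sxm) / det         # slope of y on m, holding x fixed
    return {"direct": direct, "indirect": a * b}


# Synthetic data (hypothetical, not from the study): pleiotropy lowers ancestral
# variation, and ancestral variation lowers parallelism. There is no direct path.
rng = random.Random(0)
n = 5000
pleiotropy = [rng.gauss(0, 1) for _ in range(n)]
ancestral_variation = [-0.5 * p + rng.gauss(0, 1) for p in pleiotropy]
parallelism = [-0.4 * v + rng.gauss(0, 1) for v in ancestral_variation]

effects = direct_and_indirect(pleiotropy, ancestral_variation, parallelism)
print(effects)
```

      In data generated this way the direct estimate should be close to zero while the indirect estimate is clearly positive, which is the pattern described above for network connectivity.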

      Also, network connectivity might favor the most pleiotropic genes being transcription factor hubs (or master regulators for various homeostasis pathways); while the tissue specificity metric perhaps is a kind of a space/time element. I get that a gene having expression across multiple tissues does fit the definition of pleiotropy in the broad sense, but I'm wondering if some important details are getting lost - I'm just thinking about the relative importance of what tissue specificity measurements say versus the network connectivity measurement.

      We examined the statistical relationship between the two measures and found a moderate positive correlation, on the basis of which we argued that the two measures may capture different aspects of pleiotropy. We appreciate the reviewer’s suggestions about the biological basis of the two estimates of pleiotropy, but we think that, without further experimental insights, an extended discussion of this topic would be premature and unlikely to provide meaningful insights to the readership.

      Reviewer #2 (Public review):

      Summary:

      Lai and collaborators use a previously published RNAseq dataset derived from an experimental evolution set up to compare the pleiotropic properties of genes whose expression evolved in response to fluctuating temperature for over 100 generations. The authors correlate gene pleiotropy with the degree of parallelisms in the experimental evolution set up to ask: are genes that evolved in multiple replicates more or less pleiotropic?

      They find that, maybe counter to expectation, highly pleiotropic genes show more replicated evolution. Such an effect seems to be driven by direct effects (which the authors can only speculate on) and indirect effects through low variance in pleiotropic genes (which the authors indirectly link to genetic variation underlying gene expression variance).

      Weaknesses:

      The results offer new insights into the evolution of gene expression and into the parameters that constrain such evolution, i.e., pleiotropy. Although the conclusions are supported by the data, I find the interpretation of the results a little bit complicated.

      Major comment:

      The major point I ask the authors to address is whether the connection between polygenic adaptation and parallelism can indeed be used to interpret gene expression parallelism. If the answer is no, please rephrase the introduction and discussion; if the answer is yes, please make it explicit in the text why it is so.

      Our answer is yes: we interpreted gene expression parallelism (high ancestral variance -> less parallelism) using the same framework that links polygenic adaptation and parallelism (high polygenicity = less trait parallelism). We believe that our response covers several of the reviewer’s concerns.

      The authors' argument: parallelism in gene expression is the same as parallelism in SNP allele frequency (AFC) (see L389-383 here they don't mention that this explanation is derived from SNP parallelism and not trait parallelism, and see Figure 1 b). In previous publications, the authors have explained the low level of AFC parallelism using a polygenic argument. Polygenic traits can reach a new trait optimum via multiple SNPs and therefore although the trait is parallel across replicates, the SNPs are not necessarily so.

      Importantly, our rationale is based on the idea that gene expression is rarely the direct target of selection, but rather an intermediate trait [3]. Recently, we have specifically tested this assumption for gene expression and metabolite concentrations, and our analysis showed that both traits are redundant [4], as previously shown for DNA sequences [5]. The important implication for this manuscript is that gene expression is also redundant, so that adaptation can be achieved by distinct changes in gene expression in replicate populations adapting to the same selection pressure. This implies that we can use the same simulation framework for gene expression as for sequencing data. In our case, different SNP frequencies correspond to different expression levels (averaged across individuals from a population), which in turn increase fitness by modifying the selected trait. Importantly, the selected trait in our simulations is not gene expression, but an undefined high-level phenotype. A key insight from our simulations is that with increasing polygenicity the expression of a gene is more variable in the ancestral population.

      In the current paper, they seem to be exchanging SNP AFC by gene expression, and to me, those are two levels that cannot be interchanged. Gene expression is a trait, not an SNP, and therefore the fact that a gene expression doesn't replicate cannot be explained by a polygenic basis, because again the trait is gene expression itself. And, actually, the results of the simulations show that high polygenicity = less trait parallelism (Figure 4).

      As detailed above, because adaptation can be reached by changes in gene expression at different sets of genes, redundancy is also operating on the expression level not just on the level of SNPs. To clarify, the x-axis of Fig. 4 is the expression variation in the ancestral population.

      Now, if the authors focus on high parallel genes (present in e.g. 7 or more replicates) and they show that the eQTLs for those genes are many (highly polygenic) and the AFC of those eQTLs are not parallel, then I would agree with the interpretation. But, given that here they just assess gene expression and not eQTL AFC, I do not think they can use the 'highly polygenic = low parallelism' explanation.

      The interpretation of the results to me, should be limited to: genes with low variance and high pleiotropy tend to be more parallel, and the explanation might be synergistic pleiotropy.

      While we understand the desire to model the full hierarchy from eQTLs to gene expression and adaptive traits, we caution that this would be a very challenging task. eQTLs very often underestimate the contribution of trans-acting factors; hence the understanding of gene expression evolution based on eQTLs is very likely incomplete and cannot explain the redundancy of gene expression during adaptation. We therefore think that the focus on redundant gene expression is conceptually simpler and allows us to address the question of pleiotropy without the incorporation of allele frequency changes.

      Reviewer #3 (Public review):

      The authors aim to understand how gene pleiotropy affects parallel evolutionary changes among independent replicates of adaptation to a new hot environment of a set of experimental lines of Drosophila simulans using experimental evolution. The flies were RNA-sequenced after more than 100 generations of lab adaptation and the changes in average gene expression were obtained relative to ancestral expression levels from reconstructed ancestral lines. Parallelism of gene expression change among lines is evaluated as variance in differential gene expression among lines relative to error variance. Similarly, the authors ask how the standing variation in gene expression estimated from a handful of flies from a reconstructed outbred line affects parallelism. The main findings are that parallelism in gene expression responses is positively associated with pleiotropy and negatively associated with expression variation. Those results are in contradiction with theoretical predictions and empirical findings. To explain those seemingly contradictory results the authors invoke the role of synergistic pleiotropy and correlated selection, although they do not attempt to measure either.

      Strengths:

      (1) The study uses highly replicated outbred laboratory lines of Drosophila simulans evolved in the lab under a constant hot regime for over 100 generations. This allows for robust comparisons of evolutionary responses among lines.

      (2) The manuscript is well written and the hypotheses are clearly delineated at the onset.

      (3) The authors have run a causal analysis to understand the causal dependencies between pleiotropy and expression variation on parallelism.

      (4) The use of whole-body RNA extraction to study gene expression variation is well justified.

      Weaknesses:

      (1) It is unclear how well phenotypic variation in gene expression of the evolved lines has been estimated by the sample of 20 males from a reconstructed outbred line not directly linked to the evolved lines under study. I see this as a general weakness of the experimental design.

      Our intention was not to measure the phenotypic variance of the evolved lines, but rather to estimate the phenotypic variance at the beginning of the experiment. Hence, we measured and investigated the variation of gene expression in the ancestral population, since this was the starting point of the replicated experimental evolution. Furthermore, since the ancestral population represents the natural population in Florida, the gene expression variation reflects the history of selection acting on it.

      (2) There are no estimates of standing genetic variation of expression levels of the genes under study, only phenotypic variation. I wished the authors had been clear about that limitation and had discussed the consequences of the analysis. This also constitutes a weakness of the study.

      The reviewer is correct that we do not aim to estimate the standing genetic variation that is responsible for differences in gene expression. While we agree that it could be an interesting research question to use eQTL mapping to identify the genetic basis of gene expression, we caution that trans-effects are difficult to estimate, and therefore an important component of gene expression evolution would be hard to capture. Hence, we consider that our focus on variation in gene expression without explicit information about the genetic basis is simpler and sufficient to address the question about the role of pleiotropy.

      (3) Moreover, since the phenotype studied is gene expression, its genetic basis extends beyond expressed sequences. The phenotypic variation of a gene's expression may thus likely misrepresent the genetic variation available for its evolution. The genetic variation of gene expression phenotypes could be estimated from a cross or pedigree information but since individuals were pool-sequenced (by batches of 50 males), this type of analysis is not possible in this study.

      We agree with the reviewer that gene expression variation may also have a non-genetic basis; we discuss this in depth in the discussion of the manuscript.

      (4) The authors have not attempted to estimate synergistic pleiotropy among genes, nor how selection acts on gene expression modules. It makes any conclusion regarding the role of synergistic pleiotropy highly speculative.

      We mentioned synergistic pleiotropy as a possible explanation for our results. A positive correlation between the fitness effect of gene expression variation would predict more replicable evolutionary changes. A similar argument has been made by [6]. 

      I don't understand the reason why the analysis would be restricted to significantly differentially expressed genes only. It is then unclear whether pleiotropy, parallelism, and expression variation do play a role in adaptation because the two groups of adaptive and non-adaptive genes have not been compared. I recommend performing those comparisons to help us better understand how "adaptive" genes differentially contribute to adaptation relative to "non-adaptive" genes relative to their difference in population and genetic properties.

      We agree with the reviewer that the comparison between the pleiotropy of adaptive and non-adaptive genes is interesting. We performed the analysis but omitted it from the current manuscript for simplicity. Similar to the results in [6], non-adaptive genes are more pleiotropic than the adaptive genes. For adaptive genes we find a positive correlation between the level of pleiotropy and evolutionary parallelism. Thus, high pleiotropy limits the evolvability of a gene, but moderate and potentially synergistic pleiotropy increases the repeatability of adaptive evolution. We included this result in the revised manuscript and discuss it.

      There is a lack of theoretical groundings on the role of so-called synergistic pleiotropy for parallel genetic evolution. The Discussion does not address this particular prediction. It could be removed from the Introduction.

      We respectfully disagree with the reviewer: synergistic pleiotropy is covered by theory, and empirical results also support its importance.

      References

      (1) Genissel A, McIntyre LM, Wayne ML, Nuzhdin SV. Cis and trans regulatory effects contribute to natural variation in transcriptome of Drosophila melanogaster. Molecular biology and evolution. 2008;25(1):101-10. Epub 2007/11/12. doi: 10.1093/molbev/msm247. PubMed PMID: 17998255.

      (2) Osada N, Miyagi R, Takahashi A. Cis- and Trans-regulatory Effects on Gene Expression in a Natural Population of Drosophila melanogaster. Genetics. 2017;206(4):2139-48. Epub 2017/06/14. doi: 10.1534/genetics.117.201459. PubMed PMID: 28615283; PubMed Central PMCID: PMCPMC5560811.

      (3) Barghi N, Hermisson J, Schlötterer C. Polygenic adaptation: a unifying framework to understand positive selection. Nature reviews Genetics. 2020;21(12):769-81. Epub 2020/07/01. doi: 10.1038/s41576-020-0250-z. PubMed PMID: 32601318.

      (4) Lai WY, Otte KA, Schlötterer C. Evolution of Metabolome and Transcriptome Supports a Hierarchical Organization of Adaptive Traits. Genome biology and evolution. 2023;15(6). Epub 2023/05/26. doi: 10.1093/gbe/evad098. PubMed PMID: 37232360; PubMed Central PMCID: PMCPMC10246829.

      (5) Barghi N, Tobler R, Nolte V, Jaksic AM, Mallard F, Otte KA, et al. Genetic redundancy fuels polygenic adaptation in Drosophila. PLoS biology. 2019;17(2):e3000128. Epub 2019/02/05. doi: 10.1371/journal.pbio.3000128. PubMed PMID: 30716062.

      (6) Rennison DJ, Peichel CL. Pleiotropy facilitates parallel adaptation in sticklebacks. Molecular ecology. 2022;31(5):1476-86. Epub 2022/01/09. doi: 10.1111/mec.16335. PubMed PMID: 34997980; PubMed Central PMCID: PMCPMC9306781.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment 

      This valuable study is a detailed investigation of how chromatin structure influences replication origin function in yeast ribosomal DNA, with a focus on the role of the histone deacetylase Sir2 and the chromatin remodeler Fun30. Convincing evidence shows that Sir2 does not affect origin licensing but rather affects local transcription and nucleosome positioning which correlates with increased origin firing. Overall, the evidence is solid and the model plausible. However, the methods employed do not rigorously establish a key aspect of the mechanism where initiation precisely occurs or rigorously exclude alternative models and the effect of Sir2 on transcription is not re-examined in the fun30 context. 

      Clarification on Sir2 Effect on Transcription in the fun30 Context

      We appreciate the reviewers’ thorough assessment but would like to clarify that the effect of Sir2 on transcription in the fun30 context was addressed in both the original and revised manuscripts. However, we recognize that the presentation of the qPCR results may have been unclear, as we initially plotted absolute transcript levels without normalizing for rDNA array size differences among the genotypes. We have now corrected this.

      After normalizing for copy number variation, the qPCR data show that the sir2 fun30 double mutant has a ~40-fold increase in C-pro transcription relative to WT, compared to a 4-fold and 19-fold increase in the fun30 and sir2 single mutants, respectively (Figure 5, figure supplement 6). These results are discussed in the Results section of the manuscript, where we note that "C-pro RNA levels were approximately twice as high in sir2 fun30 compared to sir2 cells when adjusted for rDNA size differences." This observation is critical for addressing both alternative models of MCM disappearance and for pinpointing transcription initiation sites, as detailed in the following sections.
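
      The arithmetic behind this normalization can be sketched as follows. The function below shows one plausible way to express transcript levels per rDNA copy, assuming the adjustment is a simple division by relative array size; the function name and the example inputs are hypothetical, and only the three fold changes in `reported` are taken from the text above.

```python
def per_copy_fold_change(transcript_mutant, transcript_wt, copies_mutant, copies_wt):
    """Fold change in transcript level relative to WT, per rDNA copy.

    Assumption: normalization divides the raw fold change by the ratio of
    rDNA copy numbers (mutant over WT).
    """
    if min(transcript_wt, copies_mutant, copies_wt) <= 0:
        raise ValueError("WT transcript level and copy numbers must be positive")
    raw_fold = transcript_mutant / transcript_wt
    relative_array_size = copies_mutant / copies_wt
    return raw_fold / relative_array_size


# Hypothetical illustration: a 3-fold raw increase on an array half the WT size
# corresponds to a 6-fold increase per copy.
example = per_copy_fold_change(6.0, 2.0, 50, 100)

# Normalized fold changes relative to WT, as reported in the response.
reported = {"fun30": 4.0, "sir2": 19.0, "sir2 fun30": 40.0}

# Ratio of the double mutant to the sir2 single mutant.
double_vs_sir2 = reported["sir2 fun30"] / reported["sir2"]
print(example, round(double_vs_sir2, 2))
```

      The ratio of the reported 40-fold and 19-fold values is the basis for the "approximately twice as high" statement quoted above.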

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and as shown here, displacement by transcription and/or chromatin remodeling. Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results. Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model. 

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases. 

      Review of revised version and response letter: 

      In the response, the authors make some improvements by better quantifying 2D gels, adding some missing statistical analyses, analyzing the effect of fun30 on rDNA replication in strains with reduced rDNA copy number, and using ChIP-seq of MCMs to support the ChEC-seq data. However, these additions do not address the main issue that is at the heart of their model: where initiation precisely occurs and whether the location is altered in the mutant(s). Thus, mechanistic insight is limited.

      We discuss the issue regarding the initiation site below.

      Under the section "Addressing Alternative Explanations", the authors claim that processes like transcription and passive replication cannot affect the displaced complex specifically. Why? They are not on same DNA (as mentioned in the Fig 1 legend). 

      Premature origin activation, not transcription, drives the disappearance of repositioned MCM complexes in sir2 mutants in HU.

      Indeed, the reviewer is correct in suggesting that C-pro transcription confined to rDNA units with repositioned MCM complexes could selectively displace those complexes, potentially explaining the selective disappearance of displaced MCMs in sir2 cells. However, our analysis of C-pro transcription and MCM occupancy in G1 versus HU across the genotypes allows us to rule out this possibility.

      We show that the fraction of repositioned MCMs in G1 cells is proportional to the level of C-pro transcription (WT < fun30 << sir2 < sir2 fun30), consistent with the involvement of transcription in the repositioning process during MCM loading in G1. Accordingly, with approximately twice the transcription in sir2 fun30 compared to sir2, we observe more repositioned MCMs in sir2 fun30 cells than in sir2 cells in G1 (Fig 5C).

      However, if the disappearance of repositioned MCMs in HU were solely due to C-pro transcription rather than origin activation, we would expect the repositioned MCMs to disappear more quickly in sir2 fun30 cells. Contrary to this expectation, our data show that repositioned MCM complexes are more stable in sir2 fun30 mutants compared to sir2 mutants, indicating that transcription is not the primary factor in the disappearance of displaced MCM complexes in HU; rather, rDNA origin activation appears to be the key factor.

      Replication initiation site in sir2. Using multiple independent approaches, including 2D gels, ChIP-seq, and EdU incorporation, we have demonstrated that rDNA origins fire prematurely in sir2 mutants, a conclusion that the reviewer does not contest. Once an origin fires, the MCM signal disappears from the site of its initial deposition, as expected, and this is confirmed in our MCM ChIP and HU ChEC data, both at rDNA origins and across the genome.

      Given that the majority of MCM complexes in sir2 mutants are repositioned, it is expected that these repositioned complexes disappear following premature origin activation. With less than half of the licensed origins (or <30% of total rDNA copies) retaining MCM at non-repositioned sites in sir2 mutants, if only these non-repositioned complexes were firing, and the repositioned MCM complexes were disappearing via mechanisms other than replication initiation (e.g., transcription), rDNA replication in sir2 mutants would be severely compromised rather than accelerated. Given this, and the strong experimental evidence that repositioned MCM complexes fire prematurely, continued focus on alternative explanations for MCM complex disappearance seems unwarranted.

      We present this analysis in the results section as follows:

      “Finally, although deletion of FUN30 could suppress replication initiation at the rDNA either by inhibiting the firing of the active, repositioned MCM complex or by preventing MCM repositioning to the "active location" in the first place, our results suggest that suppression occurs through the former mechanism. Consistent with previous reports that fun30 mutants are deficient in transcriptional silencing (Neves-Costa et al. 2009), C-pro RNA levels were approximately twice as high in sir2 fun30 cells compared to sir2 cells when adjusted for rDNA size (Figure 5—figure supplement 6).

      Moreover, deletion of FUN30 shifts the distribution toward the repositioned MCM location over the non-repositioned one in G1 cells (Figure 5C), aligning with the increased C-pro transcription observed in fun30 mutants. This shift is evident in both sir2 and SIR2 cells. Despite the increased transcription-mediated repositioning in sir2 fun30 cells compared to sir2 cells during G1, repositioned MCM persists longer in sir2 fun30 cells than in sir2 cells after release into HU. Additionally, sir2 fun30 mutants exhibit reduced MCM accumulation at the RFB compared to sir2 mutants after release into HU, supporting the conclusion that MCM disappearance in HU reflects origin activation rather than transcription-mediated displacement.”

      The model in Fig 7 implies that initiation sites are different in WT versus the mutants and this determines their timing/efficiency. But they also suggest that the same site might be used with different efficiencies in this response. I agree that both are possibilities and are not resolved. 

      Adjustment of the model to account for repositioned MCMs in WT cells. In Figure 5—figure supplement 5, we demonstrate that even in WT cells, a small fraction of repositioned MCMs (~5%) can be detected, and that these repositioned MCM complexes disappear prematurely. However, because this represents a very small fraction of MCMs in WT cells, we initially did not include it in our overall model in Figure 7. In light of the reviewer's comment, we have now revised the model to incorporate this detail.

      Supporting their model requires better resolution to determine the actual replication initiation site. While this may be challenging, it should be feasible with methods to map nascent strands like DNAscent, or Okazaki fragment mapping.

      The initiation site in sir2 mutants has been thoroughly analyzed and supported by extensive experimental data, as discussed above. While high-resolution techniques such as DNAscent or Okazaki fragment mapping could potentially offer another layer of validation, the likelihood of obtaining finer detail that would change the conclusions is minimal. The methods we employed provide sufficient resolution to pinpoint the initiation site, and our results align consistently with established replication models.

      Further experimentation would not only be redundant but also unlikely to provide new insights beyond revalidation. Given the strength of our current data, we believe the conclusions regarding replication initiation are robust and well-supported, making additional experiments unnecessary at this stage. Our priority is to focus on advancing other aspects of the research that require deeper exploration.

      The 2D gel analysis of strains with reduced rDNA copy numbers adequately addresses the copy number variable with regard to the replication effect. 

      Overall, the paper is improved by providing additional data and improved analysis. The paper nicely characterizes the effect of Fun30. The model is reasonable but remains lacking in precise details of mechanism. 

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, the authors follow up on their previous work showing that in the absence of the Sir2 deacetylase the MCM replicative helicase at the rDNA spacer region is repositioned to a region of low nucleosome occupancy. Here they show that the repositioned displaced MCMs have increased firing propensity relative to non-displaced MCMs. In addition, they show that activation of the repositioned MCMs and low nucleosome occupancy in the adjacent region depend on the chromatin remodeling activity of Fun30. 

      Strengths: 

      The paper provides new information on the role of a conserved chromatin remodeling protein in regulation of origin firing and in addition provides evidence that not all loaded MCMs fire and that origin firing is regulated at a step downstream of MCM loading. 

      Weaknesses: 

      The relationship between the authors' results and prior work on the role of Sir2 (and Fob1) in regulation of rDNA recombination and copy number maintenance is not explored, making it difficult to place the results in a broader context. Sir2 has previously been shown to be recruited by Fob1, which is also required for DSB formation and recombination-mediated changes in rDNA copy number. Are the changes that the authors observe specifically in fun30 sir2 cells related to this pathway? Is Fob1 required for the reduced rDNA copy number in fun30 sir2 double mutant cells?

      Reviewer #3 (Public review): 

      Summary: 

      Heterochromatin is characterized by low transcription activity and late replication timing, both dependent on the NAD-dependent protein deacetylase Sir2, the founding member of the sirtuins. This manuscript addresses the mechanism by which Sir2 delays replication timing at the rDNA in budding yeast. Previous work from the same laboratory (Foss et al. PLoS Genetics 15, e1008138) showed that Sir2 represses transcription-dependent displacement of the Mcm helicase in the rDNA. In this manuscript, the authors show convincingly that the repositioned Mcms fire earlier and that this early firing partly depends on the ATPase activity of the nucleosome remodeler Fun30. Using read-depth analysis of sorted G1/S cells, fun30 was the only chromatin remodeler mutant that somewhat delayed replication timing in sir2 mutants, while nhp10, chd1, isw1, htl1, swr1, isw2, and irc5 had no effect. The conclusion was corroborated with orthogonal assays including two-dimensional gel electrophoresis and analysis of EdU incorporation at early origins. Using an insightful analysis with an Mcm-MNase fusion (Mcm-ChEC), the authors show that the repositioned Mcms in sir2 mutants fire earlier than the Mcm at the normal position in wild type. This early firing at the repositioned Mcms is partially suppressed by Fun30. In addition, the authors show Fun30 affects nucleosome occupancy at the sites of the repositioned Mcm, providing a plausible mechanism for the effect of Fun30 on Mcm firing at that position. However, the results from the MNase-seq and ChEC-seq assays are not fully congruent for the fun30 single mutant. Overall, the results support the conclusions, providing a much better mechanistic understanding of how Sir2 affects replication timing at rDNA.

      Strengths 

      (1) The data clearly show that the repositioned Mcm helicase fires earlier than the Mcm in the wild type position. 

      (2) The study identifies a specific role for Fun30 in replication timing and an effect on nucleosome occupancy around the newly positioned Mcm helicase in sir2 cells. 

      Weaknesses 

      (1) It is unclear which strains were used in each experiment. 

      (2) The relevance of the fun30 phospho-site mutant (S20AS28A) is unclear. 

      (3) For some experiments (Figs. 3, 4, 6) it is unclear whether the data are reproducible and the differences significant. Information about the number of independent experiments and quantitation is lacking. This affects the interpretation, as fun30 seems to affect the +3 nucleosome much more than let on in the description. 

      Recommendations for the authors:  

      Reviewer #2 (Recommendations for the authors)

      The authors have addressed my concerns by the addition of new experiments and analysis. 

      One point remains unclear regarding additional support for the Mcm-ChEC results using ChIP experiments to verify whether MCM redistributes in sir2D cells. In their rebuttal, the authors state that, "New supporting based evidence: ChIP at rDNA Origins. Our ChIP analysis also shows that the disappearance of the MCM signal at rDNA origins in sir2Δ cells released into HU is accompanied by signal accumulation at the replication fork barrier (RFB), indicative of stalled replication forks at this location (Figure 5 figure supplement 3)...." The ChIP data in Figure 5 supplement 3 show accumulation of the Mcm2 ChIP signal to the left of the RFB in sir2D cells but it doesn't look like there is any decrease in the MCM signal in sir2D relative to wild-type cells for the peak C-Pro. There is a new MCM peak suggesting perhaps a new MCM loading event. 

      Figure 5 figure supplement 3 shows the relative abundance of the MCM ChIP signal across the ~2 kb rDNA region, spanning from the MCM loading site at the rDNA origin (on the left) to the replication fork barrier (RFB) on the right. The MCM-ChIP data are normalized to the highest signal within this rDNA region rather than across the entire genome, meaning that only the relative abundance of MCM within this region is represented, and not comparisons between different conditions. We have now presented the results with the same axes for both alpha factor and HU.
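      The consequence of normalizing to the highest signal within the region can be illustrated with a minimal sketch. The signal values below are hypothetical and chosen only to show the effect; they are not taken from the ChIP data, and the function name is an illustrative assumption.

```python
import numpy as np

def normalize_to_region_max(signal):
    """Scale a signal profile so that its highest value within the region equals 1."""
    signal = np.asarray(signal, dtype=float)
    return signal / signal.max()

# Hypothetical profiles across the same region in two conditions.
# The second has ten-fold lower absolute signal but the same shape.
condition_a = [100, 20, 10, 5]
condition_b = [10, 2, 1, 0.5]

norm_a = normalize_to_region_max(condition_a)
norm_b = normalize_to_region_max(condition_b)

# After normalization the two profiles are indistinguishable, so only the
# distribution of signal within each profile remains interpretable.
profiles_match = np.allclose(norm_a, norm_b)
```

      In this toy case the normalized profiles are identical even though the absolute signals differ ten-fold, which is why such plots report relative abundance within the region rather than differences between conditions.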

      In wild-type (WT) cells, the MCM signal remains primarily at the initial loading site. However, in sir2 mutants, a significant portion of the MCM signal shifts rightward, consistent with rDNA origin activation and the movement of MCM along with the progressing replication fork. While some replication forks stall at the RFB, others are positioned between the MCM loading site and the RFB. The additional MCM peak observed does not represent a new MCM loading event, as the experiment was conducted during S-phase, when new MCM loading is not possible.

      Reviewer #3 (Recommendations for the authors): 

      In this revision the authors addressed my concerns and improved the manuscript and the presentation of the data. All my recommendations were implemented.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Public review):

      Summary:

      In their manuscript the authors report that fecal transplantation from young mice into old mice alleviates susceptibility to gout. The gut microbiota in young mice is found to inhibit activation of the NLRP3 inflammasome pathway and reduce uric acid levels in the blood in the gout model.

      Strengths:

      They focused on the butanoate metabolism pathway based on the results of metabolomics analysis after fecal transplantation and identified butyrate as the key factor in mitigating gout susceptibility. In general, this is a well-performed study.

      Weaknesses:

      The discussion on the current results and previous studies regarding the effect of butyrate on gout symptoms is insufficient. The authors need to provide a more thorough discussion of other possible mechanisms and relevant literature.

      Reviewer #2 (Recommendations for the authors):

      General comments:

      I appreciate the authors' efforts to answer the comments raised in my previous review (as Reviewer#2). However, I still detect some issues that need to be fully addressed, with inadequate or even no answers for several comments.

      Thank you for your valuable feedback. Your previous suggestions have been incredibly helpful for our paper. Although we have strived to make the article as comprehensive as possible, there may still be some areas that are not perfectly refined.

      The response to comment 1: The author's statement is not very convincing. What are the trends of inflammation factors? The data in Figure 1G-H suggest that butyrate may not be the only factor to explain this phenomenon. Authors should carefully interpret the data in Figure 1G-H.

      We apologize for not answering your question clearly. We used antibiotic treatment to establish the relationship between gut microbiota, age, and gout. Our findings indicate that serum uric acid levels tend to increase with age, and that the response to MSU stimulation becomes more pronounced with age. After the gut microbiota was cleared and the mice were then stimulated with MSU, the age-dependent trends in inflammatory factors and serum uric acid levels disappeared. Thus, we can preliminarily conclude that the gut microbiota is closely associated with age, gout, and hyperuricemia.

      The response to comment 2: I understand the importance of evaluating a range of indicators, but foot thickness is the most crucial clinical marker for diagnosing gout. Please move the data from Supplemental Figure 1A to the main figure.

      Thank you for your suggestions. We have included the most significant results in the main figure, and foot thickness is already described in the text of the manuscript. In view of the layout and arrangement of the figure panels, we have placed these data in Supplementary Figure 1.

      The response to comment 3: The immunostaining for ZO-1 and Occludin is unclear. Please provide higher magnification images to confirm the specific staining.

      Thank you for your valuable feedback. We have enhanced the clarity of the images. In addition to adding immunohistochemical images in Supplementary Material 4, we have also submitted independent images.

      The response to comment 4: The authors still haven't directly addressed my comment.

      Please accept our sincere apologies for not providing a clearer response to your question. The indicators related to uric acid-producing enzymes and uric acid transporters have been separately analyzed according to different age groups. The specific results are detailed in section "The expression of uric acid-producing enzymes activity and uric acid transporters at the mRNA level across different age groups" of Supplementary Material 4.

      No response was given for comment 5. Please address it.

      In a PCoA plot, the distance between samples reflects the similarity in the structure of the microbial communities: the closer the distance, the more similar the composition of the communities; the greater the distance, the more pronounced the differences. We judge based on the relative distances of each group in the plot, observing their degree of proximity.
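      How distances in a PCoA plot reflect community similarity can be sketched with a small example. This is only an illustration under stated assumptions: the count table is hypothetical, Bray-Curtis dissimilarity is assumed as the distance measure (the response does not state which one was used), and the function names are our own.

```python
import numpy as np

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return np.abs(a - b).sum() / (a + b).sum()

def pcoa(dist, n_axes=2):
    """Classical PCoA: double-centre the squared distances, then eigendecompose."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_axes]
    # Clip small negative eigenvalues that non-Euclidean distances can produce.
    scale = np.sqrt(np.clip(eigvals[order], 0, None))
    return eigvecs[:, order] * scale

# Hypothetical taxon counts: samples 0 and 1 resemble each other, as do 2 and 3.
counts = np.array([
    [40, 30, 20, 10],
    [38, 32, 18, 12],
    [5, 10, 35, 50],
    [8, 12, 30, 50],
], dtype=float)

n_samples = len(counts)
dist = np.array([[bray_curtis(counts[i], counts[k]) for k in range(n_samples)]
                 for i in range(n_samples)])
coords = pcoa(dist)

def gap(i, k):
    """Euclidean distance between two samples in the PCoA plane."""
    return np.linalg.norm(coords[i] - coords[k])
```

      In this toy table, samples with similar composition land close together in the ordination and dissimilar samples land far apart, which is the property used when judging group separation by eye.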

      The response to comment 6: I understand the author's statement, and I suggest incorporating it into the discussion section of the revised manuscript.

      Thank you for your suggestions. We have incorporated the relevant content into our discussion.

      The response to comment 7: Again, please incorporate this statement into the discussion section of the revised manuscript.

      Thank you for your suggestions. We have incorporated the relevant content into our discussion.

      Reviewer #3 (Public review):

      Summary:

      The revised manuscript presents interesting findings on the role of gut microbiota in gout, focusing on the interplay between age-related changes, inflammation, and microbiota-derived metabolites, particularly butyrate. The study provides valuable insights into the therapeutic potential of microbiota interventions and metabolites for managing hyperuricemia and gout. While the authors have addressed many of the previous concerns, a few areas still require clarification and improvements to strengthen the manuscript's clarity and overall impact.

      (1) While the authors mention that outliers in the data do not affect the conclusions, there remains a concern about the reliability of some figures (e.g., Figure 2D-G). It is recommended to provide a more detailed explanation of the statistical analysis used to handle outliers. Additionally, the clarity of the Western blot images, particularly IL-1β in Figure 3C, should be improved to ensure clear and supportive evidence for the conclusions.

      Thank you for your suggestion. We respond as follows: (1) Outliers can occasionally constitute intrinsic elements of the dataset, reflecting genuine occurrences within the experimental context. The elimination of such outliers has the potential to introduce bias into the results, thereby facilitating misconceptions regarding the underlying phenomenon under investigation. In order to maintain the transparency and integrity of the dataset, we have elected to retain the outliers within our analysis. This decision is based on the recognition that these values may represent genuine experimental observations or unique conditions that are inherently meaningful to the phenomenon under investigation. By preserving these data points, we aim to provide a comprehensive and unbiased representation of the experimental results, allowing for a more nuanced interpretation of the findings. (2) Due to the scarcity of samples, we are unable to fulfill your request in the short term. Furthermore, we have noted that the band for IL-1β in Figure 3C is indeed visible and we consider it suitable for subsequent analysis.

      (2) The manuscript raises a key question about why butyrate supplementation and FMT have different effects on uric acid metabolism and excretion. While the authors have addressed this by highlighting the involvement of multiple bacterial genera, it is still recommended to expand on the differences between these interventions in the discussion, providing more mechanistic insights based on available literature.

      Thank you for your suggestion. We have enriched the discussion in the manuscript and included additional comparisons.

      (3) It is noted that IL-6 and TNF-α results in foot tissue were requested and have been added to supplementary material. However, the main text should clearly reference these additions, and the supplementary figures should be thoroughly reviewed for consistency with the main findings. The use of abbreviations (e.g., ns for no significant difference) and labeling should also be carefully checked across all figures.

      Thank you for your valuable feedback. We have revised the manuscript in accordance with your suggestions.

      (4) The manuscript presents butyrate as a key molecule in gout therapy, yet there are lingering concerns about its central role, especially given that other short-chain fatty acids (e.g., acetic and propionic acids) also follow similar trends. The authors should consider further acknowledging these other SCFAs and discussing their potential contribution to gout management. Additionally, the rationale for focusing primarily on butyrate in subsequent research should be made clearer.

      Thank you for your input. We have incorporated additional evidence into the discussion, explaining why we ultimately chose butyrate in subsequent research.

      (5) The full-length uncropped Western blot images should be provided as requested, to ensure transparency and reproducibility of the data.

      Thank you for your suggestion. We have already included the relevant explanations in the manuscript.

      (6) Despite the authors' revisions, several references still lack page numbers. Please ensure that all references are properly formatted, including complete page ranges.

      Thank you for your suggestions; we will make more detailed revisions to the references.

      The manuscript has improved with the revisions made, particularly regarding clarifications on experimental design and the inclusion of supplementary data. However, some concerns about data quality, mechanistic insights, and clarity in the figures remain. Addressing these points will enhance the overall impact of the work and its potential contribution to the understanding of the gut microbiome in gout and hyperuricemia. A final revision, with careful attention to both major and minor points, is highly recommended before resubmission.

      Once again, we are grateful for your suggestions and recognition. Your input has been of immense help to our manuscript and has also provided us with a valuable learning opportunity.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      The aim of this valuable study is to identify novel genes involved in sleep regulation and memory consolidation. It combines transcriptomic approaches following memory induction with measurements of sleep and memory to discover molecular pathways underlying these interlinked behaviors. The authors explore transcriptional changes in specific mushroom body neurons and suggest roles for two genes involved in RNA processing, Polr1F and Regnase-1, in the regulation of sleep and memory. Although this work exploits convincing and validated methodology, the strength of the evidence is incomplete to support the main claim that these two genes establish a definitive link between sleep and memory consolidation.

      We appreciate the reconsideration of our manuscript and recognize that we should have toned down the claims, especially with respect to the link between sleep and memory consolidation.  We have now changed the title, the abstract and the main text and also Figure 5 to essentially just state our findings.  While there is a little speculation in the Discussion, we point out that future work would be required to draw conclusions. We believe the manuscript still represents a considerable advance in showing the modulation of RNA processing genes during sleep-dependent memory consolidation in the relevant neurons, and also showing how one such gene affects sleep and translation and a second affects sleep and memory. 

      Public Reviews:

      Reviewer #2 (Public review):

      Prior work by the Sehgal group has shown that a small group of neurons in the fly brain (anterior posterior (ap) α'β' mushroom body neurons (MBNs)) promote sleep and sleep-dependent appetitive memory specifically under fed conditions (Chouhan et al., (2021) Nature). Here, Li, Chouhan et al. combine cell-specific transcriptomics with measurements of sleep and memory to identify molecular processes underlying this phenomenon. They define transcriptional changes in ap α'β' MBNs and suggest a role for two genes downregulated following memory induction (Polr1F and Regnase-1) in regulating sleep and memory.

      The transcriptional analyses in this manuscript are impressive. The authors have now included additional experiments that define acute and developmental roles for Polr1F and Regnase-1 respectively in regulating sleep. They have also provided additional data to strengthen their conclusion that Polr1F knockdown in α'β' mushroom body neurons enhances sleep.

      The resubmitted work represents a convincing investigation of two novel sleep-regulatory proteins that may also play important roles in memory formation.

      The authors have comprehensively addressed my comments, which I very much appreciate. I congratulate them on this excellent work.

      We very much appreciate the reviewer’s positive feedback. Thank you!

      Reviewer #3 (Public review):

      Previous work (Chouhan et al., 2022) from the Sehgal group investigated the relationship between sleep and long-term memory formation by dissecting the role of mushroom body intrinsic neurons, extrinsic neurons, and output neurons during sleep-dependent and sleep-independent memory consolidation. In this manuscript, Li et al. profiled the transcriptome of the anterior-posterior (ap) α'/β' neurons and identified genes that are differentially expressed after training under fed conditions, which support sleep-dependent memory formation. By knocking down candidate genes systematically, the authors identified Polr1F and Regnase-1 as two important hits that play potential roles in sleep and memory formation. What the function of sleep is and how a memory is created are two long-standing questions in science. The present study used a new approach to identify novel components that may link sleep and memory consolidation in a specific type of neuron. Importantly, these components imply that RNA processing may play a role in these processes.

      While I am enthusiastic about the innovative approach employed to identify RNA processing genes involved in sleep regulation and memory consolidation, I feel that the data presented in the manuscript is insufficient to support the claim that these two genes establish a definitive link between sleep and memory consolidation. First, the developmental role of Regnase-1 in reducing sleep remains unclear because knocking down Regnase-1 using the GeneSwitch system produced neither acute nor chronic sleep loss phenotype. In the revised manuscript, the author used the Gal80ts to restrict the knockdown of Regnase-1 in adult animals and concluded that Regnase-1 RNAi appears to affect sleep through development. Conducting overexpression experiments of Regnase-1 would lend some credibility to the phenotypes, however, this is not pursued in the revised manuscript. Second, while constitutive Regnase-1 knockdown produced robust phenotypes for both sleep-dependent and sleep-independent memory, it also led to a severe short-term memory phenotype. This raises the possibility that flies with constitutive Regnase-1 knockdown are poor learners, thereby having little memory to consolidate. The defect in learning could be simply caused by chronic sleep loss before training. Thus, this set of results does not substantiate a strong link between sleep and long-term memory consolidation. Lastly, the discussion on the sequential function of training, sleep, and RNA processing on memory consolidation appears speculative based on the present data.

      We thank the reviewer for the enthusiasm about the approach. As noted above, we have now removed all claims about a link between sleep and memory, and instead just emphasize that we have identified RNA processing genes that affect sleep and memory.  We agree with the reviewer that the basis of the Regnase-1 memory phenotype is unclear as the flies may be poor learners.  Also, the learning/memory defects could be secondary to sleep loss or, as Reviewer 4 below suggests, all the behavioral deficits could be caused by impaired development/function of the relevant ap ɑ′/β′ cells. We have now included this possibility in the discussion of the manuscript.  And we have modified the discussion on training, RNA processing, sleep and memory to emphasize the need for future experiments to address the sequence and relationship of these different processes. 

      Reviewer #4 (Public review):

      Summary:

      Li and Chouhan et al. follow up on a previous publication describing the role of anterior-posterior (ap) and medial (m) ɑ′/β′ Kenyon cells in mediating sleep-dependent and sleep-independent memory consolidation, respectively, based on feeding state in Drosophila melanogaster. The authors sequenced bulk RNA of ap ɑ′/β′ Kenyon cells 1h after flies were either trained-fed, trained-starved or untrained-fed and find a small number of genes (59) differentially expressed (3 upregulated, 56 downregulated) between trained-fed and trained-starved conditions. Many of these genes encode proteins involved in the regulation of gene expression. The authors then screened these differentially expressed genes for sleep phenotypes by expressing RNAi hairpins constitutively in ap ɑ′/β′ Kenyon cells and measuring sleep patterns. Two hits were selected for further analysis: Polr1F, which promoted sleep, and Regnase-1, which reduced sleep. The pan-neuronal expression of Polr1F and Regnase-1 RNAi constructs was then temporally restricted to adult flies using the GeneSwitch system. Polr1F sleep phenotypes were still observed, while Regnase-1 sleep phenotypes were not, indicating developmental defects. Appetitive memory was then assessed in flies with constitutive knockdown of Polr1F and Regnase-1 in ap ɑ′/β′ Kenyon cells. Polr1F knockdown did not affect sleep-dependent or sleep-independent memory, while Regnase-1 knockdown disrupted sleep-dependent memory, sleep-independent memory, as well as learning. Polr1F knockdown increased pre-ribosomal RNA transcripts in the brain, as measured by qPCR, in line with its predicted role as part of the RNA polymerase I complex. A puromycin incorporation assay to fluorescently label newly synthesized proteins also indicated higher levels of bulk translation upon Polr1F knockdown. Regnase-1 knockdown did not lead to observable changes in measurements of bulk translation.

      Strengths:

      The proposed involvement of RNA processing genes in regulating sleep and memory processes is interesting, and relatively unexplored. The methods are satisfactory.

      Weaknesses:

      The main weakness of the paper is in the overinterpretation of their results, particularly relating to the proposed link between sleep and memory consolidation, as stated in the title. Constitutive Polr1F knockdown in ap ɑ′/β′ Kenyon cells had no effect on appetitive long-term memory, while constitutive Regnase-1 knockdown affected both learning and memory. Since the effects of constitutive Regnase-1 knockdown on sleep could be attributed to developmental defects, it is quite plausible that these same developmental defects are what drive the observed learning and memory phenotypes. In this case, an alternative explanation of the authors' findings is that constitutive Regnase-1 knockdown disrupts the entire functioning of ap ɑ′/β′ Kenyon cells, and as a consequence behaviors involving these neurons (i.e. learning, memory and sleep) are disrupted. It will be important to provide further evidence of the function of RNA processing genes in memory in order to substantiate the memory link proposed by the authors.

      As noted above, we have removed claims of a link between sleep and memory and instead focused the manuscript on our findings of RNA processing genes modulated during sleep-dependent memory. We concur that impaired development of ap ɑ′/β′neurons could account for the sleep and memory phenotype observed and have included this possibility in the manuscript.

      Recommendations for the authors:

      Reviewer #4 (Recommendations for the authors):

      The title of the paper should be reconsidered to reflect the results. The evidence for a link between RNA processing genes and memory is weak.

      We have changed the title.

      Line 328. The term "central dogma" is misused. The central dogma refers to the unidirectional flow of information from DNA to protein. Instead the authors mean "gene expression".

      Changed, thank you.

      A couple of minor comments relating to the figures:

      Figure 1b. It is not clear what the number 10570 in the bottom right corner refers to.

      Fixed.

      Figure 3b. RU- and RU+ annotation is missing (as shown in 3d).

      Fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review)

      (1) The identification of the proximal to distal degeneration of the tailgut within the human tail is difficult to distinguish with the current images present in Figure 3. A picture within a picture of the area containing the tail gut could be provided to prominently demonstrate the cellular architecture. Additionally, quantification of the localization of apoptosis would strongly support this observation, as well as provide a visualization of the tail's regression overall. For example, a graph plotting the number of apoptotic cells versus the rostral to caudal locations of the transverse sections while accounting for the CS stage of each analyzed embryo could be created; this could even be further broken down by region of tail, for example, tailgut, ventral ectodermal ridge, somite, etc.

      To provide more information on apoptosis, we prepared serial sections from an additional 6 human tails, 5 of which were processed for fluorescence anti-caspase 3 immunohistochemistry with DAPI staining (Fig 4) and H&E (Fig 6). This confirmed our previous finding of apoptosis especially in the tailgut and ventral mesoderm. We have not quantified the apoptosis, given the difficulty of deciding whether anti-caspase signals represent single or multiple dying cells. Instead, we performed a tissue area analysis from caudal to rostral along the tail (new section on p 9). This shows a progressive enlargement of the neural tube, no change in the notochord and a striking reduction in tailgut area (Fig 4C,D). The smaller tailgut has fewer nuclei in cross section rostrally compared with more caudally (Fig 4E). Given that apoptosis is present in the tailgut at all rostro-caudal levels, this is consistent with a rostral-to-caudal loss of the tailgut, as is also found in mouse and rat embryos.

      (2) The identification of the mode of formation of the secondary neural tube is probably the most interesting question to be addressed, however, Figure 7's evidence is not completely satisfying in its current form. While I agree that it is unlikely that multiple polarization foci form within the most caudal part of the tail and coalesce more rostrally, I am equally unsure that a single polarization would form rostrally and then split and re-coalesce as it moves caudally, as is currently depicted by 7B. Multiple groups have recently shown the influence of geometric confinement on neuroectoderm and its ability to polarize and form a singular central lumen (Karzbrun 2021, Knight 2018), or the inverse situation of a lack of confinement resulting in the presence of multiple lumens. The tapering of the diameter of the tail and its shared perimeter and curvature with the polarization bears a striking resemblance to this controlled confinement. An interesting quantification to depict would include the number of lumens versus the transverse section diameter and CS stage to see if there is any correlation between embryo size and the number of multiple polarizations. Anecdotally, the fusion of multiple polarizations/lumens tends to occur often in these human organoid-type platforms, while splitting to multiple lumens as the tissues mature does not. Other supplements to Figure 7 could include 3D renderings of lumens of interest as depicted in Catala 2021, especially if it demonstrates the recoalescence as seen in 7B. The non-pathologic presence of multiple polarizations in human tails compared to the rodent pathogenic counterpart is interesting given that rodents obviously maintain this appendage while it is lost in humans.   

      The additional 6 sectioned human embryo tails (as described above) provide further information in support of the original findings of the paper: (i) that the secondary neural tube formation initially involves a single lumen, and (ii) that neural tube duplication occurs in many tails at more rostral levels. Neural tube duplication was observed in 15/25 of our sectioned tails: hence, overall 60% of human tails exhibited neural tube duplication in this study. We have replaced all the cross sectional images in the original Fig 7 (now Fig 6) to better illustrate the findings of neural tube duplication at relatively rostral levels of the human tail. Additionally, the axial positions of sections containing a duplicated neural tube are indicated by arrows in the graph of neural tube areas (Fig 4C). From this analysis it appears that neural tube duplication is not contingent on an increasing tail diameter, as raised by the reviewer, because some tails show a transition to neural tube duplication, and then return to a single lumen morphology more rostrally. While the 3D renderings of lumens would be interesting, we consider it beyond the scope of the present study.

      (3) Of potential interest is the process of junctional neurulation describing the mechanistic joining of the primary and secondary neural tube, which has recently been explored in chick embryos and demonstrated to have relevance to human disease (Dady 2014, Eibach 2017, Kim 2021). While it is clear this paper's goal does not center on the relationship between primary and secondary neurulation, such a mechanism may be relevant to the authors' interpretation of their observations of lumen coalescence. I wonder if the embryos studied provide any evidence to support junctional neurulation.  

      We agree this is an important point to address in the paper, and a new section has been inserted in the Discussion: ‘Transition from primary to secondary neurulation’ (pp 13-14). In brief, we find no evidence for a specific mode of ‘junctional neurulation’ in the human embryos. In any event, its existence is hypothetical in humans, suggested largely as an ‘embryological explanation’ for the finding of rare interrupted spinal cord defects in neurosurgical patients (Eibach, 2017). In chick neurulation there is longitudinal dorso-ventral overlap between the primary and secondary neural tubes (Dryden, 1980), with the junctional zone derived from ingressing cells at the node-streak border (Dady, 2014), a known source of neuromesodermal progenitors (NMPs). However, this is a very different developmental situation from the human so-called ‘junctional neurulation’ defect (Eibach, 2017), in which the spinal cord is physically and functionally interrupted, with only a rudimentary filament connecting the rostral and caudal parts.

      Reviewer #1 (Recommendations For The Authors):

      (1) Figures 3, 4, and 7, would be easier to digest quickly with inclusions of labels that mark the rostral and caudal transverse sections. For example, "caudal" over 3G and "rostral" over 3F.  

      Figures 3 and 4 have been combined to form revised Figure 3, and the rostral/caudal sections are no longer included, as these are superseded by the new Figure 4. Similarly Figure 7 has been replaced by new images in the revised Figure 6, with clear labelling of axial levels.

      (2) The manuscript does a nice job of comparing and contrasting the human findings to mouse, however, there are several instances where it would be nice to continue this trend within the text, such as including the rate of somite formation for rodents in the sections that you state the quantified human and published organoid findings, as well as the total number of somites rodents exhibit. Additionally, the last sentence of the "Morphology of human PNP closure" section correctly states that human PNPs seem to close via Mode 2 neurulation that is seen in the mouse. However, my read of the literature (published by Dr. Copp) demonstrates that the PNP in mice actually closes via Mode 3 at the most caudal portion. If this is the case, it would be worth explicitly stating that regionally dependent morphogenetic difference between the two species.

      We agree these are important points to include. The additional somite data (for mouse) has been inserted in the Results section on ‘Somite formation’ (p 8), and the apparent absence of Mode 3 during human spinal neural tube closure is now included in the new Discussion section, ‘Transition from primary to secondary neurulation’ (pp 13-14).

      (3) The introduction to secondary neural tube formation with the hypothesis diagrams in Figure 7 is slightly jarring. At the beginning of the Figure, a schematic depicting the morphogenetic differences between primary and secondary would be helpful in introducing the readership to these complex embryologic events. An example of this could be similar to Figure 1 in Dr. Copp's paper:

      Nikolopoulou, E., et al. Neural tube closure: cellular, molecular and biomechanical mechanisms. Development 144, 552-566 (2017).  

      We feel that a summary diagram of primary and secondary neurulation would simply reproduce diagrams that are already widespread in the literature. As noted by the Reviewer, our article in Development (Nikolopoulou, 2017) contains just such a summary diagram as Figure 1. Therefore, we prefer to explicitly cite this article/figure in our Introduction (see modified first sentence, third paragraph, p 3), so that readers can consult the freely accessible Nikolopoulou review for more detail. The diagram in Figure 7 (now revised Figure 6) has been completely redrawn to make much clearer the hypotheses being examined in the study of human secondary neural tube formation, and neural tube duplication.

      (4) Finally, a matter of semantics, the second paragraph of the introduction describes myelomeningocele as a neurodegenerative defect, while it is true amniotic fluid further degrades exposed neural tissue while exposed, to me, the term neurodegenerative defect suggests a lifelong degeneration, which is not the case for human patients. Perhaps shortening to neurological defect is a compromise. Thank you for the important and interesting work.  

      We agree that ‘neurodegenerative’ can mean different things to different people. Literally, it refers to degeneration of neural tissue, which of course includes neuroepithelial loss due to amniotic fluid action in the uterus. Nevertheless, to avoid confusion, the word has been removed and the sentence expanded to include a reference to the adverse effects of amniotic fluid on the exposed neuroepithelium (see Introduction, second paragraph, p 3).

      Reviewer #2 (Public Review)

      It is not clear how the gestational age of the specimens was determined or how that can be known with certainty. There is no information given in the methods on this. With this in mind, bunching the samples at 2-day intervals in Figure 1J will lead to inaccuracies in assessing the rate of somite formation. This is pointed out as a major difference between specimens and organoids in the abstract but a similar result in the results section. The data supporting either of these statements is not convincing.

      Human embryos were assigned to Carnegie stages based on standard morphological criteria. This was stated, with references, in the first Results paragraph, and we have now also included this information in the Methods (first paragraph, p 19). We assigned the embryos to 2-day intervals based on the standard literature timing of these Carnegie stages, as described in O’Rahilly and Muller (1987). We have clarified both Carnegie staging and assignment of embryos to 2-day intervals in a new sentence within the Methods, first paragraph, p 19. “Embryos were assigned to Carnegie Stages (CS) using morphological criteria (O'Rahilly and Muller 1987; Bullen and Wilson 1997) and to 2-day post-conception intervals for regression analysis based on timings in Table 0-1 of O’Rahilly and Muller (1987).” This has also been inserted in the legend to Figure 1J.

      The regression analysis of somite number against days post-conception (Figure 1J) allowed a conclusion to be drawn on the rate of somite formation in early human embryos. We have added 95% confidence intervals to our finding of a new somite formed every 7.1 h in humans. We consider this to be important for comparison with non-human species and organoid systems. On p 8, second paragraph, we simply state our finding of a 7.1 h somite periodicity in human embryos, compared with 5 h in the organoid system (and 2 h in mouse and rat – as suggested by Reviewer 1). We are careful not to say it is a ‘major difference’ or ‘similar result’ in different parts of the paper, as the Reviewer has drawn attention to.
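
      As an illustration only (not the authors' analysis code), the conversion from a regression slope to a somite periodicity can be sketched as follows. The data points are hypothetical values chosen so that the slope lands near the reported rate; they are not the embryo measurements, and the function name is an assumption. The sketch does not compute the 95% confidence interval mentioned above.

```python
import numpy as np

# Hypothetical (days post-conception, somite count) pairs, for illustration only.
# These are NOT the embryo measurements from the study.
days = np.array([26.0, 28.0, 30.0, 32.0, 34.0])
somites = np.array([14.0, 21.0, 27.0, 34.0, 41.0])


def somite_period_hours(days, somites):
    """Fit somites = slope * days + intercept and return hours per new somite."""
    slope, _intercept = np.polyfit(days, somites, 1)  # slope in somites per day
    return 24.0 / slope  # one day has 24 h, so period = 24 / (somites per day)


period_h = somite_period_hours(days, somites)
print(f"Estimated periodicity: {period_h:.2f} h per somite")
```

      Because the period is the reciprocal of the slope, any uncertainty in the slope maps non-linearly onto the period, which is one reason a confidence interval is informative when comparing species.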

      Whenever possible, give the numbers of specimens that had the described findings. For example, in Figure 2C - how many embryos were examined with the massive rounded end at CS13? Apoptosis in Figures 3 and 4?  

      Numbers of embryos analysed in Figures 2 and 3 (the latter now a combined version of the original Figures 3 and 4) are shown in Table 2. We have also created a new Supplementary Figure 1 to show additional examples of human embryonic tails, which illustrate the consistency of morphology through the stages from CS13 to CS18. Numbers of samples that contributed to Figures 4-6 are detailed in the legends.

      For Figure 2I-K, it would be informative to superimpose the individual data points on the box plots distinguishing males from females, as in Figure 1I.  

      This was attempted but the data points overlie the box plots and look confusing. Instead, we have created Supplementary Table 2 which gives the raw data on which Figure 2I-K are based. We have also drawn attention to the fact that not all embryos yielded all types of measurement, especially tail lengths.

      Is it possible to quantitate apoptosis and proliferation data?  

      We have not quantified apoptosis, given the difficulty of deciding whether anti-caspase signals represent single or multiple dying cells. Instead, we performed a new tissue area analysis along the body axis, which has shed light on the possible direction (rostral to caudal) of tailgut loss in the human caudal region (see response to Reviewer 1 above). Since the cell proliferation data were limited in extent, and not a major focus of the paper, we have removed that analysis completely from the revised version.

      The Tunel staining in Figure 3 is difficult to make out.  

      We have extended our analysis of anti-caspase 3 immunohistochemistry and removed the TUNEL images.

      Reviewer #2 (Recommendations For The Authors)

      The anatomy of the sections in Figures 3, 4, and 7 is difficult to discern. Is it possible to insert adjacent panels tracing and labeling the structures in each panel? Also, drawings showing the axial level of each section would be helpful.

      To clarify the axial levels of sections, we have inserted images of mouse and human embryos as parts A and B of the revised Figure 3. We have tried to clarify the morphology of sections by labelling all relevant structures in the sections themselves.   

      High-magnification views of the tailbud in Figure 5 would be more informative. Staining is difficult to see after CS13. The low-magnification views can be shown in an insert. Figures 5 and 6 can be combined.

      At the reviewer’s suggestion, we have merged Figures 5 and 6 into a revised Figure 5. Now, the sections provide higher magnification images of the areas of expression as shown in the lower magnification whole mount images. We feel this makes the gene expression findings much clearer than before.

      Some of the writing in the abstract, introduction, and results is very descriptive, with a lack of summary and integration of information. For instance, the abstract could be rewritten to include an overall conclusion at the end and a better description of the longstanding questions addressed. Moreover, the abstract suggests multiple lumens are not found in human specimens. Another example is the second paragraph of the introduction lists various NTDs but doesn't provide an integrative conclusion of the information. The discussion is much better but lacks a conclusion at the end.

      We agree that more concluding sentences should be used, as the Reviewer suggests. To this end, we have rewritten the Abstract (p 2) to emphasise the long-standing questions that our study addresses, and concluding sentences are now included in other places (e.g. somite results, p 8). A new ‘Conclusions’ section has been added at the end of the Discussion (pp 17-18).

      ADDITIONAL CHANGES MADE TO REVISED MANUSCRIPT

      Title. This has been amended to: “Spinal neural tube formation and tail development in human embryos” to reflect the greater focus on developmental events, and less on tail regression.

      Additional studies have been added to Supplementary Table 1, to include the main transcriptomic studies of human embryos in the primary/secondary neurulation stage range. This takes the number of previous studies to 28 and the total number of embryos to 925. See p 4, top and p 12, first paragraph for corresponding changes to the text.

      We added a sentence to the Discussion (p 13, first paragraph) to counter the claim that humans have undergone ‘tail-loss’, as included in Xia et al, 2024, “On the genetic basis of tail-loss evolution in humans and apes”. Nature 626:1042-8. Clearly, the human embryo is tailed, which undermines these authors’ statement.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the paper, Yan and her colleagues investigate at which stage of development different categorical signals can be detected with EEG using a steady-state visual evoked potential paradigm. The study reports the development trajectory of selective responses to five categories (i.e., faces, limbs, corridors, characters, and cars) over the first 1.5 years of life. It reveals that while responses to faces show significant early development, responses to other categories (i.e., characters and limbs) develop more gradually and emerge later in infancy. The paper is well-written and enjoyable, and the content is well-motivated and solid.

      Strengths:

      (1) This study contains a rich dataset with a substantial amount of effort. It covers a large sample of infants across ages (N=45) and asks an interesting question about when visual category representations emerge during the first year of life.

      (2) The chosen category stimuli are appropriate and well-controlled. These categories are classic and important for situating the study within a well-established theoretical framework.

      (3) The brain measurements are solid. Visual periodicity allows for the dissociation of selective responses to image categories within the same rapid image stream, which appears at different intervals. This is important for the infant field, as it provides a robust measure of ERPs with good interpretability.

      Weaknesses:

      The study would benefit from a more detailed explanation of analysis choices, limitations, and broader interpretations of the findings. This includes:

      a) improving the treatment of bias from specific categories (e.g., faces) towards others;

      b) justifying the specific experimental and data analysis choices;

      c) expanding the interpretation and discussion of the results.

      I believe that giving more attention to these aspects would improve the study and contribute positively to the field.

      We thank the reviewer for their clear summary of the work and their constructive feedback. To address the reviewer’s concerns, in the revised manuscript we now provide a detailed explanation of analysis choices, limitations, and broader interpretations, as summarized in the point-by-point responses in the section: Reviewer #1 (Recommendations For The Authors) below, for which we give here an overview in points (a), (b), and (c):

      (a) The reviewer is concerned that using face stimuli as one of the comparison categories may hinder the detection of selective responses to other categories like limbs. Unfortunately, because of the frequency tagging design of our study we cannot compare the responses to one category vs. only some of the other categories (e.g. limbs vs objects but not faces). In other words, our experimental design does not enable us to do the analysis suggested by the reviewer. Nonetheless, we underscore that faces comprise only ¼ of contrast stimuli and we are able to detect significant selective responses to limbs, corridors and characters in infants after 6-8 months of age even as faces are included in the contrast and the response to faces continues to increase (see Fig 4). We discuss the reviewer’s point regarding how contrast can contribute to differences in findings in the discussion on pages 12-13, lines 344-351. Full details below in Reviewer 1: Recommendations for Authors - Frequency tagging category responses.

      (b) We expanded the justification of specific experimental and data analysis choices, see details below in Reviewer 1: Recommendations for Authors -> Specific choices for experiment and data analysis.

      (c) We expand the interpretation and discussion, see details below in Reviewer 1: Recommendations for Authors -> More interpretation and discussion.

      Reviewer #2 (Public Review):

      Summary:

      The current work investigates the neural signature of category representation in infancy. Neural responses during steady-state visually-evoked potentials (ssVEPs) were recorded in four age groups of infants between 3 and 15 months. Stimuli (i.e., faces, limbs, corridors, characters, and cars) were presented at 4.286 Hz with category changes occurring at a frequency of 0.857 Hz. The results of the category frequency analyses showed that reliable responses to faces emerge around 4-6 months, whereas responses to limbs, corridors, and characters emerge at around 6-8 months. Additionally, the authors trained a classifier for each category to assess how consistent the responses were across participants (leave-one-out approach). Spatiotemporal responses to faces were more consistent than the responses to the remaining categories and increased with increasing age. Faces showed an advantage over other categories in two additional measures (i.e., representation similarity and distinctiveness). Together, these results suggest a different developmental timing of category representation.

      Strengths:

      The study design is well organized. The authors described and performed analyses on several measures of neural categorization, including innovative approaches to assess the organization of neural responses. Results are in support of one of the two main hypotheses on the development of category representation described in the introduction. Specifically, the results suggest a different timing in the formation of category representations, with earlier and more robust responses emerging for faces over the remaining categories. Graphic representations and figures are very useful when reading the results.

      Weaknesses:

      (1) The role of the adult dataset in the goal of the current work is unclear. All results are reported in the supplementary materials and minimally discussed in the main text. The unique contribution of the results of the adult samples is unclear and may be superfluous.

      (2) It would be useful to report the electrodes included in the analyses and how they have been selected.

      We thank the reviewer for their constructive feedback and for summarizing the strengths and weaknesses of our study. We revised the manuscript to address these two weaknesses.

      (1) The reviewer indicates that the role of the adult dataset is unclear. The goal of testing adult participants was to validate the EEG frequency tagging paradigm. We chose to use adults because a large body of fMRI research shows that both clustered and distributed responses to visual categories are found in adults’ high-level visual cortex. Therefore, the goal of the adult data is to determine whether, with the same amount of data as we collect on average in infants, we have sufficient power to detect categorical responses using the same frequency tagging experimental paradigm that we use in infants. Because these data serve a methodological validation purpose, we believe they belong in the supplemental data.

      We clarify this in the Results, second paragraph, page 5 where we now write: “As the EEG-SSVEP paradigm is novel and we are restricted in the amount of data we can obtain in infants, we first tested if we can use this paradigm and a similar amount of data to detect category-selective responses in adults. Results in adults validate the SSVEP paradigm for measuring category-selectivity: as they show that (i) category-selective responses can be reliably measured using EEG-SSVEP with the same amount of data as in infants (Supplementary Figs S1-S2), and that (ii) category information from distributed spatiotemporal response patterns can be decoded with the same amount of data as in infants (Supplementary Fig S3).”

      (2) The reviewer asks us to report the electrodes used in the analysis and their selection. We note that the selection of electrodes included in the analyses has been reported in our original manuscript (Methods, section: Univariate EEG analyses). On pages 18-19, lines 530-538, we write: “Both image update and categorical EEG visual responses are reported in the frequency and time domain over three regions-of-interest (ROIs): two occipito-temporal ROIs (left occipitotemporal (LOT): channels 57, 58, 59, 63, 64, 65 and 68; right occipitotemporal (ROT) channels: 90, 91, 94, 95, 96, 99, and 100) and one occipital ROI (channels 69, 70, 71, 74, 75, 76, 82, 83 and 89). These ROIs were selected a priori based on a previously published study51. We further removed several channels in these ROIs for two reasons: (1) Three outer rim channels (i.e., 73, 81, and 88) were not included in the occipital ROI for further data analysis for both infant and adult participants because they were consistently noisy. (2) Three channels (66, 72, and 84) in the occipital ROI, one channel (50) in the LOT ROI, and one channel (101) in the ROT ROI were removed because they did not show substantial responses in the group-level analyses.”
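
      For readers who want to see how such a priori ROIs translate into an analysis step, a minimal sketch follows (an illustration, not the authors' pipeline). The channel numbers are copied from the quoted Methods text; the 128-channel array size, the 1-based channel numbering, and the names ROIS and roi_mean are assumptions made for the example.

```python
import numpy as np

# Channel numbers as listed in the quoted Methods text (assumed 1-based).
ROIS = {
    "LOT": [57, 58, 59, 63, 64, 65, 68],
    "ROT": [90, 91, 94, 95, 96, 99, 100],
    "occipital": [69, 70, 71, 74, 75, 76, 82, 83, 89],
}


def roi_mean(data, channels):
    """Average a (n_channels, n_samples) array over the given 1-based channels."""
    idx = np.asarray(channels) - 1  # convert channel numbers to 0-based row indices
    return data[idx].mean(axis=0)


# Toy data: row i holds the constant value i + 1, i.e. its own channel number.
# The 128-row shape is an assumption about the montage, not stated in the text.
toy = np.arange(1, 129, dtype=float)[:, None] * np.ones((1, 4))
lot_trace = roi_mean(toy, ROIS["LOT"])
rot_trace = roi_mean(toy, ROIS["ROT"])
```

      Keeping the ROI definition in one dictionary makes the a priori selection explicit and easy to audit against the Methods.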

      In the section Reviewer 2, Recommendations for the authors, we also addressed the reviewer’s minor points.

      Reviewer #3 (Public Review):

      Yan et al. present an EEG study of category-specific visual responses in infancy from 3 to 15 months of age. In their experiment, infants viewed visually controlled images of faces and several non-face categories in a steady state evoked potential paradigm. The authors find visual responses at all ages, but face responses only at 4-6 months and older, and other category-selective responses at later ages. They find that spatiotemporal patterns of response can discriminate faces from other categories at later ages.

      Overall, I found the study well-executed and a useful contribution to the literature. The study advances prior work by using well-controlled stimuli, subgroups of different ages, and new analytic approaches.

      I have two main reservations about the manuscript: (1) limited statistical evidence for the category by age interaction that is emphasized in the interpretation; and (2) conclusions about the role of learning and experience in age-related change that are not strongly supported by the correlational evidence presented.

      We thank the reviewer for their enthusiasm and their constructive feedback.

      (1) The overall argument of the paper is that selective responses to various categories develop at different trajectories in infants, with responses to faces developing earlier. Statistically, this would be most clearly demonstrated by a category-by-age interaction effect. However, the statistical evidence for a category-by-age interaction effect presented is relatively weak, and no interaction effect is tested for frequency domain analyses. The clearest evidence for a significant interaction comes from the spatiotemporal decoding analysis (p. 10). In the analysis of peak amplitude and latency, an age x category interaction is only found in one of four tests, and is not significant for latency or left-hemisphere amplitude (Supp Table 8). For the frequency domain effects, no test for category by age interaction is presented. The authors find that the effects of a category are significant in some age ranges and not others, but differences in significance don't imply significant differences. I would recommend adding category by age interaction analysis for the frequency domain results, and ensuring that the interpretation of the results is aligned with the presence or lack of interaction effects.

      The reviewer is asking for additional evidence for age x category interaction by repeating the interaction analysis in the frequency domain. The reason we did not run this analysis in the original manuscript is that the categorical responses of interest are reflected in multiple frequency bins: the category frequency (0.857 Hz) and its harmonics, and there are arguments in the field as to how to quantify response amplitudes from multiple frequency bins (Peykarjou, 2022). Because there is no consensus in the field and also because how the different harmonics combine depends not just on their amplitudes but also on their phase, we chose to transform the categorical responses across multiple frequency bins from the frequency domain to the time domain. The transformed signal in the time domain includes both phase and amplitude information across the category frequency and its harmonics. Therefore, subsequent analyses and statistical evaluations were done in the time domain.
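
      The frequency-to-time transformation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the sampling rate, the 7 s epoch, the number of harmonics retained, and the function name keep_harmonics are all hypothetical, and 0.857 Hz is treated as 6/7 Hz so that it falls exactly on an FFT bin.

```python
import numpy as np


def keep_harmonics(signal, fs, base_freq, n_harmonics):
    """Zero every FFT bin except base_freq and its multiples, then invert.

    The retained bins keep both amplitude and phase, so the returned
    time-domain waveform reflects how the harmonics combine.
    """
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mask = np.zeros(freqs.shape, dtype=bool)
    for k in range(1, n_harmonics + 1):
        # Keep the bin nearest to the k-th multiple of the base frequency.
        mask[np.argmin(np.abs(freqs - k * base_freq))] = True
    return np.fft.irfft(spectrum * mask, n=n)


fs = 100.0                   # hypothetical sampling rate (Hz)
t = np.arange(700) / fs      # 7 s epoch, so 6/7 Hz sits exactly on an FFT bin
f_cat = 6.0 / 7.0            # approximately 0.857 Hz
categorical = (2.0 * np.sin(2 * np.pi * f_cat * t + 0.3)
               + np.sin(2 * np.pi * 2 * f_cat * t))
other = 0.5 * np.sin(2 * np.pi * 10.0 * t)  # unrelated 10 Hz activity
recovered = keep_harmonics(categorical + other, fs, f_cat, n_harmonics=2)
```

      In this toy case the recovered waveform matches the categorical component, including the 0.3 rad phase offset that an amplitude-only summary would discard.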

      However, we agree with the reviewer that adding category by age interaction analysis for the frequency domain results can further solidify the results. Thus, in the revised manuscript we added a new analysis, in which we quantified the root mean square (RMS) amplitude value of the responses at the category frequency (0.857 Hz) and its first harmonic (1.714 Hz) for each category condition and infant. Then we used an LMM to test for an age by category interaction. The LMM was conducted separately for the left and right lateral occipitotemporal ROIs. This analysis revealed a significant category by age interaction, that is, in both hemispheres, the development of response RMS amplitudes varied across categories (left occipitotemporal ROIs: βcategory x age = -0.21, 95% CI: -0.39 – -0.04, t(301) = -2.40, pFDR < .05; right occipitotemporal ROIs: βcategory x age = -0.26, 95% CI: -0.48 – -0.03, t(301) = -2.26, pFDR < .05). We have added this analysis in the manuscript, pages 7-8, lines 186-193: “We next examined the development of the category-selective responses separately for the right and left lateral occipitotemporal ROIs. The response amplitude was quantified by the root mean square (RMS) amplitude value of the responses at the category frequency (0.857 Hz) and its first harmonic (1.714 Hz) for each category condition and infant. With a LMM analysis, we found significant development of response amplitudes in both occipitotemporal ROIs which varied by category (left occipitotemporal ROIs: βcategory x age = -0.21, 95% CI: -0.39 – -0.04, t(301) = -2.40, pFDR < .05; right occipitotemporal ROIs: βcategory x age = -0.26, 95% CI: -0.48 – -0.03, t(301) = -2.26, pFDR < .05, LMM as a function of log (age) and category; participant: random effect).” We also added the formula for the LMM analysis in Table 1 in the Methods section, page 21.
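
      The response does not spell out the exact RMS formula, so the sketch below shows one plausible reading: take the single-sided FFT amplitude at 0.857 Hz and at 1.714 Hz and compute the root mean square of those two values. The sampling rate, epoch length, and function name rms_at_frequencies are assumptions for illustration; this is not the authors' implementation, and the LMM step is not shown.

```python
import numpy as np


def rms_at_frequencies(signal, fs, target_freqs):
    """RMS of the single-sided FFT amplitudes at the bins nearest target_freqs."""
    n = len(signal)
    # Scale so that a sine of amplitude A yields A at its (non-DC) bin.
    amps = 2.0 * np.abs(np.fft.rfft(signal)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    picked = [amps[np.argmin(np.abs(freqs - f))] for f in target_freqs]
    return float(np.sqrt(np.mean(np.square(picked))))


fs = 100.0                 # hypothetical sampling rate (Hz)
t = np.arange(700) / fs    # 7 s epoch, so 6/7 Hz sits exactly on an FFT bin
f_cat = 6.0 / 7.0          # approximately 0.857 Hz
toy_signal = (3.0 * np.sin(2 * np.pi * f_cat * t)
              + 4.0 * np.sin(2 * np.pi * 2 * f_cat * t))
rms_value = rms_at_frequencies(toy_signal, fs, [f_cat, 2 * f_cat])
```

      With toy amplitudes of 3 and 4 at the two frequencies, the function returns the square root of (9 + 16) / 2; one such value per infant and category would then enter the mixed model.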

      (2) The authors argue that their results support the claim that category-selective visual responses require experience or learning to develop. However, the results don't bear strongly on the question of experience. Age-related changes in visual responses could result from experience or experience-independent maturational processes. Finding age-related change with a correlational measure does not favor either of these hypotheses. The results do constrain the question of experience, in that they suggest against the possibility that category-selectivity is present in the first few months of development, which would in turn suggest against a role of experience. However the results are still entirely consistent with the possibility of age effects driven by experience-independent processes. The manner in which the results constrain theories of development could be more clearly articulated in the manuscript, with care taken to avoid overly strong claims that the results demonstrate a role of experience.

      Thanks for the comment. We agree with this nuanced point. It is possible that development of category-selective visual responses is a maturational process. In response to this comment, we have revised the manuscript to discuss both perspectives, see revised discussion section – A new insight about cortical development: different category representations emerge at different times during infancy, pages 14-15, lines 403-426, where we now write: “In sum, the key finding from our study is that the development of category selectivity during infancy is non-uniform: face-selective responses and representations of distributed patterns develop before representations to limbs and other categories. We hypothesize that this differential development of visual category representations may be due to differential visual experience with these categories during infancy. This hypothesis is consistent with behavioral research using head-mounted cameras that revealed that the visual input during early infancy is dense with faces, while hands become more prevalent in the visual input later in development and especially when in contact with objects 41,42. Additionally, a large body of research has suggested that young infants preferentially look at faces and face-like stimuli 17,18,33,34, as well as look longer at faces than other objects 41, indicating that not only the prevalence of faces in babies’ environments but also longer looking times may drive the early development of face representations. Further supporting the role of visual experience in the formation of category selectivity is a study that found that infant macaques that are reared without seeing faces do not develop face-selectivity but develop selectivity to other categories in their environment like body parts 40. An alternative hypothesis is that differential development of category representations is maturational. For example, we found differences in the temporal dynamics of visual responses among four infant age groups, which suggests that the infant’s visual system is still developing during the first year of life. While the mechanisms underlying the maturation of the visual system in infancy are yet unknown, they may include myelination and cortical tissue maturation 66-71. Future studies can test these alternatives by examining infants’ visual diet, looking behavior, and brain development and examine responses using additional behaviorally relevant categories such as food 72–74. These measurements can test how environmental and individual differences in visual experiences may impact infants’ developmental trajectories. Specifically, a visual experience account predicts that differences in visual experience would translate into differences in development of cortical representations of categories, but a maturational account predicts that visual experience will have no impact on the development of category representations.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major points:

      Bias from faces to other categories:

      - Frequency tagging category responses:

      We see faces from non-face objects and limbs from non-limb objects. Non-limb objects include faces; I suspect that finding the effects of limbs is challenging with faces in the non-limbs category. How would you clarify the choice of categories, and to what extent are the negative (i.e., non-significant) effects on other categories not because of the heavy bias to faces?

      The reviewer is concerned that using face stimuli as one of the comparison categories may hinder the ability to detect selective responses to other categories like limbs in our study. Unfortunately, because of the frequency tagging design of our study, we cannot compare the responses to one category to only some of the other categories (e.g. limbs vs objects but not faces), so our experimental design does not enable us to do the analysis suggested by the reviewer. Nonetheless, we underscore that faces comprise only ¼ of the contrast stimuli in the category frequency tagging, and we are able to detect significant selective responses to limbs, corridors and characters in infants after 6-8 months of age, even when faces are included in the contrast and the responses to faces continue to increase more than for other categories (see Fig 4).

      We address this point in the discussion where we consider differences between our findings and those of Kosakowski et al. 2022, on pages 12-13, lines 344-351 we write: “We note that, the studies differ in several ways: (i) measurement modalities (fMRI in 27 and EEG here), (ii) the types of stimuli infants viewed: in 27 infants viewed isolated, colored and moving stimuli, but in our study, infants viewed still, gray-level images on phase-scrambled backgrounds, which were controlled for several low level properties, and (iii) contrasts used to detect category-selective responses, whereby in 27 the researchers identified within predefined parcels – the top 5% of voxels that responded to the category of interest vs. objects, here we contrasted the category of interest vs. all other categories the infant viewed. Thus, future research is necessary to determine whether differences between findings are due to differences in measurement modalities, stimulus format, and data analysis choices.”

      - Decoding analyses:

      Figure 5 Winner-take-all classification. First, the classifier may be biased towards the categories with strong and clean data, similar to the last point, this needs clarification on the negative effect. Second, it could be helpful to see how exactly the below-chance decoded categories were being falsely classified to which categories at the group level. Decoding accuracy here means a 20% chance the selection will go to the target category, but the prediction and the exact correlation coefficient the winner has is not explicit; concerning a value of 0.01 correlation could take the winner among negative or pretty bad correlations with other categories. It would be helpful to report how exactly the category was correlated, as it could be a better way to define the classification bias, for example, correlation differences between hit and miss classification. Also, the noise ceiling of the correlation within each group should be provided. Third, this classifier needs improvement in distinguishing between noise and signals to identify the type of information it extracts. Do you have thoughts about that?

      Thanks for the questions, answers below:

      In the winner-take-all (WTA) classifier analysis, at each iteration, the LOOCV classifier computed the correlation between each of the five category vectors from the left-out participant (test data, for an unknown stimulus) and each of the mean spatiotemporal vectors across the N-1 participants (training data, labeled data). The winner-take-all (WTA) classifier classifies the test vector to the category that yields the highest correlation with the training vector. For a given test pattern, correct classification yielded a score of 1 and an incorrect classification yielded a score of 0. Then we computed the group mean decoding performance across all N iterations for each category and the group mean decoding accuracies across five categories.
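
      The procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis code used in the study: the function name `wta_loocv`, the array layout, and the demo data are assumptions made for the example.

```python
import numpy as np

def wta_loocv(patterns):
    """Leave-one-participant-out winner-take-all (WTA) classification.

    patterns: array of shape (n_participants, n_categories, n_features),
    one spatiotemporal vector per participant and category (assumed layout).
    Returns a boolean array (n_participants, n_categories) that is True
    where the left-out participant's vector was assigned to its true category.
    """
    patterns = np.asarray(patterns, dtype=float)
    n_participants, n_categories, _ = patterns.shape
    hits = np.zeros((n_participants, n_categories), dtype=bool)
    for left_out in range(n_participants):
        test = patterns[left_out]  # test data from the left-out participant
        # training data: mean vector per category across the other N-1 participants
        train = np.delete(patterns, left_out, axis=0).mean(axis=0)
        # corrcoef stacks test rows on top of train rows; the upper-right block
        # holds the correlation of each test vector with each training vector
        r = np.corrcoef(test, train)[:n_categories, n_categories:]
        predicted = r.argmax(axis=1)  # the winner takes all
        hits[left_out] = predicted == np.arange(n_categories)
    return hits

# Synthetic demo: 8 participants, 5 categories, 20 features (hypothetical sizes)
rng = np.random.default_rng(0)
templates = np.eye(5, 20)  # one distinct pattern per category
demo = templates + rng.normal(scale=0.01, size=(8, 5, 20))

hits = wta_loocv(demo)
accuracy_per_infant = hits.mean(axis=1)   # fraction correct across categories
group_accuracy = accuracy_per_infant.mean()
decoded_per_category = hits.mean(axis=0)  # fraction of infants decoded, per category
```

      In this sketch, `group_accuracy` corresponds to the kind of summary reported in Fig 5B and `decoded_per_category` to the per-category summary in Fig 5C. With five categories, chance level for the former is 20%.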

      For the classification data in Fig 5, the statistics and differences from chance are provided in 5B, where we report overall classification across all categories from an infant’s brain data. Like the reviewer, we were interested in assessing whether successful classification is uniform across categories or is driven by some categories. As is visible in 5C, decoding success is non-uniform across categories, and is higher for faces than for other categories. Because this is broken down by category, we cannot compare it to chance; what is reported in Fig 5C is the percentage of infants in each age group for whom a particular category was successfully decoded. Starting from 4 months of age, faces can be decoded from distributed brain data in a majority of infants, but other categories only in 20-40% of infants.

      The reviewer also asks about what levels of correlations drive the classification. The analysis of RSMs in Fig 6a shows the mean correlations of distributed responses to different images within and between categories per age group. As is evident from the RSM, reproducible responses for a category only start to emerge at 4-6 months of age and the highest within category correlations are for faces. To quantify what drives the classification we measure distinctiveness - within category minus between-category correlations of distributed responses; all individual infant data per category are in Fig 6C. Distinctiveness values vary by age and category, see text related to Fig 6 in section: What is the nature of categorical spatiotemporal patterns in individual infants?
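
      As a rough illustration of the distinctiveness measure, the sketch below computes within-category minus between-category correlations from a representational similarity matrix. It assumes that each infant's data are split into two independent halves per category (for example, responses to different images); the function name and this split are assumptions of the example, not a description of the study's code.

```python
import numpy as np

def category_distinctiveness(half_a, half_b):
    """Within-category minus between-category correlation, per category.

    half_a, half_b: arrays of shape (n_categories, n_features) holding the
    distributed response to each category in two independent data halves.
    """
    half_a = np.asarray(half_a, dtype=float)
    half_b = np.asarray(half_b, dtype=float)
    n = half_a.shape[0]
    # RSM: correlation of every category in half A with every category in half B
    rsm = np.corrcoef(half_a, half_b)[:n, n:]
    within = np.diag(rsm)
    off_diag = ~np.eye(n, dtype=bool)
    masked = np.where(off_diag, rsm, 0.0)
    # average the between-category correlations over both the row and the column
    between = (masked.sum(axis=1) + masked.sum(axis=0)) / (2 * (n - 1))
    return within - between

# Distinct patterns per category give positive distinctiveness
distinct = category_distinctiveness(np.eye(5, 20), np.eye(5, 20))

# Identical patterns for all categories give zero distinctiveness
same = np.tile(np.arange(20.0), (5, 1))
flat = category_distinctiveness(same, same)
```

      A positive value indicates that responses to a category are more similar to each other than to responses to other categories, which is what allows a correlation-based classifier to succeed.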

      Figure 6 Category distinctiveness. An analysis that runs on a "single item level" would ideally warrant a more informative category distinction. Did you try that? Does it work?

      Thanks for the question. We agree that doing an analysis at the single item level would be interesting. However, none of the images were repeated, so we do not have sufficient SNR to perform this analysis.

      Specific choices for experiment and data analysis:

      - Although using the SSVEP paradigm is familiar to the field, the choice could be detailed for understanding or evaluation of the effectiveness of the paradigm. For example, how the specific frequency for entrainment was chosen, and are there any theories or related warrants for studying in infants?

      Thanks for the questions. We chose the SSVEP paradigm over traditional ERP designs for several reasons, which are listed in our original manuscript (Results, first paragraph, pages 4-5, lines 90-94): “We used the EEG-SSVEP approach because: (i) it affords a high signal-to-noise ratio with short acquisitions making it effective for infants 23,46, (ii) it has been successfully used to study responses to faces in infants23,46,49, and (iii) it enables measuring both general visual response to images by examining responses at the image presentation frequency (4.286 Hz), as well as category-selective responses by examining responses at the category frequency (0.857 Hz, Fig 1A).”

      With regard to our choice of presentation rate, a previous SSVEP study in 4-6-month-olds by de Heering and Rossion (2015) presented faces and objects at 6 Hz (i.e. 167 ms per image) to study infants’ categorical responses to natural faces relative to objects. Here, we chose a slower presentation rate of 4.286 Hz (i.e. 233 ms per image), so that our infant participants would have more time to process each image yet would still be unlikely to make eye movements across a stimulus. Both de Heering and Rossion (2015) and our study found significant selective responses to faces relative to other categories in 4-6-month-olds, across these presentation rates. As discussed in a recent review of frequency tagging with infants: The visual oddball paradigm (Peykarjou, 2022), there are many factors to consider when adapting SSVEP paradigms to infants. We agree that an interesting direction for future studies is to examine how SSVEP parameters such as stimulus and oddball presentation rate, and overall duration of acquisition, affect the sensitivity of the SSVEP paradigm in infants. We added a discussion point on this on page 12, lines 332-334 where we write: “As using SSVEP to study high-level representations is a nascent field52–54, future work can further examine how SSVEP parameters such as stimulus and target category presentation rate may affect the sensitivity of measurements in infants (see review by54).”
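
      The relationship between the rates and durations quoted above can be checked with a few lines of arithmetic. The variable names are illustrative, and the interpretation of the ratio (one category image per cycle of about five images) is inferred from the two stated frequencies rather than taken from the manuscript.

```python
# Frequencies stated in the response (Hz)
image_rate_hz = 4.286       # image presentation frequency in this study
category_rate_hz = 0.857    # category frequency in this study
comparison_rate_hz = 6.0    # rate used by de Heering and Rossion (2015)

def period_ms(rate_hz):
    """Duration of one cycle, in milliseconds, for a given rate in Hz."""
    return 1000.0 / rate_hz

image_duration_ms = period_ms(image_rate_hz)            # about 233 ms per image
comparison_duration_ms = period_ms(comparison_rate_hz)  # about 167 ms per image
category_cycle_ms = period_ms(category_rate_hz)         # about 1167 ms per cycle

# Ratio of the two tagging frequencies: about 5 images per category cycle
images_per_category_cycle = image_rate_hz / category_rate_hz

print(round(image_duration_ms), round(comparison_duration_ms),
      round(category_cycle_ms), round(images_per_category_cycle))
```

      The category cycle of roughly 1167 ms is close to the 1166.7 ms epoch length mentioned in the preprocessing description, which suggests that one epoch spans one category cycle.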

      - There is no baseline mentioned in the study. How was the baseline considered in the paradigm and data analysis? The baseline is important for evaluating how robust/ reliable the periodic responses within each group are in the first place. It also helps us to see how different the SNR changes in the fast periodic responses from baseline across age groups. Would the results be stable if the response amplitudes were z-scored by a baseline?

      Thanks for the question. Previous studies using a similar frequency tagging paradigm have compared the response amplitude at stimulus-related frequencies to that of neighboring frequency bins, using the latter as a baseline for differentiating signal from noise. We use a more statistically powerful method, the Hotelling’s T2 statistic, to test whether response amplitudes were statistically different from 0 amplitude. Importantly, this method takes into consideration both the amplitude and phase information of the response. That is, a significant response is expected to have consistent phase information across participants as well as significant amplitude.
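
      To illustrate why this test is sensitive to phase consistency, here is a minimal sketch of a one-sample Hotelling's T2 test applied to complex Fourier coefficients, treating the real and imaginary parts as a two-dimensional observation per participant. The function name and the example values are hypothetical; the closed-form p-value relies on the standard fact that the F distribution with 2 numerator degrees of freedom has a closed-form survival function.

```python
import numpy as np

def hotelling_t2_complex(coeffs):
    """One-sample Hotelling's T2 test that the mean complex coefficient is 0.

    coeffs: one complex Fourier coefficient per participant at the
    frequency of interest (needs at least 3 participants).
    Returns (t2, f_stat, p_value).
    """
    z = np.asarray(coeffs, dtype=complex)
    x = np.column_stack([z.real, z.imag])  # shape (n, 2)
    n, p = x.shape
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)          # sample covariance (ddof=1)
    t2 = n * mean @ np.linalg.solve(cov, mean)
    # T2 maps to an F statistic with (p, n - p) degrees of freedom
    f_stat = (n - p) / (p * (n - 1)) * t2
    df2 = n - p
    # Survival function of F(2, df2) in closed form (valid because p == 2)
    p_value = (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)
    return t2, f_stat, p_value

# Consistent amplitude and phase across participants: significant response
consistent = np.array([1 + 1j, 1.1 + 0.9j, 0.9 + 1.1j,
                       1.0 + 1.2j, 1.2 + 1.0j, 0.8 + 0.8j])
t2_c, f_c, p_c = hotelling_t2_complex(consistent)

# Same amplitude but opposing phases: the mean vector is zero, no response
scattered = np.array([1, -1, 1j, -1j])
t2_s, f_s, p_s = hotelling_t2_complex(scattered)
```

      The second example shows the point made above: four participants with identical amplitudes but inconsistent phases yield a mean vector of zero, so the test does not indicate a response.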

      - Statistical inferences: could the variance of data be considered appropriately in your LLM? Why?

      As we have explained in our original manuscript (Methods part, section-Statistical Analyses of Developmental Effects, page 21 lines 611-615): “LMMs allow explicit modeling of both within-subject effects (e.g., longitudinal measurements) and between-subject effects (e.g., cross-sectional data) with unequal number of points per participants, as well as examine main and interactive effects of both continuous (age) and categorical (e.g., stimulus category) variables. We used random-intercept models that allow the intercept to vary across participants (term: 1|participant).” This statistical model is widely used in developmental studies that combine both longitudinal and cross-sectional measurements (e.g. Nordt et al. 2022, 2023; Natu et al. 2021; Grotheer et al. 2022).
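
      For readers less familiar with this class of models, a random-intercept LMM of the kind described can be written schematically as below, where i indexes participants and j indexes that participant's observations. This is a generic textbook form given as an illustration; the exact coding of age and category in the fitted models is not specified here and is left open.

```latex
y_{ij} = \beta_0
       + \beta_{\text{age}}\,\text{age}_{ij}
       + \beta_{\text{category}}\,\text{category}_{ij}
       + \beta_{\text{category}\times\text{age}}\,(\text{category}_{ij}\times\text{age}_{ij})
       + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0,\sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2)
```

      In this form the variance in the data is split into a between-participant component (the variance of the random intercepts u_i, corresponding to the term 1|participant) and a residual within-participant component.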

      - The sampling of the age groups. Why are these age groups considered, as 8-12 months are not considered? Or did the study first go with an equal sampling of the ages from 3 to 15 months? Then how was the age group defined? The log scale of age makes sense for giving a simplified view of the effects, but the sampling procedure could be more detailed.

      Thanks for the question. Our study recruited infants longitudinally for both anatomical MRI and EEG studies. Some of the infants participated in both studies and some only in one of the studies. Infants were recruited at around newborn, 3 months, 6 months, and 12 months. We did not recruit infants between 8-12 months of age because around 9 months there is little contrast between gray and white matter in anatomical MRI scans that were necessary for the MRI study. For the EEG study we binned the subjects by age group such that there were a similar number of participants across age groups to enable similar statistical power. The division of age groups was decided based on the distribution of the infants included in the analyses.

      We have now added the sampling procedure details in the Methods, part, under section: Participants, pages 15-16, lines 440-445: “Sixty-two full-term, typically developing infants were recruited. Twelve participants were part of an ongoing longitudinal study that obtained both anatomical MRI and EEG data in infants. Some of the infants participated in both studies and some only in one of the studies. Infants were recruited at around newborn, 3 months, 6 months, and 12 months. We did not recruit infants between 8-12 months of age because around 9 months there is little contrast between gray and white matter in anatomical MRI scans that were necessary for the MRI study.”

      - 30 Hz cutoff is arbitrary, but it makes sense as most EEG effects can be expected in a lower frequency band than higher. However, this specific choice is interesting and informative, when faced with developmental data and this type of paradigm. Would the results stay robust as the cutoff changes? Would the results benefit from going even lower into the frequency cutoff?

      In the time domain analyses, we chose the 30 Hz cutoff to be consistent with previous EEG studies, including those done with infants. However, our results from the frequency domain (Fig 3, right panel, and supplementary Fig S6-S9) show that there are barely any category-selective responses above about 6 Hz. Therefore, we expect that using a lower frequency cutoff, such as 10 Hz, will not lead to different results.

      More interpretation and discussion:

      - You report the robust visual responses in occipital regions, the responses that differ across age groups, and their characteristics (i.e., peak latency and amplitude) in time curves. This part of the results needs more interpretation to help the data be better situated in the field; I wondered whether this relates to the difference in the signal processing of the information. Could this be the signature of slow recurrence connection development? Or how could this be better interpreted?

      Thanks for the question. Changes in speed of processing can arise from several related reasons including (i) myelination of white matter connections that would lead to faster signal transmission (Lebenberg et al. 2019; Grotheer et al. 2022), (ii) maturation of cortical visual circuits affecting temporal integration time, and (iii) development of feedback connections. Our data cannot distinguish among these different mechanisms. Future studies that combine functional high temporal resolution measurements with structural imaging of tissue properties could elucidate changes in cortical dynamics over development.

      We added this as a discussion point, on page 15 lines 416-420 we write: “For example, we found differences in the temporal dynamics of visual responses among four infant age groups, which suggests that the infant’s visual system is still developing during the first year of life. While underlying maturational mechanisms are yet unknown, they may include myelination and cortical tissue maturation68–73.”

      - The supplementary material includes a detailed introduction to the methods when facing the developing visual acuity, which justifies the choice of the paradigm. I appreciate this thorough explanation. Interestingly, high visual acuity has its potential developmental downside; for instance, low visual acuity would aid in the development of holistic processing associated with face recognition (as discussed by Vogelsang et al., 2018, in PNAS). How do you view this point in relation to the emergence of complex cognitive processes, as here the category-selective responses?

      Thanks for linking this to the Vogelsang (2018) study. Just as faces are processed in a hierarchical manner, starting with low-level features (edges, contours) and progressing to high-level features (identity, expression), other complex visual categories like cars, scenes, and body parts follow similar hierarchies. Early holistic processing could provide a foundation for recognizing objects quickly and efficiently, while feature-based processing might allow for more precise recognition and categorization as acuity increases. Therefore, as visual acuity improves, an infant’s brain can integrate finer details into those holistic representations, supporting more refined and complex cognitive processes. The balance between low- and high-level visual acuity highlights the intricate interplay between sensory processing and cognitive development across various domains.

      Minor points:

      Paradigm:

      - Are the colored cartoon images for motivating infants' fixation counterbalanced across categories in the paradigm? Or how exactly were the cartoon images presented in the paradigm?

      Response: Yes, the small cartoon images that were presented at the center of the screen during stimulus presentation were used to engage infants’ attention and accommodation to the screen. For each condition, they were randomly drawn from a pool of 70 images (23 flowers, 22 butterflies, 25 birds) from categories unrelated to the ones under test. They were presented in random order with durations uniformly distributed between 1 and 1.5 s. We have added these details of the paradigm to the Methods section, page 17, lines 479-481: “To motivate infants to fixate and look at the screen, we presented at the center of the screen small (~1°) colored cartoon images such as butterflies, flowers, and ladybugs. They were presented in random order with durations uniformly distributed between 1 and 1.5 s.”

      Analysis:

      - Are the visual responses over the occipital cortex different across different category conditions in the first place? I guess this should not be different; this probably needs one more supplementary figure.

      The visual responses reflect the responses to images that are randomly drawn from the five stimulus categories at a presentation frequency of 4.286 Hz. The only difference between the five conditions is the order in which the stimuli are presented. Therefore, the visual response over the occipital cortex should not differ across conditions within an age group.

      In the revised manuscript, we have added Supplementary Figure S5, which shows the frequency spectra and the response topographies of the visual response at 4.286 Hz and its first 3 harmonics separately for each condition and age group, and a new Supplementary Materials section: 5. Visual responses over occipital cortex per condition for all age groups. On page 5, lines 116-120, we now write: “Analysis of visual responses in the occipital ROI separately by category condition revealed that visual responses were not significantly different across category conditions (Supplementary Fig S5, no significant main effect of category (βcategory = 0.08, 95% CI: -0.08 – 0.24, t(301) = 0.97, p = .33), or category by age interaction (βcategory x age = -0.04, 95% CI: -0.11 – 0.03, t(301) = -1.09, p = .28), LMM on RMS of response to first three harmonics).”

      - The summary of epochs used for each category for each age group needs to be included; this is important while evaluating whether the effects are due to not having enough data for categories or others.

      This information is provided in the manuscript in the Methods section, page 18 lines 521-524, and in supplementary Table S2. Our analysis shows that there was no significant difference in the number of pre-processed epochs across age groups (F(3,57) = 1.5, p = .2).

      - Numbers of channels of EEG being interpolated should be provided; is that a difference across age groups?

      Thanks for the suggestion. We have now added information about the number of channels interpolated for each age group in the Methods section (page 18, lines 525-528): “The number of electrodes being interpolated for each age group were 10.0 ± 4.8 for 3-4-month-olds, 9.9 ± 3.7 for 4-6-month-olds, 9.9 ± 3.9 for 6-8-month-olds, and 7.7 ± 4.7 for 12-15-month-olds. There was no significant difference in the number of electrodes being interpolated across infant age-groups (F(3,55) = 0.78, p = .51).”

      - I noticed that the removal of EEG artifacts (i.e., muscles and eye-blinks) for data analysis is missing; did the preprocessing pipeline involve any artifacts removing procedures that are typically used in both infants and adults SSVEP data analysis? If so, please provide more information.

      In our analysis, artifact rejection was performed in two steps. First, the continuous filtered data were evaluated according to a sample-by-sample thresholding procedure to locate consistently noisy channels. Channels with more than 20% of samples exceeding a 100-150 μV amplitude threshold were replaced by the average of their six nearest spatial neighbors. Once noisy channels were interpolated in this fashion, the EEG was re-referenced from the Cz reference used during the recording to the common average of all sensors and segmented into epochs (1166.7 ms). Finally, EEG epochs that contained more than 15% of time samples exceeding threshold (150-200 μV) were excluded on a sensor-by-sensor basis. This method is provided in the manuscript under Methods section, page 18 lines 510-516.
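
      The steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the preprocessing code used in the study: the function names, the array layouts, the specific threshold values (150 and 200 μV, picked from the stated ranges), and the choice to draw neighbors only from channels not flagged as noisy are all assumptions of the example.

```python
import numpy as np

def find_noisy_channels(data_uv, threshold_uv, max_fraction=0.20):
    """Indices of channels where more than max_fraction of samples exceed threshold.

    data_uv: continuous data, shape (n_channels, n_samples), in microvolts.
    """
    fraction = (np.abs(data_uv) > threshold_uv).mean(axis=1)
    return np.flatnonzero(fraction > max_fraction)

def interpolate_channels(data_uv, noisy, positions, n_neighbors=6):
    """Replace each noisy channel by the mean of its nearest good neighbors.

    positions: sensor coordinates, shape (n_channels, 3).
    """
    cleaned = data_uv.copy()
    good = np.setdiff1d(np.arange(data_uv.shape[0]), noisy)
    for ch in noisy:
        dist = np.linalg.norm(positions[good] - positions[ch], axis=1)
        nearest = good[np.argsort(dist)[:n_neighbors]]
        cleaned[ch] = data_uv[nearest].mean(axis=0)
    return cleaned

def average_reference(data_uv):
    """Re-reference to the common average of all sensors."""
    return data_uv - data_uv.mean(axis=0, keepdims=True)

def bad_epoch_mask(epochs_uv, threshold_uv, max_fraction=0.15):
    """Boolean mask (n_epochs, n_channels): True where an epoch should be
    excluded for that sensor. epochs_uv: (n_epochs, n_channels, n_samples)."""
    fraction = (np.abs(epochs_uv) > threshold_uv).mean(axis=2)
    return fraction > max_fraction

# Toy example: 8 sensors on a line, channel 3 saturated for half the recording
data = np.zeros((8, 100))
data[3, :50] = 500.0
positions = np.column_stack([np.arange(8.0), np.zeros(8), np.zeros(8)])

noisy = find_noisy_channels(data, threshold_uv=150.0)
cleaned = interpolate_channels(data, noisy, positions)
referenced = average_reference(cleaned)

# Two epochs of 10 samples; epoch 1, sensor 0 has 20% of samples above threshold
epochs = np.zeros((2, 8, 10))
epochs[1, 0, :2] = 300.0
mask = bad_epoch_mask(epochs, threshold_uv=200.0)
```

      Because epochs are rejected on a sensor-by-sensor basis, the mask has one entry per epoch and sensor, so a noisy epoch on one sensor does not remove that epoch for the other sensors.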

      Figure:

      - Supplementary Figure 8. The illustration of the WTA classifier was not referred to anywhere in the main text.

      Thanks for pointing this out. Supplementary Figure 8 should have been labeled Supplementary Figure 10. We now refer to it in the manuscript, page 10, line 267.

      - Figure 5 WTA classifier needed to be clarified. It was correlation-based but used to choose the most correlated response patterns averaged across the N-1 subjects for the leave-one-out subject. The change from correlation coefficients to decoding accuracy could be clearer as I spent some time making sense of it. The correlation coefficient here evaluates how correlated the two vectors are, but the actual decoding accuracy estimated at the end is the percentage of participants who can be assigned to the "ground truth" label, so one step in between is missing. Can this be better illustrated?

      Thanks for surfacing that this is not described sufficiently clearly and for your suggestions. The spatiotemporal vector was calculated separately for each category. This is illustrated in Fig 5A. At each iteration, the LOOCV classifier computed the correlation between each of the five category vectors from the left-out participant (test data, for an unknown stimulus) and each of the mean spatiotemporal vectors across the N-1 participants (training data, labeled data). The winner-take-all (WTA) classifier classifies the test vector to the category that yields the highest correlation with the training vector. This is illustrated in Fig 5A, with spatiotemporal patterns and correlation values from an example infant shown.  For a given test pattern, correct classification yields a score of 1 and an incorrect classification yields a score of 0.  We compute the percentage correct across all categories for each left-out-infant, and then mean decoding performance across all participants in an age group (Fig 5B). We have now added these details in the Methods part, section – Decoding analyses, Group-level, page 20 lines 590-597, where we write: “At each iteration, the LOOCV classifier computed the correlation between each of the five category vectors from the left-out participant (test data, for an unknown stimulus) and each of the mean spatiotemporal vectors across the N-1 participants (training data, labeled data). The winner-take-all (WTA) classifier classifies the test vector to the category of the training vector that yields the highest correlation with the training vector (Fig 5A). For a given test pattern, correct classification yields a score of 1 and an incorrect classification yields a score of 0.  For each left-out infant, we computed the percentage correct across all categories, and then the mean decoding performance across all participants in an age group (Fig 5B).”

      Reviewer #2 (Recommendations For The Authors):

      I only have some minor comments.

      Typo on line 90 ("Infants participants in 5 conditions, which [...]").

      Thanks for pointing this out. We have now corrected ‘participants’ to ‘participated’.

      Typo on lines 330: "[...] in example 4-5-months-olds.".

      Thanks for pointing this out. We changed ‘4-5-months-olds’ to ‘4-5-month-olds’.

      Figure 2 - bar plots: rotating and spacing out values on the x-axis may improve readability. Ditto for the line plots in Figure 4.

      Thanks for the suggestions. In the revised manuscript, we have improved the readability of Figure 2.

      Caption of Figure 6: description of the distinctiveness plots may refer to panel C, instead of the bottom panels of section B.

      Thanks for pointing this out. We have now corrected this information in the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Opioids and related drugs are powerful analgesics that reduce suffering from pain. Unfortunately, their use often leads to addiction and there is an opioid-abuse epidemic that affects people worldwide. This study represents an ongoing effort to develop non-opioid analgesics for pain management. The findings point to an alternative approach to control post-surgical pain in lieu of opioid medications.

      Strengths:

      (1) The study responds to the urgent need for the development of non-opioid analgesics.

      (2) The study demonstrates the efficacy of Clarix Flo (FLO) and HC-HA/PTX3 from the human amniotic membrane (AM) in reducing pain in a mouse model without the adverse effects of opioids.

      (3) The study further explored the underlying mechanisms of how HC-HA/PTX3 produces its effects on neurons, suggesting the molecules/pathways involved in pain relief.

      (4) The potential use of naturally derived biologics from human birth tissues (AM) is safe and sustainable, compared to synthetic pharmaceuticals.

      (5) The study was conducted with scientific rigor, involving purification of active components, comparative analysis with multiple controls, and mechanistic explorations.

      Weaknesses:

      (1) It should be cautioned that while the preclinical findings are promising, these results still need to be translated into clinical settings that are complex and often unpredictable.

      (2) The study shows the efficacy of FLO and HC-HA/PTX3 in one preclinical model of post-surgical pain. The observed effect may be variable in other pain conditions.

      We thank the reviewer for these good comments and support! We agree with your suggestions and have provided more information in the discussion (Pages 11-12) and conclusion to address these comments.

      Reviewer #2 (Public review):

      Summary:

      This is an outstanding piece of work on the potential of FLO as a viable analgesic biologic for the treatment of postsurgical pain. The authors purified the HC-HA/PTX3 from FLO and demonstrated its potential as an effective non-opioid therapy for postsurgical pain. They further unraveled the mechanisms of action of the compound at cellular and molecular levels.

      Strengths:

      Prominent strengths include the incorporation of behavioral assessment, electrophysiological and imaging recordings, the use of knockout and knockdown animals, and the use of antagonist agents to verify biological effects. The integrated use of these techniques, combined with the hypothesis-driven approach and logical reasoning, provides compelling evidence and novel insight into the mechanisms of the significant findings of this work.

      Weaknesses:

      I did not find any significant weaknesses even with a critical mindset. The only minor suggestion is that the Results section may focus on the results from this study and minimize the discussions of background information.

      We thank the reviewer for their support! We revised the Results section as suggested and reduced the discussion of background information.

      Reviewer #3 (Public review):

      Summary:

      Non-opioid analgesics derived from human amniotic membrane (AM) product represents a novel and unique approach to analgesia that may avoid the traditional harms associated with opioids. Here, the study investigators demonstrate that HC-HAPTX3 is the primary bioactive component of the AM product FLO responsible for anti-nociception in mouse-model and in-vitro dorsal root ganglion (DRG) cell culture experiments. The mechanism is demonstrated to be via CD44 with an acute cytoskeleton rearrangement that is induced that inhibits Na+ and Ca++ current through ion channels. Taken together, the studies reported in the manuscript provide supportive evidence clarifying the mechanisms and efficacy of HC-HAPTX3 antinociception and analgesia.

      Strengths:

      Extensive experiments including murine behavioral paw withdrawal latency and Catwalk test data demonstrating analgesic properties. The breadth and depth of experimental data are clearly supporting mechanisms and antinociceptive properties.

      Weaknesses:

      A few changes to the text of the manuscript would be recommended but no major weaknesses were identified.

      We thank the reviewer for your support! We revised these texts as suggested. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The study showed an effect on baseline nociception and acute post-surgical pain. Chronic post-surgical pain is a major problem and should be considered.

      We thank the reviewer for this comment. To further improve the translational potential, we will extend the current findings in future work by employing chronic post-surgical pain models, such as skin/muscle incision and retraction (SMIR) in the thigh of the rodent (1-3), as well as chronic pain models such as neuropathic pain. We acknowledged this limitation in the discussion (Page 12).

      (2) Indicate the source of cultures DRGs.

      We added “Method 15 Culturing DRG neurons” in the revised manuscript.   

      (3) The size of DRG neurons was described in cross-sectional area (Figure 2 caption) and diameter (method). Be consistent.

      We thank the reviewer for this comment. Cross-sectional area has often been used to describe the size of DRG neurons in in vivo calcium imaging studies, including our previous work (4, 5). To remain consistent and keep data comparable between studies, we also used cross-sectional area for the in vivo calcium imaging experiment in Fig 2 of the current study. On the other hand, cell diameter has been widely used for in vitro experiments such as in vitro electrophysiology recording and immunofluorescence staining of cultured DRG neurons. To be consistent with this tradition, we used cell diameter in these experiments. Methods for measuring the area and diameter are explicitly described for each experimental setting, and are consistent between the current study and our previous studies (6). Our previously published studies are also cited in the Methods section of the manuscript (Method “4 In vivo calcium imaging in mice” and “10.2 Intrinsic excitability studies of DRG neurons”).

      (4) Clarify what "% of total" means in Figure 2. For bar graphs in 2B-D, the percent of total activated neurons (small, medium, and large) does not add up to 100.

      “% of total” represents the proportion of activated neurons relative to the total number of neurons counted from the same analyzed image. This information was added to the figure legend of Figure 2 (B-C) and Method “4 In vivo calcium imaging in mice” in the revised manuscript. At the end of each experiment, we overexpose the image to reveal all neuronal profiles and count the total number of neurons in that field/image. Only a small portion of neurons in each size category responded to the test stimulation, and hence the total does not add up to 100.

      (5) Discuss clinical data or human studies to validate the efficacy and safety of FLO or HC-HA/PTX3 in patients.

      Thanks for the great suggestion. We provided a brief discussion (Page 11-12).

      Cryopreserved AM/UC has been clinically validated through several hundred peer-reviewed publications since 1995, including 12 studies specifically assessing FLO (Clarix Flo). These studies collectively support the safety and preliminary effectiveness of Clarix Flo in managing some clinical pain conditions such as knee osteoarthritis(7, 8), discogenic pain (9), rotator cuff tears(10), and painful neuropathy of the lower extremities (11). Currently, HC-HA/PTX3 is limited to pre-clinical research, and to our knowledge, there are no available data on its clinical efficacy and safety.

      (6) Introduction, last sentence of the second paragraph, delete "also".

      Thanks for carefully examining our manuscript. It was revised as suggested.

      Reviewer #2 (Recommendations for the authors):

      My only recommendation for improving the writing and presentation is to shorten the discussion of background information in Results.

      We thank the reviewer for their support and comments! We originally intended to provide some background information to help readers understand the premise and rationale of the study before presenting our findings. Nevertheless, we have reduced the background information in the Results section, as suggested by the reviewer, to make it more straightforward.

      Reviewer #3 (Recommendations for the authors):

      P4 last sentence - "Our findings highlight the potential of a naturally derived biologic from human birth tissues as an effective non-opioid treatment for post-surgical pain and unravel the underlying mechanisms." - another sentence clause is required before "unravel".

      As advised, we revised the sentence to: “Collectively, our findings highlight the potential of naturally derived biologics from human birth tissues as an effective non-opioid treatment for post-surgical pain. Moreover, we unravel the underlying mechanisms of pain inhibition induced by FLO and HC-HA/PTX3.”

      P7 second paragraph - please edit the following sentence for clarity: "Since HC-HA/PTX3 mimics FLO in producing pain inhibition, and it has high purity and is more water-soluble than FLO, making it suitable for probing cellular mechanisms.".

      As advised, we have revised the sentence. “Since HC-HA/PTX3 mimics FLO in its ability to inhibit pain and has higher purity and greater water solubility compared to FLO, it is well-suited for investigating cellular mechanisms.”

      References:

      (1) Flatters SJ. Characterization of a model of persistent postoperative pain evoked by skin/muscle incision and retraction (SMIR). Pain. 2008;135(1-2):119-30.

      (2) Ying YL, Wei XH, Xu XB, She SZ, Zhou LJ, Lv J, et al. Over-expression of P2X7 receptors in spinal glial cells contributes to the development of chronic postsurgical pain induced by skin/muscle incision and retraction (SMIR) in rats. Experimental neurology. 2014;261:836-43.

      (3) Cao S, Bian Z, Zhu X, and Shen SR. Effect of Epac1 on pERK and VEGF Activation in Postoperative Persistent Pain in Rats. Journal of molecular neuroscience : MN. 2016;59(4):554-64.

      (4) Chen Z, Huang Q, Song X, Ford NC, Zhang C, Xu Q, et al. Purinergic signaling between neurons and satellite glial cells of mouse dorsal root ganglia modulates neuronal excitability in vivo. Pain. 2022;163(8):1636-47.

      (5) Chen Z, Zhang C, Song X, Cui X, Liu J, Ford NC, et al. BzATP Activates Satellite Glial Cells and Increases the Excitability of Dorsal Root Ganglia Neurons In Vivo. Cells. 2022;11(15).

      (6) Ford NC, Barpujari A, He SQ, Huang Q, Zhang C, Dong X, et al. Role of primary sensory neurone cannabinoid type-1 receptors in pain and the analgesic effects of the peripherally acting agonist CB-13 in mice. Br J Anaesth. 2022;128(1):159-73.

      (7) Castellanos R, and Tighe S. Injectable Amniotic Membrane/Umbilical Cord Particulate for Knee Osteoarthritis: A Prospective, Single-Center Pilot Study. Pain Med. 2019;20(11):2283-91.

      (8) Mead OG, and Mead LP. Intra-Articular Injection of Amniotic Membrane and Umbilical Cord Particulate for the Management of Moderate to Severe Knee Osteoarthritis. Orthop Res Rev. 2020;12:161-70.

      (9) Buck D. Amniotic Umbilical Cord Particulate for Discogenic Pain. J Am Osteopath Assoc. 2019;119(12):814-9.

      (10) Ackley JF, Kolosky M, Gurin D, Hampton R, Masin R, and Krahe D. Cryopreserved amniotic membrane and umbilical cord particulate matrix for partial rotator cuff tears: A case series. Medicine (Baltimore). 2019;98(30):e16569.

      (11) Buksh AB. Ultrasound-guided injections of amniotic membrane/umbilical cord particulate for painful neuropathy of the lower extremity. Cogent Medicine. 2020;7(1):1724067.

    1. Author response:

      eLife Assessment

      “The work presented is important for our understanding of the development of the cardiac conduction system and its regulation by T-box transcription factors. The conclusions are supported by convincing data. Overall, this is an excellent study that advances our understanding of cardiac biology and has implications beyond the immediate field of study.”

      We appreciate the positive assessment of this work and the recognition of its importance in advancing our understanding of the cardiac conduction system, its regulation by T-box transcription factors, and contribution beyond the immediate field.

      Reviewer #1 (Public review):

      Summary:

      In a heroic effort, Ozanna Burnicka-Turek et al. have made and investigated conduction system-specific Tbx3-Tbx5 deficient mice and investigated their cardiac phenotype. Perhaps according to expectations, given the body of literature on the function of the two T-box transcription factors in the heart/conduction system, the cardiomyocytes of the ventricular conduction system seemed to convert to "ordinary" ventricular working myocytes. As a consequence, loss of VCS-specific conduction system propagation was observed in the compound KO mice, associated with PR and QRS prolongation and elevated susceptibility to ventricular tachycardia.

      Strengths:

      Great genetic model. Phenotypic consequences at the organ and organismal levels are well investigated. The requirement of both Tbx3 and Tbx5 for maintaining VCS cell state has been demonstrated.

      We thank Reviewer #1 for acknowledging the effort involved in generating and characterizing the Tbx3/Tbx5 double conditional knockout mouse model and for highlighting the significance of this work in elucidating the role of these transcription factors in maintaining the functional and transcriptional identity of the ventricular conduction system.

      Weaknesses:

      The actual cell state of the Tbx3/Tbx5 deficient conducting cells was not investigated in detail, and therefore, these cells could well only partially convert to working cardiomyocytes, and may, in reality, acquire a unique state.

      We agree with Reviewer #1 that the Tbx3/Tbx5 double mutant ventricular conduction myocardial cells may only partially convert to working cardiomyocytes or may acquire a unique state.  The transcriptional state of the double mutant VCS cells was investigated by bulk profiling of key genes associated with specific conduction and non-conduction cardiac regions, including fast conduction, slow conduction, or working myocardium. Neither the bulk transcriptional approaches nor the optical mapping approaches we employed capture single-cell data; in both cases, the data represents aggregated signals from multiple cells (1, 2). Single cell approaches for transcriptional profiling and cellular electrophysiology would clarify this concern and are appropriate for future studies.

      (1) O’Shea C, Nashitha Kabri S, Holmes AP, Lei M, Fabritz L, Rajpoot K, Pavlovic D (2020) Cardiac optical mapping – State-of-the-art and future challenges. The International Journal of Biochemistry & Cell Biology 126:105804. doi: 10.1016/j.biocel.2020.105804.

      (2) Efimov IR, Nikolski VP, and Salama G (2004) Optical Imaging of the Heart. Circulation Research 95:21-33. doi: 10.1161/01.RES.0000130529.18016.35.

      Reviewer #2 (Public review):

      Summary:

      The goal of this work is to define the functions of T-box transcription factors Tbx3 and Tbx5 in the adult mouse ventricular cardiac conduction system (VCS) using a novel conditional mouse allele in which both genes are targeted in cis. A series of studies over the past 2 decades by this group and others have shown that Tbx3 is a transcriptional repressor that patterns the conduction system by repressing genes associated with working myocardium, while Tbx5 is a potent transcriptional activator of "fast" conduction system genes in the VCS. In a previous work, the authors of the present study further demonstrated that Tbx3 and Tbx5 exhibit an epistatic relationship whereby the relief of Tbx3-mediated repression through VCS conditional haploinsufficiency allows better toleration of Tbx5 VCS haploinsufficiency. Conversely, excess Tbx3-mediated repression through overexpression results in disruption of the fast-conduction gene network despite normal levels of Tbx5. Based on these data the authors proposed a model in which repressive functions of Tbx3 drive the adoption of conduction system fate, followed by segregation into a fast-conducting VCS and slow-conduction AVN through modulation of the Tbx5/Tbx3 ratio in these respective tissue compartments.

      The question motivating the present work is: If Tbx5/Tbx3 ratio is important for slow versus fast VCS identity, what happens when both genes are completely deleted from the VCS? Is conduction system identity completely lost without both factors and if so, does the VCS network transform into a working myocardium-like state? To address this question, the authors have generated a novel mouse line in which both Tbx5 and Tbx3 are floxed on the same allele, allowing complete conditional deletion of both factors using the VCS-specific MinK-CreERT2 line, convincingly validated in previous work. The goal is to use these double conditional knockout mice to further explore the model of Tbx3/Tbx5 co-dependent gene networks and VCS patterning. First, the authors demonstrate that the double conditional knockout allele results in the expected loss of Tbx3 and Tbx5 specifically in the VCS when crossed with Mink-CreERT2 and induced with tamoxifen. The double conditional knockout also results in premature mortality. Detailed electrophysiological phenotyping demonstrated prolonged PR and QRS intervals, inducible ventricular tachycardia, and evidence of abnormal impulse propagation along the septal aspect of the right ventricle. In addition, the mutants exhibit downregulation of VCS genes responsible for both fast conduction AND slow conduction phenotypes with upregulation of 2 working myocardial genes including connexin-43. The authors conclude that loss of both Tbx3 and Tbx5 results in "reversion" or "transformation" of the VCS network to a working myocardial phenotype, which they further claim is a prediction of their model and establishes that Tbx3 and Tbx5 "coordinate" transcriptional control of VCS identity.

      We appreciate Reviewer #2’s detailed summary of the study’s aims, methodologies, and findings, as well as their thoughtful suggestions for further analysis. We are grateful for their recognition of our genetic model’s novelty and robustness.

      Overall Appraisal:

      As noted above, the present study does not further explore the Tbx5/Tbx3 ratio concept since both genes are completely knocked out in the VCS. Instead, the main claims are that the absence of both factors results in a transcriptional shift of conduction tissue towards a working myocardial phenotype, and that this shift indicates that Tbx5 and Tbx3 "coordinate" to control VCS identity and function.

      We agree with this reviewer’s assessment of the assertions in our manuscript.  The novel combined Tbx5/Tbx3 double mutant model does not further explore the TBX5/TBX3 ratio concept, which we previously examined in detail (1). Instead, as the Reviewer notes, this manuscript focuses on testing a model that the coordinated activity of Tbx3 and Tbx5 defines specialized ventricular conduction identity.

      (1) Burnicka-Turek O, Broman MT, Steimle JD, Boukens BJ, Petrenko NB, Ikegami K, Nadadur RD, Qiao Y, Arnolds DE, Yang XH, Patel VV, Nobrega MA, Efimov IR, Moskowitz IP (2020) Transcriptional Patterning of the Ventricular Cardiac Conduction System. Circulation Research 127:e94-e106. doi:10.1161/CIRCRESAHA.118.314460. 

      Strengths:

      (1) Successful generation of a novel Tbx3-Tbx5 double conditional mouse model.

      (2) Successful VCS-specific deletion of Tbx3 and Tbx5 using a VCS-specific inducible Cre driver line.

      (3) Well-powered and convincing assessments of mortality and physiological phenotypes.

      (4) Isolation of genetically modified VCS cells using flow.

      We thank Reviewer #2 for acknowledging the listed strengths of our study.

      Weaknesses:

      (1) In general, the data is consistent with a long-standing and well-supported model in which Tbx3 represses working myocardial genes and Tbx5 activates the expression of VCS genes, which seem like distinct roles in VCS patterning. However, the authors move between different descriptions of the functional relationship and epistatic relationship between these factors, including terms like "cooperative", "coordinated", and "distinct" at various points. In a similar vein, sometimes terms like "reversion" are used to describe how VCS cells change after Tbx3/Tbx5 conditional knockout, and other times "transcriptional shift" and at other times "reprogramming". But these are all different concepts. The lack of a clear and consistent terminology for describing the phenomena observed makes the overarching claims of the manuscript more difficult to evaluate.

      We distinguish prior work on the “long-standing and well-supported model”, which investigated the roles of Tbx5 and Tbx3 independently, from this work, which examines the coordinated role of Tbx5 and Tbx3. Prior work demonstrated that Tbx3 represses working myocardial genes and Tbx5 activates expression of VCS genes, consistent with the reviewer’s suggestion of their distinct roles in VCS patterning. However, the current study is the first to evaluate the combined role of Tbx3 and Tbx5 in distinguishing specialized conduction identity from working myocardium.

      We appreciate Reviewer #2’s feedback regarding the need for consistent terminology when describing the impact of the double Tbx3 and Tbx5 mutant. We will edit the manuscript to replace terms like “reversion” with “transcriptional shift” or “transformation” when describing the observed phenotype, and we will use “coordination” to describe the combined role of Tbx5 and Tbx3 in maintaining VCS-specific identity.

      (2) A more direct quantitative comparison of Tbx5 Adult VCS KO with Tbx5/Tbx3 Adult VCS double KO would be helpful to ascertain whether deletion of Tbx3 on top of Tbx5 deletion changes the underlying phenotype in some discernable way beyond mRNA expression of a few genes. Superficially, the phenotypes look quite similar at the EKG and arrhythmia inducibility level and no optical mapping data from a single Tbx5 KO is presented for comparison to the double KO.

      We thank Reviewer #2 for the suggestion that a direct comparison between the Tbx5 single conditional knockout and Tbx3/Tbx5 double conditional knockout models may help isolate the specific contribution of Tbx3 deletion in addition to Tbx5 deletion.

      Previous studies have assessed the effect of single Tbx5 CKO in the VCS of murine hearts (1, 3, 5). Arnolds et al. demonstrated that the removal of Tbx5 from the adult ventricular conduction system results in VCS slowing, including prolonged PR and QRS intervals, prolongation of the His duration and His-ventricular (HV) interval (3). Furthermore, Burnicka-Turek et al. demonstrated that the single conditional knockout of Tbx5 in the adult VCS caused a shift toward a pacemaker cell state, with ectopic beats and inappropriate automaticity (1). Whole-cell patch clamping of VCS-specific Tbx5-deficient cells revealed action potentials characterized by a slower upstroke (phase 0), prolonged plateau (phase 2), delayed repolarization (phase 3), and enhanced phase 4 depolarization - features characteristic of nodal action potentials rather than typical VCS action potentials (3). These observations were interpreted as uncovering nodal potential of the VCS in the absence of Tbx5. Based on the role of Tbx3 in CCS specification (2), we hypothesized that the nodal state of the VCS uncovered in the absence of Tbx5 was enabled by maintained Tbx3 expression. This motivated us to generate the double Tbx5 / Tbx3 knockout model to examine the state of the VCS in the absence of both T-box TFs.

      In the current study, we demonstrate that the VCS-specific deletion of Tbx3 and Tbx5 results in the loss of fast electrical impulse propagation in the VCS, similar to that observed in the single Tbx5 mutant. However, unlike the Tbx5 single mutant, the Tbx3/Tbx5 double deletion does not cause a gain of pacemaker cell state in the VCS. Instead, the physiological data suggest a transition toward non-conduction working myocardial physiology. This conclusion is supported by the presence of only a single upstroke in the optical action potential (OAP) recorded from the His bundle region and VCS cells in Tbx3/Tbx5 double conditional knockout mice. The electrical properties of VCS cells in the double knockout are functionally indistinguishable from those of ventricular working myocardial cells. As a result, ventricular impulse propagation is significantly slowed, resembling activation through exogenous pacing rather than the rapid conduction typically associated with the VCS. We will edit the text of the manuscript to more carefully distinguish the observations between these models, as suggested.

      (1) Burnicka-Turek O, Broman MT, Steimle JD, Boukens BJ, Petrenko NB, Ikegami K, Nadadur RD, Qiao Y, Arnolds DE, Yang XH, Patel VV, Nobrega MA, Efimov IR, Moskowitz IP (2020) Transcriptional Patterning of the Ventricular Cardiac Conduction System. Circulation Research 127:e94-e106. doi:10.1161/CIRCRESAHA.118.314460. 

      (2) Mohan RA, Bosada FM, van Weerd JH, van Duijvenboden K, Wang J, Mommersteeg MTM, Hooijkaas IB, Wakker V, de Gier-de Vries C, Coronel R, Boink GJJ, Bakkers J, Barnett P, Boukens BJ, Christoffels VM (2020) T-box transcription factor 3 governs a transcriptional program for the function of the mouse atrioventricular conduction system. Proc Natl Acad Sci U S A. 117:18617-18626. doi: 10.1073/pnas.1919379117.

      (3) Arnolds DE, Liu F, Fahrenbach JP, Kim GH, Schillinger KJ, Smemo S, McNally EM, Nobrega MA, Patel VV, Moskowitz IP (2012) TBX5 drives Scn5a expression to regulate cardiac conduction system function. The Journal of Clinical Investigation 122:2509–2518. doi: 10.1172/JCI62617.

      (4) Frank DU, Carter KL, Thomas KR, Burr RM, Bakker ML, Coetzee WA, Tristani-Firouzi M, Bamshad MJ, Christoffels VM, Moon AM (2012) Lethal arrhythmias in Tbx3-deficient mice reveal extreme dosage sensitivity of cardiac conduction system function and homeostasis. Proc Natl Acad Sci U S A. 109:E154-63. doi: 10.1073/pnas.1115165109.

      (5) Moskowitz IP, Pizard A, Patel VV, Bruneau BG, Kim JB, Kupershmidt S, Roden D, Berul CI, Seidman CE, Seidman JG (2004) The T-Box transcription factor Tbx5 is required for the patterning and maturation of the murine cardiac conduction system. Development 131:4107-4116. doi: 10.1242/dev.01265. PMID: 15289437.

      (3) The authors claim that double knockout VCS cells transform to working myocardial fate, but there is no comparison of gene expression levels between actual working myocardial cells and the Tbx3/Tbx5 DKO VCS cells so it's hard to know if the data reflect an actual cell state change or a more non-specific phenomenon with global dysregulation of gene expression or perhaps dedifferentiation. I understand that the upregulation of Gja1 and Smpx is intended to address this, but it's only two genes and it seems relevant to understand their degree of expression relative to actual working myocardium. In addition, the gene panel is somewhat limited and does not include other key transcriptional regulators in the VCS such as Irx3 and Nkx2-5. RNA-seq in these populations would provide a clearer comparison among the groups.

      And

      the main claims are that the absence of both factors results in a transcriptional shift of conduction tissue towards a working myocardial phenotype, and that this shift indicates that Tbx5 and Tbx3 "coordinate" to control VCS identity and function. However, only limited data are presented to support the claim of transcriptional reprogramming since the knockout cells are not directly compared to working myocardial cells at the transcriptional level and only a small number of key genes are assessed (versus genome-wide assessment).

      We appreciate Reviewer #2’s suggestion to expand the gene expression analysis in Tbx3/Tbx5-deficient VCS cells by including other specific genes and comparisons with “native”/actual working ventricular myocardial cells and broadening the gene panel. In this study, we evaluated core cardiac conduction system markers, revealing a loss of conduction system-specific gene expression in the double mutant VCS. Furthermore, we evaluated key working myocardial markers normally excluded from the conduction system, Gja1 and Smpx, revealing a shift towards a working myocardial state in the double mutant VCS (Figure 4). We agree that a more comprehensive analysis, such as transcriptome-wide approaches, would offer greater clarity on the extent and specificity of the observed shift from conduction to non-conduction identity. These approaches are appropriate directions for future studies.

      (4) From the optical mapping data, it is difficult to distinguish between the presence of (a) a focal proximal right bundle branch block due to dysregulation of gene expression in the VCS but overall preservation of the right bundle and its distal ramifications; from (b) actual loss of the VCS with reversion of VCS cells to a working myocardial fate. Related to this, the authors claim that this experiment allows for direct visualization of His bundle activation, but can the authors confirm or provide evidence that the tissue penetration of their imaging modality allows for imaging of a deep structure like the AV bundle as opposed to the right bundle branch which is more superficial? Does the timing of the separation of the sharp deflection from the subsequent local activation suggest visualization of more distal components of the VCS rather than the AV bundle itself? Additional clarification would be helpful.

      And

      In addition, the optical mapping dataset is incomplete and has alternative interpretations that are not excluded or thoroughly discussed.

      We agree with Reviewer #2 that the resolution of the optical mapping experiment may be insufficient to precisely localize the conduction block due to the limited signal strength from the VCS. It is possible that the region defined as the His Bundle also includes portions of the right bundle branch. Our control mice show VCS OAP upstrokes consistent with those reported by Tamaddon et al. (2000) using Di-4-ANEPPS (1). We appreciate the Reviewer’s attention to alternative interpretations, and we will incorporate these caveats into the manuscript text.

      (1) Tamaddon HS, Vaidya D, Simon AM, Paul DL, Jalife J, Morley GE (2000) High-resolution optical mapping of the right bundle branch in connexin40 knockout mice reveals slow conduction in the specialized conduction system. Circulation Research 87:929-36. doi: 10.1161/01.res.87.10.929. 

      Impact:

      The present study contributes a novel and elegantly constructed mouse model to the field. The data presented generally corroborate existing models of transcriptional regulation in the VCS but do not, as presented, constitute a decisive advance.

      And

      In sum, while this study adds an elegantly constructed genetic model to the field, the data presented fit well within the existing paradigm of established functions of Tbx3 and Tbx5 in the VCS and in that sense do not decisively advance the field. Moreover, the authors' claims about the implications of the data are not always strongly supported by the data presented and do not fully explore alternative possibilities.

      We appreciate Reviewer #2’s acknowledgment of the elegance and novelty of the mouse model we generated. However, we respectfully disagree with their assessment that this work merely corroborates existing models without providing a decisive advance. Previous studies have investigated single Tbx5 or Tbx3 gene knockouts in depth and established the T-box ratio model for distinguishing fast VCS from slow nodal conduction identity (1), to which the reviewer alludes in earlier comments. In contrast, this study aimed to explore a different model: that the combined effects of Tbx5 and Tbx3 distinguish adult VCS identity from non-conduction working myocardium. The coordinated role of Tbx3 and Tbx5 in conduction system identity remained untested due to the lack of a mouse model that allowed their simultaneous removal. The very model the reviewer recognizes as “novel and elegantly constructed” has allowed the examination of the coordinated role of Tbx5 and Tbx3 for the first time. While we acknowledge the opportunity for additional depth of investigation of this model in future studies, the data we present provide consistent experimental support for the coordinated requirement of both Tbx5 and Tbx3 for ventricular cardiac conduction system identity.

      (1) Burnicka-Turek O, Broman MT, Steimle JD, Boukens BJ, Petrenko NB, Ikegami K, Nadadur RD, Qiao Y, Arnolds DE, Yang XH, Patel VV, Nobrega MA, Efimov IR, Moskowitz IP (2020) Transcriptional Patterning of the Ventricular Cardiac Conduction System. Circulation Research 127:e94-e106. doi:10.1161/CIRCRESAHA.118.314460. 

      Reviewer #3 (Public review):

      Summary:

      In the study presented by Burnicka-Turek et al., the authors generated for the first time a mouse model to cause the combined conditional deletion of Tbx3 and Tbx5 genes. This has been impossible to achieve to date due to the proximity of these genes in chromosome 5, preventing the generation of loss of function strategies to delete simultaneously both genes. It is known that both Tbx3 and Tbx5 are required for the development of the cardiac conduction system by transcription factor-specific but also overlapping roles as seen in the common and diverse cardiac defects found in patients with mutations for these genes. After validating the deletion efficiency and specificity of the line, the authors characterized the cardiac phenotype associated with the cardiac conduction system (CCS)-specific combined deletion of Tbx5 and Tbx3 in the adult by inducing the activation of the CCS-specific tamoxifen-inducible Cre recombination (MinK-creERT) at 6 weeks after birth. Their analysis of 8-9-week-old animals did not identify any major morphological cardiac defects. However, the authors found conduction defects including prolonged PR and QRS intervals and ventricular tachycardia causing the death of the double mutants, which do not survive more than 3 months after tamoxifen induction. Molecular and optical mapping analysis of the ventricular conduction system (VCS) of these mutants concluded that, in the absence of Tbx5 and Tbx3 function, the cells forming the ventricular conduction system (VCS) become working myocardium and lose the specific contractile features characterizing VCS cells. Altogether, the study identified the critical combined role of Tbx3 and Tbx5 in the maintenance of the VCS in adulthood.

      Strengths:

      The study generated a new animal model to study the combined deletion of Tbx5 and Tbx3 in the cardiac conduction system. This unique model has provided the authors with the perfect tool to answer their biological questions. The study includes top-class methodologies to assess the functional defects present in the different mutants analyzed, and gathered very robust functional data on the conduction defects present in these mutants. They also applied optical action potential (OAP) methods to demonstrate the loss of conduction action potential and the acquisition of working myocardium action potentials in the affected cells because of Tbx5/Tbx3 loss of function. The study used simpler molecular and morphological analysis to demonstrate that there are no major morphological defects in these mutants and that indeed, the conduction defects found are due to the acquisition of working myocardium features by the VCS cells. Altogether, this study identified the critical role of these transcription factors in the maintenance of the VCS in the adult heart.

      We appreciate the Reviewer’s comments regarding the originality and utility of our model and the strengths of our methodological approach. The Reviewer’s appreciation of the molecular and morphological analyses as well as their constructive feedback is highly valuable.

      Weaknesses:

      In the opinion of this reviewer, the weakness in the study lies in the morphological and molecular characterization. The morphological analysis simply described the absence of general cardiac defects in the adult heart, however, whether the CCS tissues are present or not was not investigated. Lineage tracing analysis using the reporter lines included in the crosses described in the study will determine if there are changes in CCS tissue composition in the different mutants studied. Similarly, combining this reporter analysis with the molecular markers found to be dysregulated by qPCR and western blot, will demonstrate that indeed the cells that were specified as VCS in the adult heart, become working myocardium in the absence of Tbx3 and Tbx5 function.

      We appreciate the reviewer’s concern regarding the morphology of the cardiac conduction system in the Tbx3/Tbx5 double conditional knockout model. We did not observe any structural abnormalities, as the Reviewer notes. We agree with their suggestion for using Genetic Inducible Fate Mapping to mark cardiac conduction cells expressing MinKCre. In fact, we utilized this approach to isolate VCS cells for transcriptional profiling. Specifically, we combined the tamoxifen-inducible MinKCreERT allele with the Cre-dependent R26Eyfp reporter allele to label MinKCre-expressing cells in both control VCS and VCS-specific double Tbx3/Tbx5 knockouts. EYFP-positive cells were isolated for transcriptional studies, ensuring that our analysis exclusively targeted conduction system-lineage marked cells. The ability to isolate MinKCre-marked cells from both controls and Tbx5/Tbx3 double mutants indicates that VCS cells persisted in the double knockout. Nonetheless, the suggestion for in-vivo marking by Genetic Inducible Fate Mapping and morphologic analysis is a valuable recommendation for future studies.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Mutations in CDHR1, the human gene encoding an atypical cadherin-related protein expressed in photoreceptors, are thought to cause cone-rod dystrophy (CRD). However, the pathogenesis leading to this disease is unknown. Previous work has led to the hypothesis that CDHR1 is part of a cadherin-based junction that facilitates the development of new membranous discs at the base of the photoreceptor outer segments, without which photoreceptors malfunction and ultimately degenerate. CDHR1 is hypothesized to bind to a transmembrane partner to accomplish this function, but the putative partner protein has yet to be identified.

      The manuscript by Patel et al. makes an important contribution toward improving our understanding of the cellular and molecular basis of CDHR1-associated CRD. Using gene editing, they generate a loss of function mutation in the zebrafish cdhr1a gene, an ortholog of human CDHR1, and show that this novel mutant model has a retinal dystrophy phenotype, specifically related to defective growth and organization of photoreceptor outer segments (OS) and calyceal processes (CP). This phenotype seems to be progressive with age. Importantly, Patel et al. present intriguing evidence that pcdh15b, also known for causing retinal dystrophy in previous Xenopus and zebrafish loss of function studies, is the putative cdhr1a partner protein mediating the function of the junctional complex that regulates photoreceptor OS growth and stability.

      This research is significant in that it:

      (1) provides evidence for a progressive, dystrophic photoreceptor phenotype in the cdhr1a mutant and, therefore, effectively models human CRD; and

      (2) identifies pcdh15b as the putative, and long sought after, binding partner for cdhr1a, further supporting the theory of a cadherin-based junction complex that facilitates OS disc biogenesis.

      Nonetheless, the study has several shortcomings in methodology, analysis, and conceptual insight, which limit its overall impact.

      Below I outline several issues that the authors should address to strengthen their findings.

      Major comments:

      (1) Co-localization of cdhr1a and pcdh15b proteins

      The model proposed by the authors is that the interaction of cdhr1a and pcdh15b occurs in trans as a heterodimer. In cochlear hair cells, PCDH15 and CDH23 are proposed to interact first as dimers in cis and then as heteromeric complexes in trans. This was not shown here for cdhr1a and pcdh15b, but it is a plausible configuration, as are single heteromeric dimers or homodimers. Regardless, this model depends on the differential compartmental expression of the cdhr1a and pcdh15b proteins. Data in Figure 1 show convincing evidence that these two proteins can, at least in some cases, be distributed along the length of photoreceptor membranes that are juxtaposed, as would be the case for OS and CP. If pcdh15b is predominantly expressed in CPs, whereas cdhr1a is predominantly expressed in OS, then this should be confirmed by double labeling of actin with cdhr1a and with pcdh15b, since the apicobasally oriented (vertical) CPs would express actin in this same orientation but the OS would not. This would help to clarify whether cdhr1a and pcdh15b can be trafficked to both OS and CP compartments or whether they are mutually exclusive.

      First let me thank the reviewer for taking the time to comprehensively evaluate our work and provide constructive criticism which will improve the quality of our final version.

      To address this issue, we are undertaking imaging of actin/cdhr1a and actin/pcdh15b using SIM in both transverse and axial sections. Additionally, we have recently established an immuno-gold-TEM protocol and are going to provide data showcasing co-labeling of cdhr1a and pcdh15b at TEM resolution.

      Photoreceptor heterogeneity goes beyond the cone versus rod subtypes discussed here and it is known that in zebrafish, CP morphology is distinct in different cone subtypes as well as cone versus rod. It would be important to know which specific photoreceptor subtypes are shown in zebrafish (Figures 1A-C) and the non-fish species depicted in Figures 1E-L. Also, a larger field of view of the staining patterns for Figures 1E-L would be a helpful comparison (could be added as a supplementary figure).

      The revised manuscript will include clear labeling of the different cone cell types as well as lower magnification images to be included as supplemental figures.

      (2) Cdhr1a function in cell culture

      The authors should explain the multiple bands in the anti-FLAG blots. Also, it would be interesting to confirm that the cdhr1a D173 mutant prevents the IP interaction with pcdh15b as well as the additive effects in aggregate assays of Figure 2.

      We believe that the D173 mutation results in no cdhr1a polypeptide, based on the lack of in situ signal in our WISH studies (figures showing absence of cdhr1a mRNA will be provided in a new supplemental figure). However, we will clone the D173 mutant and attempt co-IP with pcdh15b in our cell culture system as well as the aggregation assay using K562 cells.

      Is it possible that the cultured cells undergo proliferation in the aggregation assays shown in Figure 2? Cells might differentially proliferate as clusters form in rotating cultures. A simple assay for cell proliferation under the different transfection conditions showing no differences would address this issue and lend further support to the proposed specific changes to cell adhesion as a readout of this assay.

      This is a possibility; however, we did not use rotating cultures. These were monolayer cultures, and we did not observe any differences in total cell number between the different transfections. As such, we do not feel that proliferation explains the aggregation of K562 cells.

      Also, the authors report that the number of clusters was normalized to the field of view, but this was not defined. Were the n values different fields of view from one transfection experiment, or were they different fields of view from separate transfection experiments? More details and clarification are needed.

      This will be clarified in the revised manuscript. In short, we replicated this experiment 3 times, quantifying 5 different fields of view in each replicate.

      (3) Methodological issues in quantification and statistical analyses

      Were all the OS and CP lengths counted in the observation region or just a sample within the region? If the latter, what were the sampling criteria? For CPs, it seems that the length was an average estimate based on all CPs observed surrounding one cone or one rod cell. Is this correct? Again, if sampled, how was this implemented? In Fig 4M', the cdhr1a-/- ROS mostly looks curvilinear. Did the measurements account for this, or were they straight linear dimension measurements from base to tip of the OS as depicted in Fig 5A-E? A clearer explanation of the OS and CP length quantification methodology is required.

      The revised manuscript will clearly outline measurement methods. In short, we measured every CP/OS in the imaged regions. We did not average CPs per cell; we simply included all CP measurements in our analysis. All CP measurements (actin, cdhr1a, or pcdh15) were made in the presence of a counterstain (WGA, prph2, gnb1, or PNA), which served as a landmark to ensure accurate measurement and association with the correct cell type.

      All measurements were taken as best as possible to reflect a straight linear dimension for consistency.

      How were cone and rod photoreceptor cell counts performed? The legend in Figure 4 states that they again counted cells in the observation region, but no details were provided. For example, were cones and rods counted as an absolute number of cells in the observation region (e.g., number of cones per defined area) or relative to total (DAPI+) cell nuclei in the region? Changes in cell density in the mutant (smaller eye or thinner ONL) might affect this quantification so it would be important to know how cell quantification was normalized.

      The revised manuscript will clearly outline measurement methods. In short, rod and cone cell counts were based on the number of outer segments that were observed in the imaging region and previously measured for length. We did not observe any eye size differences in our mutant fish.

      In Figure 6I, K, measuring the length of the signal seems problematic. The dimension of staining is not always in the apicobasal (vertical) orientation. It might be more accurate to measure the cdhr1a expression domain relative to the OS (since the length of the OS is already reduced in the mutants). Another possible approach could be to measure the intensity of cdhr1 staining relative to the intensity within a Prph2 expression domain in each group. The authors should provide complementary evidence to support their conclusion.

      The revised manuscript will clearly outline measurement methods. In short, all of our CP measurements (actin, cdhr1a, or pcdh15) were made in the presence of a counterstain (WGA, prph2, gnb1, or PNA) to ensure accurate measurement and association with the correct cell type.

      A better description of the statistical methodology is required. For example, the authors state that "each of the data points has an n of 5+ individuals." This is confusing and could indicate that in Figure 4F alone there were ~5000 individuals assayed (~100 data points per treatment group x n=5 individuals per data point x 10 treatment groups). I don't think that is what the authors intended. It would be clearer if the authors stated how many OS, CP, or cells were counted in their observation region averaged per individual, and then provided the n value of individuals used per treatment group (controls and mutants), on which the statistical analyses should be based.

      This will be addressed in the revised manuscript. In short, we analyzed n = 5 individual fish for each genotype and time point. We will also include the numbers of OS/CP quantified in the observation regions.

      There are hundreds of data points in the separate treatment groups shown in several of the graphs. It would not be correct to perform the ANOVA on the separate OS or CP length measurements alone as this will bias the estimates since they are not all independent samples. For example, in Figure 6H, 5dpf pcdh15b+/- have shorter CPs compared to WT but pcdh15b-/- have longer compared to WT. This could be an artifact of the analysis. Moreover, the authors should clarify in the Methods section which ANOVA post hoc tests were used to control for multiple pairwise comparisons.

      This will be clarified in the revised manuscript.
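
      The reviewer's point about non-independent measurements can be illustrated with a minimal sketch. It assumes that each outer segment length is recorded together with the fish it came from, collapses the measurements to one mean per fish, and then computes a one-way ANOVA F statistic on those per-fish means. The function names, the row layout, and the numbers are hypothetical and are not taken from the manuscript; this is not the authors' analysis.

```python
from collections import defaultdict
from statistics import mean


def per_individual_means(measurements):
    """Collapse (group, fish_id, length) rows to one mean length per fish."""
    by_fish = defaultdict(list)
    for group, fish_id, length in measurements:
        by_fish[(group, fish_id)].append(length)
    by_group = defaultdict(list)
    for (group, _fish_id), lengths in sorted(by_fish.items()):
        by_group[group].append(mean(lengths))
    return dict(by_group)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of group -> list of values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Hypothetical data: 2 genotypes x 3 fish x 2 outer segments per fish.
rows = [
    ("WT", "f1", 10.0), ("WT", "f1", 12.0),
    ("WT", "f2", 11.0), ("WT", "f2", 13.0),
    ("WT", "f3", 12.0), ("WT", "f3", 14.0),
    ("mut", "f4", 7.0), ("mut", "f4", 9.0),
    ("mut", "f5", 8.0), ("mut", "f5", 10.0),
    ("mut", "f6", 9.0), ("mut", "f6", 11.0),
]
fish_means = per_individual_means(rows)
f_stat, df_b, df_w = one_way_anova_f(fish_means)
```

      With the fish as the unit of analysis, the within-group degrees of freedom in this toy data set are 4 (six fish minus two groups) rather than 10 (twelve measurements minus two groups), which is the inflation the reviewer warns about. A mixed-effects model with fish as a random effect would be another way to respect the same structure.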

      (4) Cdhr1a function in photoreceptors

      The cdhr1a IHC staining in 5dpf WT larvae in Figure 3E appears different from the cdhr1a IHC staining in 5dpf WT larvae in Figure 1A or Figure 6I. Perhaps this is just the choice of image. Can the authors comment or provide a more representative image?

      The image in Figure 3E was captured using an earlier protocol without antigen retrieval, which limits the resolution of the cdhr1a signal along the CP. In the revised manuscript we will include an image that better represents cdhr1a staining in the WT and mutant.

      The authors show that pcdh15b localization after 5dpf mirrored the disorganization of the CP observed with actin staining. They also show in Figure 5O that at 180dpf, very little pcdh15b signal remains. They suggest based on this data that total degradation of CPs has occurred in the cdhr1a-/- photoreceptors by this time. However, although reduced in length, COS and cone CPs are still present at 180dpf (Figure 5E, E'). Thus, contrary to the authors' general conclusion, it is possible that the localization, trafficking, and/or turnover of pcdh15b is maintained through a cdhr1a-dependent mechanism, irrespective of the degree to which CPs are maintained. The experiments presented here do not clearly distinguish between a requirement for maintenance of localization versus a secondary loss of localization due to defective CPs.

      We agree; this point will be addressed in our revised manuscript.

      (5) Conceptual insights

      The authors claim that cdhr1a and pcdh15b double mutants have synergistic OS and CP phenotypes. I think this interpretation should be revisited.

      First, assuming the model of cdhr1a-pcdh15b interaction in trans is correct, the authors have not adequately explained the logic of why disrupting one side of this interaction in a single mutant would not give the same severity of phenotype as disrupting both sides of this interaction in a double mutant.

      Second, and perhaps more critically, at 10dpf the OS and CP lengths in cdhr1a-/- mutants (Figure 7J, T) are significantly increased compared to WT. In contrast, there are no significant differences in these measurements in the pcdh15b-/- mutants. Yet in double homozygous mutants, there is a significant reduction of ~50% in these measurements compared to WT. A synergistic phenotype would imply that each mutant causes a change in the same direction and that the magnitude of this change is beyond additive in the double mutants (but still in the same direction). Instead, I would argue that the data presented in Figure 7 suggest that there might be a functionally antagonistic interaction between cdhr1a and pcdh15b with respect to OS and CP growth at 10dpf.

      If these proteins physically interacted in vivo, it would appear that the interaction is complex and that this interaction underlies both OS growth-promoting and growth-restraining (stabilizing) mechanisms working in concert. Perhaps separate homodimers or heterodimers subserve distinct CP-OS functional interactions. This might explain the age-dependent differences in mutant CP and OS length phenotypes if these mechanisms are temporally dynamic or exhibit distinct OS growth versus maintenance phases. Regardless of my speculations, the model presented by the authors appears to be too simplistic to explain the data.

      We agree with the reviewer and will revisit this conclusion in our revised manuscript. To do so, we will revise our final model to allow more flexibility in the proposed mechanisms.
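
      The reviewer's working definition of synergy (both single mutants shift the measurement in the same direction, and the double mutant shifts it further than the sum of the two) can be written as a small check. The sketch below is only an illustration of that definition: the function name, the tolerance, and the example values are hypothetical, and treating a non-significant difference as exactly zero is a simplifying assumption.

```python
def classify_interaction(wt, single_a, single_b, double, tol=1e-9):
    """Classify a double-mutant phenotype relative to the two single mutants.

    All arguments are mean measurements (for example OS length) on the same scale.
    """
    effect_a = single_a - wt
    effect_b = single_b - wt
    effect_double = double - wt
    # Synergy requires every effect to point the same way relative to WT.
    same_direction = effect_a * effect_b > 0 and effect_a * effect_double > 0
    if not same_direction:
        return "not synergistic"
    # Same direction: compare the double-mutant effect with the additive expectation.
    if abs(effect_double) > abs(effect_a + effect_b) + tol:
        return "synergistic"
    return "additive or less"


# Hypothetical pattern resembling the reviewer's description at 10dpf:
# one single mutant longer than WT, the other unchanged, the double about half of WT.
pattern_10dpf = classify_interaction(wt=10.0, single_a=12.0, single_b=10.0, double=5.0)

# Hypothetical pattern that would satisfy the definition of synergy.
pattern_synergy = classify_interaction(wt=10.0, single_a=9.0, single_b=9.0, double=5.0)
```

      Under this definition, the first pattern is classified as not synergistic because the single-mutant effects do not share a direction with the double-mutant effect, which is the core of the reviewer's objection.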

      Reviewer #2 (Public review):

      Summary:

      The goal of this study was to develop a model for CDHR1-based cone-rod dystrophy and study the role of this cadherin in cone photoreceptors. Using genetic manipulation, a cell binding assay, and high-resolution microscopy, the authors find that, as in rods, cones localize CDHR1 to the lateral edge of outer segment (OS) discs, where it closely apposes PCDH15b, which is known to localize to calyceal processes (CPs). Ectopic expression of CDHR1 and PCDH15b in K562 cells indicates these cadherins promote cell aggregation as heterophilic interactants, but not through homophilic binding. These data suggest a model where CDHR1 and PCDH15b link OS and CPs and potentially stabilize cone photoreceptor structure. Mutation analysis of each cadherin results in cone structural defects at late larval stages. While pcdh15b homozygous mutants are lethal, cdhr1 mutants are viable and subsequently show photoreceptor degeneration by 3-6 months.

      Strengths:

      A major strength of this research is the development of an animal model to study the cone-specific phenotypes associated with CDHR1-based CRD. The data supporting CDHR1 (OS) and PCDH15 (CP) binding is also a strength, although this interaction could be better characterized in future studies. The quality of the high-resolution imaging (at the light and EM levels) is outstanding. In general, the results support the conclusions of the authors.

      Weaknesses:

      While the cellular phenotyping is strong, the functional consequences of CDHR1 disruption are not addressed. While this is not the focus of the investigation, such analysis would raise the impact of the study overall. This is particularly important given some of the small changes observed in OS and CP structure. While statistically significant, are the subtle changes biologically significant? Examples include cone OS length (Figures 4F, 6E) as well as other morphometric data (Figure 7I in particular). Related, for quantitative data and analysis throughout the manuscript, more information regarding the number of fish/eyes analyzed as well as cells per sample would provide confidence in the rigor. The authors should also note whether the analysis was done in an automated and/or masked manner.

      First let me thank the reviewer for taking the time to comprehensively evaluate our work and provide constructive criticism which will improve the quality of our final version.

      The revised manuscript will clearly outline both the methods and the statistics used for quantitation of our data (please see our responses to reviewer 1). While we do not include direct evidence of the mechanism of CDHR1 function, we do propose that its role is important in anchoring the CP and the OS, particularly in the cones, while in rods it may serve to regulate the release of newly formed disks (as previously proposed in mice). We do plan to test both of these hypotheses directly; however, that will be the basis of our future studies.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Patel et al investigates the hypothesis that CDHR1a on photoreceptor outer segments is the binding partner for PCDH15 on the calyceal processes, and the absence of either adhesion molecule results in separation between the two structures, eventually leading to degeneration. PCDH15 mutations cause Usher syndrome, a disease of combined hearing and vision loss. In the ear, PCDH15 binds CDH23 to form tip links between stereocilia. The vision loss is less understood. Previous work suggested PCDH15 is localized to the calyceal processes, but the expression of CDH23 is inconsistent between species. Patel et al suggest that CDHR1a (formerly PCDH21) fulfills the role of CDH23 in the retina.

      The experiments are mainly performed using the zebrafish model system. Expression of Pcdh15b and Cdhr1a protein is shown in the photoreceptor layer through standard confocal and structured illumination microscopy. The two proteins co-IP and can induce aggregation in vitro. Loss of either Cdhr1a or Pcdh15, or both, results in degeneration of photoreceptor outer segments over time, with cones affected primarily.

      The idea of the study is logical given the photoreceptor diseases caused by mutations in either gene, the comparisons to stereocilia tip links, and the protein localization near the outer segments. The work here demonstrates that the two proteins interact in vitro and are both required for ongoing outer segment maintenance. The major novelty of this paper would be the demonstration that Pcdh15 localized to calyceal processes interacts with Cdhr1a on the outer segment, thereby connecting the two structures. Unfortunately, the data presented are inadequate proof of this model.

      Strengths:

      The in vitro data to support the ability of Pcdh15b and Cdhr1a to bind is well done. The use of pcdh15b and cdhr1a single and double mutants is also a strength of the study, especially being that this would be the first characterization of a zebrafish cdhr1a mutant.

      Weaknesses:

      (1) The imaging data in Figure 1 is insufficient to show the specific localization of Pcdh15 to calyceal processes or Cdhr1a to the outer segment membrane. The addition of actin co-labelling with Pcdh15/Cdhr1a would be a good start, as would axial sections. The division into rod and cone-specific imaging panels is confusing because the two cell types are in close physical proximity at 5 dpf, but the cone Cdhr1a expression is somehow missing in the rod images. The SIM data appear to be disrupted by chromatic aberration but also have no context. In the zebrafish image, the lines of Pcdh15/Cdhr1a expression would be 40-50 µm in length if the scale bar is correct, which is much longer than the outer segments at this stage and therefore hard to explain.

      First let me thank the reviewer for taking the time to comprehensively evaluate our work and provide constructive criticism which will improve the quality of our final version.

      To address this issue, we are undertaking imaging of actin/cdhr1a and actin/pcdh15b using SIM in both transverse and axial sections. Additionally, we have recently established an immuno-gold-TEM protocol and are going to provide data showcasing co-labeling of cdhr1a and pcdh15b at TEM resolution. We are also going to include lower magnification images to complement the SIM images presented in figure 1.

      (2) Figure 3E staining of Cdhr1a looks very different from the staining in Figure 1. It is unclear what the authors are proposing as to the localization of Cdhr1a. In the lab's previous paper, they describe Cdhr1a as being associated with the connecting cilium and nascent OS discs, and they fail to address how that reconciles with the new model of mediating CP-OS interaction. It is also unclear whether Cdhr1a localizes to discrete domains on the disc edges, where it interacts with Pcdh15 on individual calyceal processes.

      The image in Figure 3E was captured using an earlier protocol without antigen retrieval, which limits the resolution of the cdhr1a signal along the CP. In the revised manuscript we will include an image that better represents cdhr1a staining in the WT and mutant.

      (3) The authors state "In PRCs, Pcdh15 has been unequivocally shown to be localized in the CPs". However, the immunostaining here does not match the pattern seen in the Miles et al 2021 paper, which used a different antibody. Both showed loss of staining in pcdh15b mutants, so it is unclear how to reconcile the two patterns.

      We agree that our staining appears different, but we attribute this to our antigen retrieval protocol which differed from the Miles et al paper. We also point to the fact that pcdh15b localization has been shown to be similar to our images in other species (monkey and frog). As such, we believe our protocol reveals the proper localization pattern which might be lost/hampered in the procedure used in Miles et al 2021.

      (4) The explanation for the CRISPR targets for cdhr1a and the diagram in Figure 3 does not fit with crRNA sequences or the mutation as shown. The mutation spans from the latter part of exon 5 to the initial portion of exon 6, removing intron 5-6. It should nevertheless be a frameshift mutation but requires proper documentation.

      This was an error overlooked during figure preparation. We apologize and will correct it in the revised manuscript.

      (5) There are complications with the quantification of data. First, the number of fish analyzed for each experiment is not provided, nor is the justification for performing statistics on individual cell measurements rather than using averages for individual fish. Second, all cone subtypes are lumped together for analysis despite their variable sizes. Third, t-tests are inappropriately used for post-hoc analysis of ANOVA calculations.

      As we discussed for reviewers 1 and 2, all methods and quantification/statistics will be clearly described in the revised manuscript.
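
      To make concrete why uncorrected pairwise t-tests are a concern after an ANOVA, the sketch below counts the pairwise comparisons among k groups and shows how the chance of at least one false positive grows with them. The Bonferroni threshold is used here only as the simplest correction to write down; it is not the procedure used by the authors, and the familywise formula assumes independent tests, which pairwise comparisons sharing groups are not. The choice of 10 groups follows reviewer 1's reading of Figure 4F.

```python
from math import comb


def pairwise_comparisons(n_groups):
    """Number of distinct pairs among n_groups treatment groups."""
    return comb(n_groups, 2)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** n_comparisons


n_pairs = pairwise_comparisons(10)               # 10 groups -> 45 pairs
per_test_alpha = bonferroni_alpha(0.05, n_pairs)
uncorrected_fwer = familywise_error_rate(0.05, n_pairs)
```

      With 45 uncorrected comparisons at a threshold of 0.05, the familywise error rate under the independence assumption is roughly 0.9, which is why a dedicated post hoc procedure such as Tukey's or Dunnett's test is normally reported instead.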

      (6) Unclear how calyceal process length is being measured. The cone measurements are shown as starting at the external limiting membrane, which is not equivalent to the origin of calyceal processes, and it is uncertain what defines the apical limit given the multiple subtypes of cones. In Figure 5, the lines demonstrating the measurements seem inconsistently placed.

      As we discussed for reviewers 1 and 2, all methods and quantification/statistics will be clearly described in the revised manuscript.

      (7) The number of fish analyzed by TEM and the prevalence of the phenotype across cells are not provided. A lower magnification view would provide context. Also, the authors should explain whether or not overgrowth of basal discs was observed, as seen previously in cdhr1-null frogs (Carr et al., 2021).

      The revised manuscript will include the aforementioned stats and lower magnification images. We will also compare our results directly to Carr 2021.

      (8) The statement describing the separation between calyceal processes and the outer segment in the mutants is not backed up by the data. TEM or co-labelling of the structures in SIM could be done to provide evidence.

      We will work to include more TEM and co-labeling data in the revised manuscript (see our responses to reviewer 1).

      (9) "Based on work in the murine model and our own observations of rod CPs, we hypothesize that zebrafish rod CPs only extend along the newly forming OS discs and do not provide structural support to the ROS." Unclear how murine work would support that conclusion given the lack of CPs in mice, or what data in the manuscript supports this conclusion.

      In the revised manuscript we will improve our discussion of murine CPs, noting that we still detect the juxtaposition of cdhr1 and pcdh15 along a potential remnant of the CP, as previously described in SEM studies. Our findings do not indicate that mice or rats have CPs; we simply wanted to point out that the behavior of cdhr1 and pcdh15 remains conserved despite the absence of long traditional CPs.

      (10) The authors state "from the fact that rod CPs are inherently much smaller than cone CPs" without providing a reference. In the manuscript, the measurements do show rod CPs to be shorter, but there are errors in the cone measurements, and it is possible that the RPE pigment is interfering with the rod measurements.

      We will include a reference where rod CPs have been found to be shorter (monkey and frog data). We have no doubt that in zebrafish the rod CPs are significantly shorter. All our CP measurements are done with a counter stain for rods and cones to be sure that we are measuring the correct cell type.

      (11) The discussion should include a better comparison of the results with ocular phenotypes in previously generated pcdh15 and cdhr1 mutant animals.

      In the revised manuscript we will include this in our discussion.

      (12) The images in panels B-F of the Supplemental Figure are uncannily similar, possibly even of the same fish at different focal planes.

      We assure the reviewer that each of the images in Supplemental Figure 1 is distinct and represents a different in situ experiment.

    1. Author response:

      We thank the reviewers for the positive and constructive feedback on our manuscript. We appreciate you highlighting the importance of our work in advancing our understanding of HIV latency and viral reactivation. The reviewers had mostly minor comments that we are in the process of addressing by completing additional experiments that are responsive to reviewer comments as well as some clarification of the text. These include:

      (1) The impact of INTS12 knockout on cell viability.

      We did not see an effect of the knockout of INTS12 on cell viability in the flow cytometry gating of live/dead cells, nor a gross difference in cell proliferation. However, we will test cell viability and proliferation more quantitatively and include this data in the revision.

      (2) The effect of INTS12 knockout on additional LRAs.

      There is published data that the Integrator complex inhibits HIV reactivation via additional LRAs that we will better highlight in the revision. In addition, we have data that we did not include in the original submission suggesting that INTS12 knockout affects the degree of HIV reactivation with additional LRAs. We will confirm these results and include the data in the revision.

      (3) Extend the discussion on how exquisitely sensitive HIV transcription is to pausing and transcriptional elongation and the insights this provides about general HIV transcriptional regulation.

      Yes, we agree with this and will extend the discussion in this manner. We will also include additional data that we recently obtained that further emphasizes this point.

      (4) Comparison to another CRISPR screen using the same library (Hsieh et al., PLOS Pathogens, 2023).

      Indeed, INTS12 was one of the hits in the previous paper (Hsieh et al., 2023) but was not specifically described or validated in that paper. We will point that out in the revision. Also, the Hsieh et al. paper already described the library in more detail, but we will include additional text in the revision to emphasize that it casts a wide net on processes involved in transcriptional regulation.

      (5) We made a mistake in the numbering of the supplemental figures, which led to some misunderstanding. We will correct this and incorporate the reviewers' other suggestions for clarification.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      D'Oliviera et al. have demonstrated cleavage of human TRMT1 by the SARS-CoV-2 main protease in vitro. They then solved the structure of Mpro (Nsp5)-C145A bound to a TRMT1 substrate peptide, revealing a binding conformation distinct from that of most viral substrates. Overall, this work enhances our understanding of substrate specificity for a key drug target of CoV2. The paper is well-written and the data is clearly presented. It complements the companion article by demonstrating interaction between Mpro and TRMT1, as well as TRMT1 cleavage under isolated conditions in vitro. They show that cleaved TRMT1 has reduced tRNA binding affinity, linking a functional consequence to TRMT1 cleavage by Mpro. Importantly, the finding of flexible substrate binding by Nsp5 is fundamental for understanding Nsp5 as a drug target. TRMT1 cleavage assays with Mpro revealed kinetics similar to those for the nsp8/9 viral polyprotein cleavage site. They purify TRMT1-Q350K, in which there is a mutation in the predicted cleavage consensus sequence, and confirm that it is resistant to cleavage by recombinant Mpro. I am unable to comment critically on the structural analyses as it is outside of my expertise. Overall, I think that these findings are important for confirming TRMT1 as a substrate of Mpro, defining substrate binding and cleavage parameters for an important drug target of SARS-CoV-2, and may be of interest to researchers studying RNA modifications.

      We thank the reviewer for their positive assessment and summary of our work in this paper!

      Reviewer #2 (Public review):

      Summary:

      The manuscript 'Recognition and Cleavage of Human tRNA Methyltransferase TRMT1 by the SARS-CoV-2 Main Protease' from Angel D'Oliviera et al. uncovers that TRMT1 can be cleaved by SARS-CoV-2 main protease (Mpro) and defines the structural basis of TRMT1 recognition by Mpro. They use both recombinant TRMT1 and Mpro as well as endogenous TRMT1 from HEK293T cell lysates to convincingly show cleavage of TRMT1 by the SARS-CoV-2 protease. Using in vitro assays, the authors demonstrate that TRMT1 cleavage by Mpro blocks its enzymatic activity leading to hypomodification of RNA. To understand how Mpro recognizes TRMT1, they solved a co-crystal structure of Mpro bound to a peptide derived from the predicted cleavage site of TRMT1. This structure revealed important protein-protein interfaces and highlights the importance of the conserved Q530 for cleavage by Mpro. They then compare their structure with previous X-ray crystal structures of Mpro bound to substrate peptides derived from the viral polyprotein and propose the concept of two distinct binding conformations to Mpro: P3´-out and P3´-in conformations (here P3´ stands for the third residue downstream of the cleavage site). It remains unknown what is the physiological role of these two binding conformations on Mpro function, but the authors established that Mpro has dramatically different cleavage efficiencies for three distinct substrates. In an effort to rationalize this observation, a series of mutations in Mpro's active site and the substrate peptide were tested but unexpectedly had no significant impact on cleavage efficiency. While molecular dynamics simulations further confirmed the propensity of certain substrates to adopt the P3´-out or P3´-in conformation, they did not provide additional insights into the dramatic differences in cleavage efficiencies between substrates. This led the authors to propose that the discrimination of Mpro for preferred substrates might occur at a later stage of catalysis after binding of the peptide. Overall, this work will be of interest to biologists studying proteases and substrate recognition by enzymes and RNA modifications as well as help efforts to target Mpro with peptide-like drugs.

      We thank the reviewer for this thorough and accurate summary of our work in this manuscript.

      Strengths:

      • The authors' statements are well supported by their data, and they used relevant controls when needed. Indeed, they used the Mpro C145A inactive variant to unambiguously show that the TRMT1 cleavage detected in vitro is solely due to Mpro's activity. Moreover, they used two distinct polyclonal antibodies to probe TRMT1 cleavage.

      • They demonstrate the impact of TRMT1 cleavage on RNA modification by quantifying both its activity and binding to RNA.

      • Their 1.9 Å crystal structure is of high quality and increases the confidence in the reported protein-protein contacts seen between TRMT1-derived peptide and Mpro.

      • Their extensive in vitro kinetic assay was performed in ideal conditions although it is sometimes unclear how many replicates were performed.

      • They convincingly show how Mpro cleavage is conserved among most but not all mammalian TRMT1, bringing an interesting evolutionary perspective on virus-host interactions.

      • The authors test multiple hypotheses to rationalize the preference of Mpro for certain substrates.

      • While this reviewer is not able to comment on the rigor of the MD simulations, the interpretations made by the authors seem reasonable and convincing.

      • The concept of two binding conformations (P3´-out or P3´-in) for the substrate in the active site of Mpro is significant and can guide drug design.

      We thank the reviewer for these positive assessments of manuscript strengths!

      Weaknesses:

      • The two polyclonal antibodies used by the authors seem to have strong non-specific binding to proteins other than TRMT1 but did not impact the authors' conclusions or statements. This is a limitation of the commercially available antibodies for TRMT1.

      Yes, there are some levels of non-specific binding for all of the TRMT1 antibodies we have tested (this limitation of commercially available TRMT1 antibodies is also observed and noted by Zhang et al), but we agree that this does not impact the overall conclusions and that by using multiple different antibodies to show the same effects, we can have high confidence in the Western blot analysis and interpretation.

      • Despite the reasonable efforts of the authors, it remains unknown why Mpro shows higher cleavage efficiency for the nsp4/5 sequence compared to TRMT1 or nsp8/9 sequences. This is a challenging problem that will take substantially more effort by several labs to decipher mechanistically.

      True! To our knowledge, and despite significant past efforts of many research groups studying similar coronavirus proteases (e.g. SARS-CoV-1 Mpro), the detailed mechanistic relationship between cleavage sequence and cleavage kinetics remains mostly undefined. This is a great and important problem for mechanistic and computational groups with deep interests in proteases to tackle in the future! To highlight these and similar open questions, we have added a short paragraph to the Discussion section (second from the last paragraph).

      • The peptide cleavage kinetic assay used by the authors relies on a peptide labelled with a fluorophore (MCA) on the N-terminus and a quencher (Dnp) on the C-terminus. This design allows high-throughput measurements compatible with plate readers and is a robust and convenient tool. Nevertheless, the authors did not control for the impact of the labels (MCA and Dnp) on the activity of Mpro. While in most cases the introduced fluorophore/quencher does not impact activity, sometimes it can.

      Yes, we agree that it is possible the MCA and Dnp labels could have effects on the measured cleavage rates. These fluorophore/quencher peptide cleavage assays are the standard assays used by many labs in the protease field to study diverse proteases and diverse cleavage targets. When other labs have compared cleavage kinetic parameters measured with fluorophore/quencher-based peptide cleavage assays versus HPLC-based peptide cleavage assays, these are often found to be quite similar (e.g. Lee, J., Worrall, L.J., Vuckovic, M. et al. Crystallographic structure of wild-type SARS-CoV-2 main protease acyl-enzyme intermediate with physiological C-terminal autoprocessing site. Nat Commun 11, 5877 (2020). https://doi.org/10.1038/s41467-020-19662-4), although there are also examples where differences arise. In any case, we agree there could be some effects on the cleavage kinetics introduced by the fluorophore and/or quencher groups. However, our main focus in this paper is to show how a sequence in the human tRNA-modifying enzyme TRMT1 is cleaved by Mpro (and in this revision we have also added new data to show the functional effects of cleavage on TRMT1 activity); it will take significant future work to fully dissect the detailed relationships between peptide sequence, including the quantitative effects of fluorophore/quencher labels, and protease-directed cleavage kinetics. Based on our work in this paper and many past studies of similar proteases, understanding how peptide sequence or conformation relates to cleavage efficiency is a longer-term and very challenging problem that we view as beyond the scope of this work. We have added a brief section elaborating on this in the Discussion.

      • An unanswered question not addressed by the authors is if the peptides undergo conformational changes upon Mpro binding or if they are pre-organized to adopt the P3´-out and P3´-in conformations. This might require substantially more work outside the scope of this immediate article.

      We agree this is unanswered; we considered additional MD experiments to address this, but ultimately decided that since both of these sequences are cleaved in the context of much larger polypeptides (FL TRMT1 or the viral polypeptide), any simple analysis to assess the possibility of pre-organization and relate this preferred binding conformation to cleavage kinetics would be difficult to interpret in a biologically meaningful way. We think this and similar questions about how pre-organization of peptides or amino acid sequences in the polypeptides might influence protease binding and cleavage activity are interesting and important future questions for protease-focused groups in this field.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors have used a combination of enzymatic, crystallographic, and in silico approaches to provide compelling evidence for substrate selectivity of SARS-CoV-2 Mpro for human TRMT1.

      Strengths:

      In my opinion, the authors came close to achieving their intended aim of demonstrating the structural and biochemical basis of Mpro catalysis and cleavage of human TRMT1 protein. The revised version of the manuscript has addressed most of the questions I had posed in my earlier review.

      We thank the reviewer for their positive assessment of this work, and we are glad to hear the manuscript revisions were helpful in addressing the first round of reviews and questions.

      Weaknesses:

      Although several new hypotheses are generated from the Mpro structural data, the manuscript falls a bit short of testing them in functional assays, which would have solidified the conclusions the authors have drawn.

      Toward showing some of the functional effects of TRMT1 cleavage, in this revised version of the manuscript we have added new data and a new results section (‘Cleavage of TRMT1 results in complete loss of tRNA m2,2G modification activity and reduced tRNA binding in vitro’) showing that cleavage of TRMT1 results in reduced tRNA binding to TRMT1 (Figure 2D) and the complete loss of TRMT1-mediated tRNA modification activity in vitro (Figure 2C). This complements the in-cell data presented by Zhang et al showing that cleavage of TRMT1 in SARS-CoV-2 infected human cells results in the reduction of m2,2G modification levels. We think these data are a strong addition to this paper that broadens the impacts of our reported results more directly into the RNA modifications field.

      In terms of showing the further, downstream biological effects of TRMT1 cleavage and/or the specific impacts of TRMT1 cleavage on SARS-CoV-2 propagation and replication, while we agree further functional assays could absolutely heighten the overall impact, we view the main focus of our paper as showing how TRMT1 is recognized and cleaved by Mpro at the structural level and characterizing the biochemistry of the TRMT1-Mpro interaction and the effects of cleavage on TRMT1 tRNA-modifying activity. Zhang et al present some cellular data suggesting that loss of TRMT1 and/or TRMT1 cleavage during infection is actually detrimental to SARS-CoV-2 replication and infectivity. However, a full understanding of how TRMT1-mediated m2,2G modification of tRNA impacts viral translation, whether TRMT1 plays other roles during the viral life cycle, or whether TRMT1 cleavage (even if not important for viral fitness) contributes to cellular phenotypes during infection, will take a significant amount of future cell biology and virology work to unravel. Indeed, our understanding is that characterizing some of the endogenous cleavage targets for the HIV protease and determining the downstream biological effects and impacts on HIV infection took well over a decade. We hope that the biochemical and structural characterization of the Mpro-TRMT1 interaction presented in our paper will provide the necessary fundamental groundwork and impetus for future virology and cellular biochemistry studies to further investigate the biological roles of TRMT1 cleavage by SARS-CoV-2 Mpro.

      ---

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This manuscript provides important structural insights into the recognition and degradation of the host tRNA methyltransferase by SARS-CoV-2 protease nsp5 (Mpro). The data convincingly support the main conclusions of the paper. These results will be of interest to researchers studying structures and substrate recognition and specificity of viral proteases.

      We thank the eLife editors and reviewers for handling this manuscript and the overall positive assessment of our work.

      In this revised version of the manuscript we have included significant, new experimental data with recombinant purified, catalytically active TRMT1 that directly shows cleavage of TRMT1 reduces its tRNA binding affinity (by gel shift assays) and results in the complete loss of tRNA modifying activity in vitro (by radiolabel-based methyltransferase assays). Because these added experiments provide new information about how Mpro-mediated cleavage specifically impacts TRMT1 tRNA binding and m2,2G modification activity, and thus new information about the functional effects of loss of the TRMT1 Zn finger domain, we would strongly suggest adding that “this work may be of interest to researchers studying RNA modifications”, or a similar phrase, in the eLife assessment.

      Please find below our point-by-point response to each of the reviewer comments, which outlines additional changes to the manuscript.

      Public Reviews:

      Reviewer #1 (Public Review):

      D'Oliviera et al. have demonstrated cleavage of human TRMT1 by the SARS-CoV-2 main protease in vitro. Following this, they solved the structure of Mpro-C145A bound to TRMT1 substrate peptide, revealing a binding conformation distinct from most viral substrates. Overall, this work enhances our understanding of substrate specificity for a key drug target of CoV2. The paper is well-written and the data is clearly presented. It complements the companion article by demonstrating the interaction between Mpro and TRMT1 and TRMT1 cleavage under isolated conditions in vitro. Importantly, the revelation of flexible substrate binding of Nsp5 is fundamental for understanding Nsp5 as a drug target. TRMT1 cleavage assays revealed similar kinetics for TRMT1 cleavage as compared to the nsp8/9 viral polyprotein cleavage site; however, it would have been more rigorous for the authors to independently reproduce the kinetics reported for nsp8/9 using their specific experimental conditions. The finding that murine TRMT1 lacks a conserved consensus sequence is interesting, but is not experimentally tested here and is reported elsewhere. I am unable to comment critically on the structural analyses as it is outside of my expertise. Overall, I think that these findings are important for confirming TRMT1 as a substrate of Mpro and defining substrate binding and cleavage parameters for an important drug target of SARS-CoV-2.

      We thank the reviewer for their positive assessment and summary of our work in this paper!

      We absolutely agree that comparing to nsp8/9 cleavage kinetics measured in our own hands would be more rigorous here, and we have carried out these measurements in triplicate under the same conditions as were used to measure all the other peptide cleavage kinetics in this manuscript. Figures 5A & B (as well as Table S3 and Dataset S2) have been updated with our new nsp8/9 kinetic data (kcat = 0.019 ± 0.002 s⁻¹ and KM = 40 ± 7.5 µM). As expected, our newly measured nsp8/9 kinetic parameters are very similar to those that we had previously cited from MacDonald et al (kcat = 0.013 ± 0.001 s⁻¹, KM = 36 ± 6.0 µM), and show that Mpro-mediated TRMT1 peptide cleavage has similar proteolysis kinetics to the nsp8/9 viral polypeptide cleavage site.
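      For readers who want to compare the two nsp8/9 parameter sets on a single scale, the short sketch below converts each kcat and KM pair quoted above into a catalytic efficiency (kcat/KM). It is an illustration only, not part of the authors' analysis: the function name is hypothetical, and the uncertainties are combined in quadrature on the assumption that the kcat and KM errors are independent, which the response does not state.

```python
import math


def catalytic_efficiency(kcat, kcat_err, km_uM, km_err_uM):
    """Return (kcat/KM, propagated error), both in M^-1 s^-1.

    kcat is in s^-1 and KM is in micromolar. The relative errors are
    added in quadrature, which assumes the two uncertainties are
    independent (an assumption made for this sketch).
    """
    km_molar = km_uM * 1e-6  # convert micromolar to molar
    efficiency = kcat / km_molar
    relative_err = math.sqrt((kcat_err / kcat) ** 2 + (km_err_uM / km_uM) ** 2)
    return efficiency, efficiency * relative_err


# Values quoted in the response for the nsp8/9 peptide.
this_work = catalytic_efficiency(0.019, 0.002, 40.0, 7.5)
macdonald = catalytic_efficiency(0.013, 0.001, 36.0, 6.0)

for label, (eff, err) in (("this work", this_work), ("MacDonald et al", macdonald)):
    print(f"nsp8/9 ({label}): kcat/KM = {eff:.0f} +/- {err:.0f} M^-1 s^-1")
```

      Under these assumptions both efficiencies fall in the range of a few hundred M⁻¹ s⁻¹ and their propagated error ranges overlap, in line with the statement that the two data sets are very similar.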

      We have also purified full-length human TRMT1 Q530K, which is the key change in the cleavage consensus sequence that likely makes murine TRMT1 resistant to Mpro-mediated cleavage. In in vitro cleavage assays we find that indeed TRMT1 Q530K is entirely resistant to cleavage by recombinant Mpro and we have added this data to the manuscript in Figure 6D. These findings are consistent with previously cited data from Lu et al, which suggest mouse and hamster TRMT1 are not cleaved in HEK293T cells expressing Mpro.

      With the addition of the TRMT1 Q530K mutant data, we decided to move the evolutionary analysis together with this kinetic data to a new section in the Results. We think these additions and changes make the paper stronger and clearer, and thank the reviewer for these suggestions!

      Reviewer #2 (Public Review):

      Summary:

      The manuscript 'Recognition and Cleavage of Human tRNA Methyltransferase TRMT1 by the SARS-CoV-2 Main Protease' from Angel D'Oliviera et al., uncovers that TRMT1 can be cleaved by SARS-CoV-2 main protease (Mpro) and defines the structural basis of TRMT1 recognition by Mpro. They use both recombinant TRMT1 and Mpro as well as endogenous TRMT1 from HEK293T cell lysates to convincingly show cleavage of TRMT1 by the SARS-CoV-2 protease. To understand how Mpro recognizes TRMT1, they solved a co-crystal structure of Mpro bound to a peptide derived from the predicted cleavage site of TRMT1. This structure revealed important protein-protein interfaces and highlights the importance of the conserved Q530 for cleavage by Mpro. They then compared their structure with previous X-ray crystal structures of Mpro bound to substrate peptides derived from the viral polyprotein and proposed the concept of two distinct binding conformations to Mpro: P3´-out and P3´-in conformations (here P3´ stands for the third residue downstream of the cleavage site). It remains unknown what is the physiological role of these two binding conformations on Mpro function, but the authors established that Mpro has dramatically different cleavage efficiencies for three distinct substrates. In an effort to rationalize this observation, a series of mutations in Mpro's active site and the substrate peptide were tested but unexpectedly had no significant impact on cleavage efficiency. While molecular dynamic simulations further confirmed the propensity of certain substrates to adopt the P3´-out or P3´-in conformation, they did not provide additional insights into the dramatic differences in cleavage efficiencies between substrates. This led the authors to propose that the discrimination of Mpro for preferred substrates might occur at a later stage of catalysis after binding of the peptide. 
Overall, this work will be of interest to biologists studying proteases and substrate recognition by enzymes as well as help efforts to target Mpro with peptide-like drugs.

      We thank the reviewer for this thorough and accurate summary of our work in this manuscript.

      Strengths:

      • The authors' statements are well supported by their data, and they used relevant controls when needed. Indeed, they used the Mpro C145A inactive variant to unambiguously show that the TRMT1 cleavage detected in vitro is solely due to Mpro's activity. Moreover, they used two distinct polyclonal antibodies to probe TRMT1 cleavage.

      • Their 1.9 Å crystal structure is of high quality and increases the confidence in the reported protein-protein contacts seen between TRMT1-derived peptide and Mpro.

      • Their extensive in vitro kinetic assay was performed in ideal conditions although it is unclear how many replicates were performed.

      • The authors test multiple hypotheses to rationalize the preference of Mpro for certain substrates.

      • While this reviewer is not able to comment on the rigor of the MD simulations, the interpretations made by the authors seem reasonable and convincing.

      • The concept of two binding conformations (P3´-out or P3´-in) for the substrate in the active site of Mpro is significant and can guide drug design.

      We thank the reviewer for these positive assessments of manuscript strengths!

      Weaknesses:

      • While the authors convincingly show that TRMT1 is cleaved by Mpro, the exact cleavage site was never confirmed experimentally. It is most likely that the predicted site is the main cleavage site as proposed by the authors (region 527-534). Nevertheless, in Fig 1C (first lane from the right) there are two bands clearly observed for the cleavage product containing the MT Domain. If the predicted site was the only cleavage site recognized by Mpro, then a single band for the MT domain would be expected. This observation suggests that there might be two cleavage sites for Mpro in TRMT1. Indeed, residues RFQANP (550-555) in TRMT1 might be a secondary weaker cleavage site for Mpro, which would explain the two observed bands in Fig 1C. A mass spectrometry analysis of the cleaved products would clarify this.

      We agree with the reviewer that based on the originally presented data it is possible there could be an additional Mpro-targeted cleavage site in TRMT1 beyond the 527-534 region that we validated through peptide cleavage assays of the TRMT1 526-536 peptide. Because it may be difficult to unambiguously identify and differentiate other putative cleavage sites that are nearby to 527-534 (e.g. the suggested possibility of 550-555) by mass spectrometry, we instead carried out additional in vitro cleavage assays with purified FL TRMT1 Q530K. Mutation of the invariant P1 Gln residue in the cleavage sequence is expected to prevent cleavage at this site, and allow us to probe whether there are other sites in TRMT1 that can be cleaved by Mpro (and if so, more straightforwardly identify them by mass spectrometry). We compared cleavage of purified WT FL TRMT1 and FL TRMT1 Q530K with recombinant Mpro in in vitro cleavage assays and found that TRMT1 Q530K is not cleaved by Mpro over the course of a 2h cleavage reaction. In these experiments, we also saw clear cleavage of WT FL TRMT1 over the course of 2h into only a single detectable band. Together, both of these pieces of data strongly suggest that the 527-534 region is the only Mpro-targeted cleavage site in TRMT1 (if there was an additional cleavage site, we should have seen some amount of cleavage in the Q530K mutant, but we do not). Overall, we feel that the updated WT and Q530K experiments clearly demonstrate that there is only one Mpro-mediated cleavage site in human TRMT1, which also is consistent with experiments in Zhang et al showing that Q530N mutations also block TRMT1 cleavage by co-expressed Mpro in human cells.

      The updated WT and Q530K cleavage assays have been added to the manuscript in Figure 6D.

      • A control is missing in Fig 1D. Since the authors use western blots to show the gradual degradation of endogenous TRMT1, a control with a protein that does not change in abundance over the course of the measurement is important. This is required to show that the differences in intensity of TRMT1 by western blotting are not due to loading differences etc.

      Yes, we agree this is an important control and have repeated these experiments and blotted for TRMT1 and GAPDH as a loading control. The updated Western blots are now shown in Figure 2B, and show the same result as the older data.

      • The two polyclonal antibodies used by the authors seem to have strong non-specific binding to proteins other than TRMT1 but did not impact the authors' conclusions. This is a limitation of the commercially available antibodies for TRMT1, and unless the authors select a new monoclonal antibody specific to TRMT1 (costly and lengthy process), this limitation seems out of their control.

      Yes, there are some levels of non-specific binding for all of the TRMT1 antibodies we have tested (this limitation of commercially available TRMT1 antibodies is also observed and noted by Zhang et al), but we agree that this does not impact the overall conclusions and that by using multiple different antibodies to show the same effects, we can have high confidence in the Western blot analysis and interpretation.

      • The recombinantly purified TRMT1 seems to have some non-negligible impurities (extra bands in Fig 1C). This does not impact the conclusions of the authors but might be relevant to readers interested in working with TRMT1 for biochemical, structural, or other purposes.

      Yes, our initial isolations of recombinant TRMT1 for the first version of this paper produced smaller amounts of TRMT1 with some impurities; we agree that these do not impact the conclusions of the cleavage experiments. However, since our first submission, we have optimized our purification protocols for TRMT1 and are now able to obtain larger quantities of higher purity recombinant human TRMT1 from bacterial cells and we have used this material for the TRMT1 activity and tRNA binding assays added in this revision; we have also included updates to the expression and purification section for recombinant TRMT1. We hope that these improvements will be helpful to readers interested in working on TRMT1.

      • Despite the reasonable efforts of the authors, it remains unknown why Mpro shows higher cleavage efficiency for the nsp4/5 sequence compared to TRMT1 or nsp8/9 sequences.

      True! To our knowledge, and despite significant past efforts of many research groups studying similar coronavirus proteases (e.g. SARS-CoV-1 Mpro), the detailed mechanistic relationship between cleavage sequence and cleavage kinetics remains mostly undefined. This is a great and important problem for mechanistic and computational groups with deep interests in proteases to tackle in the future! To highlight these and similar open questions, we have added a short paragraph to the Discussion section (second from the last paragraph).

      • The peptide cleavage kinetic assay used by the authors relies on a peptide labelled with a fluorophore (MCA) on the N-terminus and a quencher (Dnp) on the C-terminus. This design allows high-throughput measurements compatible with plate readers and is a robust and convenient tool. Nevertheless, the authors did not control for the impact of the labels (MCA and Dnp) on the activity of Mpro. It is possible that the differences in cleavage efficiencies between peptides are due to unexpected conformational changes in the peptide upon labelling. Moreover, the TRMT1 peptide has an E at the N-terminus and an R at the C-terminus (while the nsp4/5 peptide has an S and M, respectively). It is possible that these two terminal residues form a salt bridge in the TRMT1 peptide that might constrain the conformation of the peptide and thus reduce its accessibility and cleavage by Mpro. Enzymatic assays in the absence of labels and MD simulations with the bona fide peptides (including the labels) used in the kinetic measurements are needed to prove that the cleavage efficiencies are not biased by the fluorescence assay.

      These fluorophore/quencher peptide cleavage assays are the standard assays used by many labs in the protease field to study diverse proteases and diverse cleavage targets. When other labs have compared cleavage kinetic parameters measured with fluorophore/quencher-based peptide cleavage assays versus HPLC-based peptide cleavage assays, these are often found to be quite similar (e.g. Lee, J., Worrall, L.J., Vuckovic, M. et al. Crystallographic structure of wild-type SARS-CoV-2 main protease acyl-enzyme intermediate with physiological C-terminal autoprocessing site. Nat Commun 11, 5877 (2020). https://doi.org/10.1038/s41467-020-19662-4), although there are also examples where differences arise. In any case, we agree there could be some effects on the cleavage kinetics introduced by the fluorophore and/or quencher groups or sequence-specific conformational preferences of the peptides. However, because our main focus in this paper is to show how a sequence in the human tRNA-modifying enzyme TRMT1 is cleaved by Mpro (and in this revision we have also added new data to show the functional effects of cleavage on TRMT1 activity), and the broad focus of our lab is understanding the mechanisms controlling the function and activity of RNA-modifying enzymes, we will leave it to other labs focused more specifically on protease biochemistry to fully dissect the detailed relationships among peptide sequence, conformation, and protease-directed cleavage kinetics. As discussed above, based on our work in this paper and many past studies of similar proteases, understanding how sequence relates to cleavage efficiency is a longer-term and very challenging problem that we view as beyond the scope of this work. As noted above, we have added a brief section explaining this in the Discussion.

      • The authors used the A531S variant in the TRMT1-derived peptide to disrupt the P3´-in conformation. While this reviewer agrees with the rationale behind the A531S design, it is important to confirm experimentally that the mutation disrupted the P3´-in conformation in favor of the P3´-out conformer. The authors could use their MD simulations to determine if the TRMT1 A531S variant favors the P3´-out conformation.

      Thank you for this suggestion; we agree and have carried out the suggested MD simulations with TRMT1 A531S peptides bound to Mpro. Surprisingly, these simulations suggest that the A531S peptide can still readily adopt the P3´-in conformation by orienting the Ser sidechain in a different way as compared to its positioning in the Mpro-nsp4/5 structure. Since this somewhat changes our interpretation of the results of the A531S kinetic experiments, we have rewritten this section of the manuscript by: (a) removing the ‘TRMT1 mutations predicted to alter peptide binding conformation have little effect on cleavage kinetics’ section in the Results, (b) instead adding several sentences talking about the A531S mutation to the previous section of the results, and including this mutation as another example of how mutations to either Mpro or TRMT1 residues that might be expected to impact cleavage kinetics do not in fact affect cleavage rates, and finally (c) adding the new MD simulation results to the A531S kinetic data in Figure S5 in the Supporting Information. We thank the reviewer for suggesting this important follow-up simulation!

      • An unanswered question not addressed by the authors is if the peptides undergo conformational changes upon Mpro binding or if they are pre-organized to adopt the P3´-out and P3´-in conformations.

      We agree this is unanswered; we considered additional MD experiments to address this, but ultimately decided that since both of these sequences are cleaved in the context of much larger polypeptides (FL TRMT1 or the viral polypeptide), any simple analysis to assess the possibility of pre-organization and relate this preferred binding conformation to cleavage kinetics would be difficult to interpret in a biologically meaningful way. We think this and similar questions about how pre-organization of peptides or amino acid sequences in the polypeptides might influence protease binding and cleavage activity are interesting and important future questions for protease-focused groups in this field.

      • While the authors describe at great length the hydrogen bonds involved in the substrate recognition by Mpro, they did not highlight important stacking interactions in this interface. For instance, Phe533 from TRMT1 stacks with Met49 while L529 from TRMT1 packs against His41 of Mpro. Both hydrogen bonding and stacking interactions seem important for TRMT1-derived peptide recognition by Mpro.

      Thank you for these suggestions toward additional structural analysis. We have added a short description of L529 packing in the S2 pocket to the main text and Figure S3B. We have also added a short description of F533 packing in the S3’ pocket to the main text and Figure S3C.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors have used a combination of enzymatic, crystallographic, and in silico approaches to provide compelling evidence for substrate selectivity of SARS-CoV-2 Mpro for human TRMT1.

      Strengths:

      In my opinion, the authors came close to achieving their intended aim of demonstrating the structural and biochemical basis of Mpro catalysis and cleavage of human TRMT1 protein. The combination of orthogonal approaches is highly commendable.

      We thank the reviewer for their positive assessment of this work!

      Weaknesses:

      It would have been of high scientific impact if the consequences of TRMT1 cleavage by Mpro on cellular metabolism were provided. Furthermore, assays to investigate the effect of inhibition of this Mpro activity on SARS-CoV-2 propagation and infection would have been extremely useful in providing insights into host- SARS-CoV-2 interactions.

      Toward showing some of the consequences of TRMT1 cleavage, in this revised version of the manuscript we have added new data and a new results section (‘Cleavage of TRMT1 results in complete loss of tRNA m2,2G modification activity and reduced tRNA binding in vitro’) showing that cleavage of TRMT1 results in reduced tRNA binding to TRMT1 (Figure 2D) and the complete loss of TRMT1-mediated tRNA modification activity in vitro (Figure 2C). This complements the in-cell data presented by Zhang et al showing that cleavage of TRMT1 in SARS-CoV-2 infected human cells results in the reduction of m2,2G modification levels. We think these data are a strong addition to this paper that broadens the impacts of our reported results more directly into the RNA modifications field.

      In terms of showing the further, downstream biological effects of TRMT1 cleavage and/or the specific impacts of TRMT1 cleavage on SARS-CoV-2 propagation and replication, while we agree this would absolutely heighten the overall impact, we view the main focus of our paper as showing how TRMT1 is recognized and cleaved by Mpro at the structural level and characterizing the biochemistry of the TRMT1-Mpro interaction and the effects of cleavage on TRMT1 tRNA-modifying activity. Zhang et al present some cellular data suggesting that loss of TRMT1 and/or TRMT1 cleavage during infection is actually detrimental to SARS-CoV-2 replication and infectivity. However, a full understanding of how TRMT1-mediated m2,2G modification of tRNA impacts viral translation, whether TRMT1 plays other roles during the viral life cycle, or whether TRMT1 cleavage (even if not important for viral fitness) contributes to cellular phenotypes during infection, will take a significant amount of future cell biology and virology work to unravel. Indeed, our understanding is that characterizing some of the endogenous cleavage targets for the HIV protease and determining the downstream biological effects and impacts on HIV infection took well over a decade. We hope that the biochemical and structural characterization of the Mpro-TRMT1 interaction presented in our paper will provide the necessary fundamental groundwork and impetus for future virology and cellular biochemistry studies to further investigate the biological roles of TRMT1 cleavage by SARS-CoV-2 Mpro.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please list Mpro alias Nsp5 in the Abstract and Introduction, as this is the nomenclature used in the companion article.

      OK, we have made these changes.

      Reviewer #2 (Recommendations For The Authors):

      In addition to the points mentioned in the public review, this reviewer encourages the authors to address the following points:

      • Citation 14 is important for this work since the authors used multiple structures from that earlier study for comparison. Citation 14 seems outdated since it refers to a preprint that has been published since then in Nat Comm. The authors should cite the peer-reviewed work https://pubmed.ncbi.nlm.nih.gov/35729165/

      Thank you, we have updated this reference.

      • The description of the hydrogen bonds is tedious to read. The authors could instead classify them into two groups. Hydrogen bonds between main chain backbones or hydrogen bonds between side chains. For instance, they mention the contact between Mpro Glu166-TRMT1 Arg528. This can lead to confusion that a salt bridge is formed while these two residues interact only via their main chain backbones. Indeed, the side chain of R528 is exposed to the solvent.

      OK, we have taken this suggestion and tried to simplify and clarify this portion of the text (along with the accompanying structure Figure 3 showing key hydrogen bonds; see below).

      • For Figure 2, please label the residues of the peptide with the TRMT1 numbering. This will help the reader to follow the text while looking at the figure.

      OK, we have added the TRMT1 numbering to what is now Figure 3A, and labeled key TRMT1 residues in Figures 3B, C, and D.

      • Fig 2B is important but crowded. The authors could use two panels to show two different views of this interface.

      Thank you for this suggestion, we have split B (now C and D in Figure 3) into two panels, rotated 90 degrees from one another, with each view showing a different subset of TRMT1-Mpro interactions. These updated panels are less crowded, and will hopefully be much clearer to readers.

      • For increased clarity, the authors could color P3´-out in orange and P3´-in teal in Fig 3D.

      OK, we have made this change.

      • Please proofread the method section. There should be a space between values and their units. For example, 20mM HEPES should be 20 mM HEPES.

      Thank you, we have corrected these formatting errors in the methods section of the revised version of the manuscript.

      • The authors did not identify the mechanism for the higher efficiency of nsp4/5 cleavage despite testing several mutants and MD simulations. Did the authors consider changes in the network of water molecules that might be identified in the MD simulations?

      We did look at the positioning of waters in nsp4/5 vs nsp8/9 vs TRMT1 MD simulations. In the nsp4/5 simulation we do see a slightly higher density of water molecules positioned at approximately reasonable attack angles for substrate hydrolysis. If we consider water molecules with an attack angle on the scissile amide of 82 – 96 degrees and an attack distance of 4 Å or closer, the probabilities for these conditions in the simulations are: nsp4/5 – 19%, nsp8/9 – 9%, TRMT1 – 6%. More water positioned at reasonable attack positions for nsp4/5 might be consistent with its higher cleavage efficiency, but: (a) these are relatively small differences in water positioning across these 3 Mpro-substrate simulations that would not be enough to clearly explain the large differences in observed kinetics, and (b) hydrolysis happens in the later steps of the catalytic cycle, so to accurately capture this we would likely need to simulate reaction intermediates formed after initial attack of the active site Cys.
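      The kind of geometric filter described above can be sketched as follows. This is only an illustrative sketch, not the analysis code used for the manuscript: the function names, the input layout, and the choice to count a frame as "attack-ready" when at least one water satisfies both criteria are assumptions, as is defining the attack angle as the water O···C=O angle at the scissile carbonyl carbon.

```python
import math

def attack_geometry(water_o, carbonyl_c, carbonyl_o):
    """Return (distance in angstroms, angle in degrees) for a water oxygen
    approaching a carbonyl carbon. The angle is measured at the carbon,
    between the C->O(water) and C->O(carbonyl) vectors (assumed definition)."""
    v_w = [w - c for w, c in zip(water_o, carbonyl_c)]
    v_o = [o - c for o, c in zip(carbonyl_o, carbonyl_c)]
    dist = math.sqrt(sum(x * x for x in v_w))
    norm_o = math.sqrt(sum(x * x for x in v_o))
    cos_t = sum(a * b for a, b in zip(v_w, v_o)) / (dist * norm_o)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding outside [-1, 1]
    return dist, math.degrees(math.acos(cos_t))

def fraction_attack_ready(frames, angle_range=(82.0, 96.0), max_dist=4.0):
    """frames: list of (carbonyl_c, carbonyl_o, [water_o, ...]) per snapshot.
    Returns the fraction of frames in which at least one water meets both
    the distance and the angle criterion (hypothetical counting convention)."""
    if not frames:
        raise ValueError("no frames supplied")
    hits = 0
    for carbonyl_c, carbonyl_o, waters in frames:
        for water_o in waters:
            dist, angle = attack_geometry(water_o, carbonyl_c, carbonyl_o)
            if dist <= max_dist and angle_range[0] <= angle <= angle_range[1]:
                hits += 1
                break  # one qualifying water is enough for this frame
    return hits / len(frames)
```

      Applied to coordinates extracted from each Mpro-substrate trajectory, a function of this form would yield per-substrate fractions that can be compared in the same way as the percentages quoted above.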

      We very much appreciate the reviewer’s enthusiasm in pushing us to understand the mechanistic basis for Mpro-directed cleavage efficiencies, and we would have absolutely loved to figure this out! (As it appears to be a long-standing question in the field!) But as discussed above and in the manuscript, we think that it will take a detailed dissection of different steps in the catalytic cycle to understand where and how this selectivity arises. We will leave it to research groups focused more exclusively on the details of protease biochemistry and simulations of reactive intermediates to take up these significant and long-term challenges!

      • In the PDB deposition, Y154 from chain B should be fixed.

      • In the PDB deposition, some added glycerols seem to conflict. Although this is not important for the biological work discussed in this study, the authors should check if glycerol 403 in chain A and 402, 403 in chain B are properly modeled. Does the density justify placing a glycerol there?

      • In the PDB deposition, there are over 51 RSRZ outliers. The authors should double-check if they cannot fix them with additional refinements. While such outliers in poorly defined linkers are understandable, this is unexpected for well-defined regions in the map.

      We have made a number of updates to our PDB deposition to address the above three points. (1) We have reexamined and tweaked the loop region at Y154 chain B; this region of the structure has relatively poorly defined electron density, but we now have a model where Y154 is no longer a Ramachandran outlier. The PDB model is now free of any Ramachandran outliers. (2) We have reexamined each of the modeled glycerol molecules and removed one of these (GOL 402), which had a weaker fit to the electron density. The remaining two glycerols appear to be well-modeled (omit maps leaving out each glycerol show strong Fo-Fc density that clearly looks like a glycerol in shape, adding each glycerol back into the model decreases Rwork and Rfree, and the refined 2Fo-Fc map fits well to the modeled glycerols). (3) We agree there are a large number of RSRZ outliers in this structure. We have reexamined many of these, and come to the same conclusion as for our original deposition: that most of these result from residues where there is clear enough density for placing the backbone into the map, but very poor density for the sidechain. Modeling different sidechain positions for the RSRZ outliers we reexamined did not appreciably improve the model fit or change their RSRZ outlier status. For example, Y154 in chains A and B remain some of the worst RSRZ outliers; while the density for these loop regions is generally not very good, it is clear that the backbone atoms of Y154 can be modeled into the structure, but there is very weak density for the sidechain. We tried modeling alternative and/or multiple sidechain conformations for Y154, but this did not significantly reduce the size of the RSRZ outlier.

      In short, while we could remove some of these residues or truncate the sidechain where the sidechain density is very poor to lower the total number of RSRZ outliers, we think the best model is one where we leave these residues built into the structure and accept the higher number of RSRZ outliers. Importantly, none of the significant RSRZ outliers are key residues of biological interest that would affect our interpretation of the structure and/or TRMT1-Mpro biochemistry.

      We have deposited a new, re-refined PDB model (9DW6) that incorporates these changes and supersedes our old PDB entry (8D35). We have updated the manuscript with the new PDB ID. We thank the reviewer for these suggestions that improved the overall structural model.

      Reviewer #3 (Recommendations For The Authors):

      The crystal structure entry in the PDB should mention the Cys-to-Ala substitution in Mpro.

      Thank you, we have made this change.

      Fig 2A and 2B: Can the authors highlight the Gln520-Ala531 peptide bond with a different color, please? It gets lost in panel B.

      Yes, we have made significant revisions to what is now Figure 3, and have highlighted the scissile peptide bond atoms in orange in each of these panels. Thank you for this suggestion, we agree it helps readers to orient themselves within the structure.

      "Importantly, the identified Mpro-targeted residues in human TRMT1 are conserved in the human population (i.e. no missense polymorphisms), showing that human TRMT1 can be recognized and cleaved by SARS-CoV-2 Mpro." Is TRMT1 prone to a high frequency of missense polymorphisms? If so, then this point makes sense. If not, it is not clear if this really informs on any biologically relevant mechanism.

      Given (i) that primate TRMT1 was previously identified as being under positive selection (i.e. rapid evolution) in an evolutionary screen (Cariou et al PNAS 2022) and (ii) that our study is mostly in vitro, we thought it was important, first, to make sure that the TRMT1 sequence used in our functional assays is not specific to the reference sequence, but is actually the sequence of TRMT1 present in the human population. Further, we also wanted to check whether variations in the Mpro cleavage site of TRMT1 were present in some humans (could these be linked with severe COVID or susceptibility, for example?).

      Overall, this statement aims to anchor our in vitro results to the TRMT1 sequences actually present in humans. However, we agree that it does not inform any “biologically relevant mechanism”. We have therefore removed the word “Importantly”, which was probably misleading.

      "TRMT1 engages the Mpro active site in a distinct binding conformation."

      This is reported as an observation with little analysis. What is the structural basis of this conformational difference between the bound peptides? Why are the psi angles different? Is there a steric factor that is different between these peptide chains? This section can be substantially improved in detail from its current state.

      See our related answer to the next comment below.

      "Molecular dynamics simulations suggest kinetic discrimination happens during later steps of Mpro-catalyzed substrate cleavage." This section could have partly addressed my previous comment. It is not clear why there is such a large difference in the psi-angle. With access to several peptide-bound structures, the authors should derive and provide insights into the underlying fundamental principles. After all, this is a major point of discovery in their investigation.

      We agree that it is not entirely clear why TRMT1 seems to favor the P3’-in conformation when binding to Mpro. The only other known peptide-bound structure that adopts a similar P2’ psi angle is nsp6/7, but there are no clear sequence, steric, or interaction features that distinguish TRMT1 and nsp6/7 from the other 6 peptide-Mpro structures that favor a P3’-out conformation with a larger P2’ psi angle. In particular, the identities of the P1’ and P3’ residues, which would probably be expected to have the largest impact on this conformation, have no clear commonality in TRMT1 and nsp6/7 that would hint at why these adopt this unique conformation. As we describe in the discussion section of the manuscript, and as has been observed in many other studies of Mpro, the protease active site is very plastic and able to accommodate a diverse range of sequences surrounding the invariant P1 Gln. Furthermore, while the crystal structures of TRMT1 and other nsp cleavage sequences bound to Mpro show a single peptide conformation in the active site, our MD simulations suggest that both P3’-in and P3’-out type conformations are present in solution for TRMT1, nsp4/5, and nsp8/9, just with different populations. It is very likely that there is a delicate energetic balance between these conformations that may depend subtly on multiple sequence features of the peptide and how they interact with each other and the flexible Mpro active site.

      As with our replies to questions from Reviewer 2 above about deciphering the underlying principles that connect peptide sequence to cleavage efficiency, we expect that dissecting the detailed links between sequence and binding conformation will be a long-term challenge for mechanistic and biocomputational groups focused on viral protease enzymes; systematic mutation of all residues in the cleavage sequence to multiple different amino acid identities, followed by structure determination experimentally and/or computationally, will likely be required to uncover the key sequence or steric properties and interactions that underlie and drive favored peptide binding conformations.

      To highlight these questions as significant and difficult future challenges toward understanding the fundamental principles underlying SARS-CoV Mpro proteolysis, we have added an additional paragraph (second from the last paragraph) in the discussion section.

      This work can be taken to a whole new level if the authors were to provide insights into how TRMT1 degradation by Mpro affects host cell biology and how the inhibition of this activity affects CoV biology.

      We certainly agree that showing the biological effects of TRMT1 degradation on host cell biology and/or viral biology could raise the impact of this work. But as discussed in more detail above in our response to the weakness listed in Reviewer 3’s public review, we see the main focus of this work as showing the biochemical and structural basis for TRMT1 recognition and cleavage by SARS-CoV-2 Mpro, and directly showing the immediate effects of this cleavage on the TRMT1-tRNA interaction and modification activity. As was the case with other viral proteases, like the HIV-1 protease, understanding the potentially diverse and nuanced downstream biological effects of host protein cleavage and its impacts on cellular phenotypes or viral fitness could take many years of careful cell biology and virology work. We hope that our paper provides the key first steps for viral biology labs taking on this difficult but important challenge for TRMT1!

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      This work investigated the role of CXXC-finger protein 1 (CXXC1) in regulatory T cells. CXXC1-bound genomic regions largely overlap with Foxp3-bound regions and regions with H3K4me3 histone modifications in Treg cells. CXXC1 and Foxp3 interact with each other, as shown by co-immunoprecipitation. Mice with Treg-specific CXXC1 knockout (KO) succumb to lymphoproliferative diseases between 3 and 4 weeks of age, similar to Foxp3 KO mice. Although the immune suppression function of CXXC1 KO Treg is comparable to WT Treg in an in vitro assay, these KO Tregs failed to suppress autoimmune diseases such as EAE and colitis in Treg transfer models in vivo. This is partly due to the diminished survival of the KO Tregs after transfer. CXXC1 KO Tregs do not have an altered DNA methylation pattern; instead, they display weakened H3K4me3 modifications within the broad H3K4me3 domains, which contain a set of Treg signature genes. These results suggest that CXXC1 and Foxp3 collaborate to regulate Treg homeostasis and function by promoting Treg signature gene expression through maintaining H3K4me3 modification.

      Strengths:

      Epigenetic regulation of Treg cells has been a constantly evolving area of research. The current study revealed CXXC1 as a previously unidentified epigenetic regulator of Tregs. The strong phenotype of the knockout mouse supports the critical role CXXC1 plays in Treg cells. Mechanistically, the link between CXXC1 and the maintenance of broad H3K4me3 domains is also a novel finding.

      Weaknesses:

      (1) It is not clear why the authors chose to compare H3K4me3 and H3K27me3 enriched genomic regions. There are other histone modifications associated with transcription activation or repression. Please provide justification.

      Thank you for highlighting this important point. We prioritized H3K4me3 and H3K27me3 because they are well-established markers of transcriptional activation and repression, respectively. These modifications provide a robust framework for investigating the dynamic interplay of chromatin states in Treg cells, particularly in regulating the balance between activation and suppression of key genes. While histone acetylation, such as H3K27ac, is linked to enhancer activity and transcriptional elongation, our focus was on promoter-level regulation, where H3K4me3 and H3K27me3 are most relevant. Although other histone modifications could provide additional insights, we chose to focus on these two to maintain clarity and feasibility in our analysis. We are happy to further elaborate on this rationale in the manuscript if necessary.

      (2) It is not clear what separates Clusters 1 and 3 in Figure 1C. It seems they share the same features.

      We apologize for not describing these clusters clearly. Clusters 1 and 3 both belong to the H3K4me3-only group; H3K4me3 enrichment and gene expression levels are higher in Cluster 1. We initially set out to divide the promoters into four categories: H3K4me3 only, H3K27me3 only, H3K4me3-H3K27me3 co-occupied, and None. However, in the actual classification we could not distinguish an H3K4me3-H3K27me3 co-occupied group. Instead, we obtained two H3K4me3-only categories, with Cluster 1 showing higher H3K4me3 enrichment and higher gene expression levels.
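      For readers who want to see what the intended four-way scheme amounts to, a simple rule-based version can be sketched as below. This is an illustration only and is not the classification procedure used in the manuscript, which produced the clusters in Figure 1C; the function names, the per-promoter signal values, and both cutoffs are hypothetical.

```python
from collections import Counter

def classify_promoter(h3k4me3, h3k27me3, k4_cutoff=1.0, k27_cutoff=1.0):
    """Assign a promoter to one of four chromatin categories from its
    H3K4me3 and H3K27me3 signal. Cutoffs are placeholder values."""
    has_k4 = h3k4me3 >= k4_cutoff
    has_k27 = h3k27me3 >= k27_cutoff
    if has_k4 and has_k27:
        return "H3K4me3-H3K27me3 co-occupied"
    if has_k4:
        return "H3K4me3 only"
    if has_k27:
        return "H3K27me3 only"
    return "None"

def summarize(promoters):
    """promoters: iterable of (h3k4me3_signal, h3k27me3_signal) pairs.
    Returns a Counter of category labels."""
    return Counter(classify_promoter(k4, k27) for k4, k27 in promoters)
```

      With a fixed-cutoff rule like this, a co-occupied category that receives few or no promoters shows up directly as a near-empty count, which is one way the absence of a distinguishable co-occupied group could be checked.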

      (3) The claim, "These observations support the hypothesis that FOXP3 primarily functions as an activator by promoting H3K4me3 deposition in Treg cells." (line 344), seems to be a bit of an overstatement. Foxp3 certainly can promote transcription in ways other than promoting H3K4me3 deposition, and it also can repress gene transcription without affecting H3K27me3 deposition. Therefore, it is not justified to claim that promoting H3K4me3 deposition is Foxp3's primary function.

      We appreciate the reviewer’s thoughtful observation regarding our claim about FOXP3’s role in promoting H3K4me3 deposition. We acknowledge that FOXP3 is a multifunctional transcription factor with diverse mechanisms of action, including transcriptional activation independent of H3K4me3 deposition and transcriptional repression that does not necessarily involve H3K27me3 deposition.

      Our intention was not to imply that promoting H3K4me3 deposition is the exclusive or predominant function of FOXP3 but rather to highlight that this mechanism contributes significantly to its role in regulating Treg cell function. We agree that our wording may have overstated this point, and we will revise the text to provide a more nuanced interpretation. Specifically, we will clarify that our observations suggest FOXP3 can facilitate transcriptional activation, in part, by promoting H3K4me3 deposition, but this does not preclude its other regulatory mechanisms.

      (4) For the in vitro suppression assay in Figure S4C, and the Treg transfer EAE and colitis experiments in Figure 4, the Tregs should be isolated from Cxxc1 fl/fl x Foxp3 cre/wt female heterozygous mice instead of Cxxc1 fl/fl x Foxp3 cre/cre (or cre/Y) mice. Tregs from the homozygous KO mice are already activated by the lymphoproliferative environment and could have vastly different gene expression patterns and homeostatic features compared to resting Tregs. Therefore, it's not a fair comparison between these activated KO Tregs and resting WT Tregs.

      Thank you for this insightful comment and for pointing out the potential confounding effects associated with using Treg cells from homozygous Foxp3Cre/Cre (or Cre/Y) Cxxc1fl/fl mice. We agree that using Treg cells from Foxp3Cre/+ Cxxc1fl/fl (referred to as “het-KO”) and their littermate Foxp3Cre/+ Cxxc1fl/+ (referred to as “het-WT”) female mice would provide a more balanced comparison, as these Treg cells are less likely to be influenced by the activated lymphoproliferative environment present in homozygous KO mice.

      To address this concern, we will perform additional experiments using Treg cells isolated from Foxp3Cre/+ Cxxc1fl/fl (“het-KO”) and their littermate Foxp3Cre/+ Cxxc1fl/+ (“het-WT”) female mice. We will update the manuscript with these new data to provide a more accurate assessment of the impact of CXXC1 deficiency on Treg cell function.

      (5) The manuscript didn't provide a potential mechanism for how CXXC1 strengthens broad H3K4me3-modified genomic regions. The authors should perform Foxp3 ChIP-seq or CUT&Tag with WT and Cxxc1 cKO Tregs to determine whether CXXC1 deletion changes Foxp3's binding pattern in Treg cells.

      Thank you for your insightful comments and valuable suggestions. We greatly appreciate your recommendation to explore the potential mechanism by which CXXC1 enhances broad H3K4me3-modified genomic regions.

      In response, we plan to conduct CUT&Tag experiments for Foxp3 in both WT and Cxxc1 cKO Treg cells.

      Reviewer #2 (Public review):

      FOXP3 has been known to form diverse complexes with different transcription factors and enzymes responsible for epigenetic modifications, but how extracellular signals timely regulate FOXP3 complex dynamics remains to be fully understood. Histone H3K4 tri-methylation (H3K4me3) and CXXC finger protein 1 (CXXC1), which is required to regulate H3K4me3, also remain to be fully investigated in Treg cells. Here, Meng et al. performed a comprehensive analysis of H3K4me3 CUT&Tag assay on Treg cells and a comparison of the dataset with the FOXP3 ChIP-seq dataset revealed that FOXP3 could facilitate the regulation of target genes by promoting H3K4me3 deposition.

      Moreover, CXXC1-FOXP3 interaction is required for this regulation. They found that specific knockdown of Cxxc1 in Treg leads to spontaneous severe multi-organ inflammation in mice and that Cxxc1-deficient Treg exhibits enhanced activation and impaired suppression activity. In addition, they have also found that CXXC1 shares several binding sites with FOXP3 especially on Treg signature gene loci, which are necessary for maintaining homeostasis and identity of Treg cells.

      The findings of the current study are pretty intriguing, and it would be great if the authors could fully address the following comments to support these interesting findings.

      Major points:

      (1) There is insufficient evidence in the first part of the Results to support the conclusion that "FOXP3 functions as an activator by promoting H3K4Me3 deposition in Treg cells". The authors should compare the results for H3K4Me3 in FOXP3-negative conventional T cells to demonstrate that at these promoter loci, FOXP3 promotes H3K4Me3 deposition.

      We appreciate the reviewer’s critical observation regarding our claim about FOXP3’s role in promoting H3K4me3 deposition. We acknowledge that FOXP3 is a multifunctional transcription factor with diverse mechanisms of action, including transcriptional activation independent of H3K4me3 deposition and transcriptional repression that does not necessarily involve H3K27me3 deposition.

      Our intention was not to imply that promoting H3K4me3 deposition is the exclusive or predominant function of FOXP3 but rather to highlight that this mechanism contributes significantly to its role in regulating Treg cell function. We agree that our wording may have overstated this point, and we will revise the text to provide a more nuanced interpretation. Specifically, we will clarify that our observations suggest FOXP3 can facilitate transcriptional activation, in part, by promoting H3K4me3 deposition, but this does not preclude its other regulatory mechanisms.

      We will compare H3K4me3 levels at the promoter loci of interest between FOXP3-negative conventional T cells and FOXP3-positive regulatory T cells. This comparison will help elucidate whether FOXP3 directly promotes H3K4me3 deposition at these loci.

      (2) In Figure 3 F&G, the activation status and IFNγ production should be analyzed in Treg cells and Tconv cells separately rather than in total CD4+ T cells. Moreover, are there changes in autoantibodies and IgG and IgE levels in the serum of cKO mice?

      We appreciate the reviewer’s constructive feedback on the analyses presented in Figures 3F and 3G and the additional suggestion to investigate autoantibodies and serum immunoglobulin levels.

      Regarding Figures 3F and 3G, we agree that separating Treg cells and Tconv cells for analysis of activation status and IFN-γ production would provide a more precise understanding of the cellular dynamics in Cxxc1 cKO mice.

      To address this, we will reanalyze the data to examine Treg and Tconv cells independently and include these results in the revised manuscript.

      As for the changes in autoantibodies and serum IgG and IgE levels, we acknowledge that these parameters are important indicators of systemic immune dysregulation.

      We will now measure serum autoantibodies and immunoglobulin levels in Cxxc1 cKO mice and WT controls.

      (3) Why did Cxxc1-deficient Treg cells not show more impaired suppression than WT Treg cells in the in vitro suppression assay, despite the reduced expression of Treg cell suppression-associated markers at the transcriptional level demonstrated in both scRNA-seq and bulk RNA-seq?

      Thank you for your thoughtful question. We appreciate your interest in understanding the apparent discrepancy between the reduced expression of Treg-associated suppression markers at the transcriptional level and the lack of impaired suppression observed in the in vitro suppression assay.

      There are several potential explanations for this observation:

      (1) Functional Redundancy: Treg cell suppression is a complex, multi-faceted process involving various effector mechanisms such as cytokine production (e.g., IL-10, TGF-β), cell-cell contact, and metabolic regulation. Thus, even though the transcriptional signature of suppression-associated genes is altered, compensatory mechanisms may still allow Cxxc1-deficient Treg cells to retain functional suppression capacity under these specific in vitro conditions.

      (2) In Vitro Assay Limitations: The in vitro suppression assay is a simplified model of Treg function that may not capture all the complexities of Treg-mediated suppression in vivo. While we observed altered gene expression in Cxxc1-deficient Treg cells, this might not directly translate to a functional defect under the specific conditions of the assay. In vivo, additional factors such as cytokine milieu, cell-cell interactions, and tissue-specific environments may be required for full suppression, which could be missing in the in vitro assay.

      (4) Is there a disease in which Cxxc1 is expressed at low levels or absent in Treg cells? Is the same immunodeficiency phenotype present in patients as in mice?

      Thank you for your insightful question regarding the role of CXXC1 in Treg cells and its potential link to human disease. To our knowledge, no specific human disease has been identified where CXXC1 is expressed at low levels or absent specifically in Treg cells. There is currently no direct evidence of an immunodeficiency phenotype in human patients that parallels the one observed in Cxxc1-deficient mice.

      Reviewer #3 (Public review):

      In the report entitled "CXXC-finger protein 1 associates with FOXP3 to stabilize homeostasis and suppressive functions of regulatory T cells", the authors demonstrated that Cxxc1-deletion in Treg cells leads to the development of severe inflammatory disease with impaired suppressive function. Mechanistically, CXXC1 interacts with Foxp3 and regulates the expression of key Treg signature genes by modulating H3K4me3 deposition. Their findings are interesting and significant. However, there are several concerns regarding their analysis and conclusions.

      Major concerns:

      (1) Despite cKO mice showing an increase in Treg cells in the lymph nodes and Cxxc1-deficient Treg cells having normal suppressive function, the majority of cKO mice died within a month. What causes cKO mice to die from severe inflammation?

      Considering the results of Figures 4 and 5, a decrease in Treg cell population due to their reduced proliferative capacity may be one of the causes. It would be informative to analyze the population of tissue Treg cells.

      We thank the reviewer for this insightful comment and acknowledge the importance of understanding the causes of severe inflammation and early mortality in cKO mice. Based on our data and previous studies, we propose the following explanations:

      (1) Reduced Treg Proliferative Capacity: As shown in Figure 5I, the decreased proportion of FOXP3+Ki67+ Treg cells in cKO mice likely reflects impaired proliferative capacity, which may limit the expansion of functional Treg cells in response to inflammatory cues, particularly in peripheral tissues where active suppression is required.

      (2) Altered Treg Function and Activation: Cxxc1-deficient Treg cells exhibit increased expression of activation markers (Il2ra, Cd69) and pro-inflammatory genes (Ifng, Tbx21). This suggests a functional dysregulation that may impair their ability to suppress inflammation effectively, despite their presence in lymphoid organs.

      (3) Tissue Treg Populations: Although our study focuses on lymph node-resident Treg cells, tissue-resident Treg cells play a crucial role in maintaining local immune homeostasis. It is plausible that Cxxc1 deficiency compromises the accumulation or functionality of tissue Treg cells, contributing to uncontrolled inflammation in non-lymphoid organs. Unfortunately, we currently lack data on tissue Treg populations, which limits our ability to directly address this hypothesis.

      Regarding the suggestion to analyze tissue Treg populations, we agree that this would be an important next step in understanding the cause of the severe inflammation and early mortality in Cxxc1-deficient mice.

      We plan to perform detailed analyses of Treg cell populations in various tissues, including the gut, lung, and liver, to determine if there are specific defects in tissue-resident Treg cells that could contribute to the observed phenotype.

      (2) In Figure 5B, scRNA-seq analysis indicated that Mki67+ Treg subset are comparable between WT and Cxxc1-deficient Treg cells. On the other hand, FACS analysis demonstrated that Cxxc1-deficient Treg shows less Ki-67 expression compared to WT in Figure 5I. The authors should explain this discrepancy.

      Thank you for pointing out the apparent discrepancy between the scRNA-seq and FACS analyses regarding Ki-67 expression in Cxxc1-deficient Treg cells.

      In Figure 5B, the scRNA-seq analysis identified the Mki67+ Treg subset as comparable between WT and Cxxc1-deficient Treg cells. This finding reflects the overall proportion of cells expressing Mki67 transcripts within the Treg population. In contrast, the FACS analysis in Figure 5I specifically measures Ki-67 protein levels, revealing reduced expression in Cxxc1-deficient Treg cells compared to WT.

      To address this discrepancy more comprehensively, we will further analyze the scRNA-seq data to directly compare Mki67 mRNA expression levels between WT and Cxxc1-deficient Treg cells.

      In addition, the authors concluded on line 441 that CXXC1 plays a crucial role in maintaining Treg cell stability. However, there appears to be no data on Treg stability. Which data represent the Treg stability?

      We appreciate the reviewer’s observation and recognize that our wording may have been overly conclusive. Our data primarily highlight the impact of Cxxc1 deficiency on Treg cell homeostasis and transcriptional regulation, rather than providing direct evidence for Treg cell stability. Specifically, the downregulation of Treg-specific suppressive genes (Nt5e, Il10, Pdcd1) and the upregulation of pro-inflammatory markers (Gzmb, Ifng, Tbx21) indicate a shift in functional states. While these findings may suggest an indirect disruption in the maintenance of suppressive phenotypes, they do not constitute a direct measure of Treg cell stability.

      To address the reviewer’s concern, we will revise our conclusion to more accurately state that our data support a role for CXXC1 in maintaining Treg cell homeostasis and functional balance, without overextending claims about Treg cell stability. Thank you for bringing this to our attention, as it will help us improve the clarity and precision of our manuscript.

      (3) The authors found that Cxxc1-deficient Treg cells exhibit weaker H3K4me3 signals compared to WT in Figure 7. This result suggests that Cxxc1 regulates H3K4me3 modification via H3K4 methyltransferases in Treg cells. The authors should clarify which H3K4 methyltransferases contribute to the modulation of H3K4me3 deposition by Cxxc1 in Treg cells.

      Thank you for pointing out the need to clarify the role of H3K4 methyltransferases in the modulation of H3K4me3 deposition by CXXC1 in Treg cells.

      In our study, we found that Cxxc1-deficient Treg cells exhibit reduced H3K4me3 levels, as shown in Figure 7. CXXC1 has been previously reported to function as a non-catalytic component of the Set1/COMPASS complex, which contains H3K4 methyltransferases such as SETD1A and SETD1B. These methyltransferases are the primary enzymes responsible for H3K4 trimethylation.

      References:

      (1) Lee J.H., Skalnik D.G. CpG-binding protein (CXXC finger protein 1) is a component of the mammalian Set1 histone H3-Lys4 methyltransferase complex, the analogue of the yeast Set1/COMPASS complex. J. Biol. Chem. 2005; 280:41725–41731.

      (2) J. P. Thomson, P. J. Skene, J. Selfridge, T. Clouaire, J. Guy, S. Webb, A. R. W. Kerr, A. Deaton, R. Andrews, K. D. James, D. J. Turner, R. Illingworth, A. Bird, CpG islands influence chromatin structure via the CpG-binding protein Cfp1. Nature 464, 1082–1086 (2010).

      (3) Shilatifard, A. 2012. The COMPASS family of histone H3K4 methylases: mechanisms of regulation in development and disease pathogenesis. Annu. Rev. Biochem. 81:65–95.

      (4) Brown D.A., Di Cerbo V., Feldmann A., Ahn J., Ito S., Blackledge N.P., Nakayama M., McClellan M., Dimitrova E., Turberfield A.H. et al. The SET1 complex selects actively transcribed target genes via multivalent interaction with CpG Island chromatin. Cell Rep. 2017; 20:2313–2327.

      Furthermore, it would be important to investigate whether Cxxc1-deletion alters Foxp3 binding to target genes.

      Thank you for this important suggestion regarding the impact of Cxxc1 deletion on FOXP3 binding to target genes. We agree that understanding whether Cxxc1 deficiency affects FOXP3’s ability to bind to its target genes would provide valuable insight into the regulatory role of CXXC1 in Treg cell function.

      To address this, we plan to perform CUT&Tag experiments to assess FOXP3 binding profiles in Cxxc1-deficient versus wild-type Treg cells. These experiments will allow us to determine if Cxxc1 loss disrupts FOXP3’s occupancy at key regulatory sites, which may contribute to the observed functional impairments in Treg cells.

      (4) In Figure 7, the authors concluded that CXXC1 promotes Treg cell homeostasis and function by preserving the H3K4me3 modification since Cxxc1-deficient Treg cells show lower H3K4me3 densities at the key Treg signature genes. Are these Cxxc1-deficient Treg cells derived from mosaic mice? If Cxxc1-deficient Treg cells are derived from cKO mice, the gene expression and H3K4me3 modification status are inconsistent because scRNA-seq analysis indicated that expression of these Treg signature genes was increased in Cxxc1-deficient Treg cells compared to WT (Figure 5F and G).

      Thank you for the insightful comment. To clarify, the Cxxc1-deficient Treg cells analyzed for H3K4me3 modification in Figure 7 were indeed derived from Cxxc1 conditional knockout (cKO) mice, not mosaic mice.

      The scRNA-seq analysis presented in Figures 5F and G revealed an upregulation of Treg signature genes in Cxxc1-deficient Treg cells. This finding suggests that the loss of Cxxc1 drives these cells toward a pro-inflammatory, activated state, underscoring the pivotal role of CXXC1 in maintaining Treg cell homeostasis and suppressive function.

      Regarding the apparent discrepancy between the reduced H3K4me3 levels and the increased expression of these genes, it is important to note that H3K4me3 primarily functions as an epigenetic mark that facilitates chromatin accessibility and transcriptional regulation, acting as an upstream modulator of gene expression. However, gene expression levels are also influenced by downstream compensatory mechanisms and complex inflammatory environments. In this context, the reduction in H3K4me3 likely reflects the direct role of CXXC1 in epigenetic regulation, whereas the upregulation of gene expression in Cxxc1-deficient Treg cells may arise as a side effect of the inflammatory environment.

      To further substantiate our findings, we performed RNA-seq analysis on Treg cells from Foxp3Cre/+ Cxxc1fl/fl (“het-KO”) and their littermate Foxp3Cre/+ Cxxc1fl/+ (“het-WT”) female mice, as presented in Figure S6C. This analysis revealed a notable reduction in the expression of key Treg signature genes, including Icos, Ctla4, Tnfrsf18, and Nt5e, in het-KO Treg cells. Importantly, the observed changes in gene expression were consistent with the altered H3K4me3 modification status, further supporting the epigenetic regulatory role of CXXC1. These results further emphasize that CXXC1 promotes Treg cell homeostasis and function by preserving the H3K4me3 modification.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The manuscript by Rowell et al aims to identify differences in TCR recombination and selection between foetal and adult thymus in mice. The authors sequenced the unpaired bulk TCR repertoire in foetal and adult mouse thymi and studied both TCRB and TCRa characteristics in the double positive (DP, CD4+CD8+) and single positive (SP4 CD4+CD8-CD3+ and SP8 CD4-CD8+CD3+) populations. They identified age-related differences in TCRa and TCRB segment usage, including a preferential bias toward 3'TRAV and 5' TRAJ rearrangements in foetal cells compared to adults, which had a greater preference for 5'TRAV segments. By depleting the thymocyte population in adult thymi using hydrocortisone, the authors demonstrated that the repertoire became more foetal-like; they therefore argue that the preferential 5'TRAV rearrangements in adults may result from prolonged/progressive TCRa rearrangements in the adult thymocytes. In line with previous studies, the authors demonstrate that the foetal TCR repertoire was less diverse, less evenly distributed and had fewer non-template insertions while containing more clonal expansions. In addition, the authors claim that changes in V-J usage and CDR1 and CDR2 in the DP vs SP repertoires indicated that positive selection of foetal thymocytes is less dependent on interactions with the MHC. 

      Strengths: 

      Overall, the manuscript provides an extensive analysis of the foetal and adult TCR repertoire in the thymus, resulting in new insights in T cell development in foetal and adult thymi. 

      Weaknesses: 

      Three major concerns arise:

      (1) the authors have analysed TCR repertoires of only 4 foetal and 4 adult mice; considering the high spread, the study may have been underpowered. 

      Given the concerns of the reviewer we have sequenced more libraries and added more data to include repertoires from 7 embryos and 6 young adults (biological replicates from different sorts). We believe that including more replicates has indeed strengthened our study. 

      Our experimental approach was to sequence TCR transcripts, and in studies using RNA-sequencing of inbred mice, often only 3 individuals (biological replicates) are sequenced.

      Our study sequenced from 7 foetal thymuses (generating TCRα and TCRβ repertoires from 4 FACS-sorted cell populations); 6 adult thymuses (generating TCRα and TCRβ repertoires from 4 FACS-sorted cell populations); and 5 adult thymuses from hydrocortisone-treated mice (generating TCRα and TCRβ repertoires from FACS-sorted CD3lo and CD3hi DP populations). We thus analysed 124 distinct repertoires from different populations and libraries, and many tens of thousands of unique sequences.  
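      The repertoire count quoted above can be checked with a short tally. This is only a sketch: it assumes that every animal contributed every listed sorted population and both TCR chains, and the dictionary layout and names are illustrative rather than taken from the study's code.

```python
# Tally of repertoires implied by the design described in the response.
# Each entry: (animals, sorted populations per animal, TCR chains sequenced).
# Assumption: no animal/population/chain combination is missing.
design = {
    "foetal": (7, 4, 2),
    "adult": (6, 4, 2),
    "hydrocortisone-treated adult": (5, 2, 2),
}

per_group = {
    name: animals * populations * chains
    for name, (animals, populations, chains) in design.items()
}
total_repertoires = sum(per_group.values())

print(per_group)
print(total_repertoires)
```

      Under these assumptions the three groups contribute 56, 48 and 20 repertoires respectively, which matches the stated total of 124.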

      (2) Gating strategies are missing and 

      We have included gating strategies for cell-sorting as SFig7 and SFig8.

      (3) the manuscript is very technical and clearly aimed at a highly specialised audience with expertise in both thymocyte development and TCR analysis. Authors are recommended to provide schematics of the TCR rearrangements/their findings and include a summary of the conclusions/implications of their findings at the end of each results section rather than waiting till the discussion. This will help the reader to interpret their findings while reading the results. 

      We have modified the manuscript to include a more general introductory paragraph (page 3) to introduce the reader to the topic and we have included brief summaries of the findings at the end of each result section (pages 7,9,10,12,13,15).

      Reviewer #2 (Public Review): 

      Summary: 

      The authors comprehensively assess differences in the TCRB and TCRA repertoires in the fetal and adult mouse thymus by deep sequencing of sorted cell populations. For TCRB and TCRA they observed biased gene segment usage and less diversity in fetal thymocytes. The TCRB repertoire was less evenly distributed and displayed more evidence of clonal expansions and repertoire sharing among individuals in fetal thymocytes. In both fetal and adult thymocytes they show skewing of V segment (CDR1-2) repertoires in CD4 and CD8 as compared to DP thymocytes, which they attribute to MHC-I vs MHC-II restriction during positive selection. However, the authors assess these effects to be weaker in fetal thymocytes, suggesting weaker MHC-restriction. They conclude that in multiple respects fetal repertoires are distinct from and more innate-like than adult. 

      Strengths: 

      The analyses of the F18.5 and adult thymic repertoires are comprehensive with respect to the cell populations analyzed and the diversity of approaches used to characterize the repertoires. Because repertoires were analyzed in pre- and post-selection thymocyte subsets, the data offer the potential to assess repertoire selection at different developmental stages. The analysis of repertoire selection in fetal thymocytes may be unique. 

      Weaknesses: 

      (1) Problematic experimental design and some lack of familiarity with prior work have resulted in highly problematic interpretations of the data, particularly for TCRA repertoire development. 

      The authors note fetal but not adult thymocytes to be biased towards usage of 3' V segments and 5'J segments. It should be noted that these basic observations were made 20 years ago using PCR approaches (Pasqual et al., J.Exp.Med. 196:1163 (2002)), and even earlier by others.

      We have cited this manuscript (Introduction, page 5) which used PCR of genomic DNA to investigate some TCRα VJ rearrangements in foetal and adult thymus. In contrast, our study uses next generation sequencing of transcripts to investigate all possible TCRα and TCRβ VJ combinations in different sorted thymocyte populations ex vivo. The greater sensitivity of this more modern technology has thus enabled us to detect many more TCRαVJ rearrangements than the 2002 study, and to conclude on the basis of stringent statistical testing that the foetal repertoire is enriched for 3’V to 5’J combinations (Fig. 4). 

      The authors also note that in fetal thymus this bias persists after positive selection, and it can be reproduced in adults during recovery from hydrocortisone treatment. The authors conclude that there are fewer rounds of sequential TCRA rearrangements in the fetal thymus, perhaps due to less time spent in the DP compartment in fetus versus adult. However, the repertoire difference noted by the authors does not require such an explanation. What the authors are analyzing in the fetus is the leading edge of a synchronous wave of TCRA rearrangements, whereas what they are analyzing in adults is the unsynchronized steady state distribution. It is certainly true, as has been shown previously, that the earliest TCRA rearrangements use 3' TRAV and 5'TRAJ segments. But analysis of adult thymocytes has shown that the progression from use of 3' TRAV and 5' TRAJ to use of 5' TRAV and 3' TRAJ takes several days (Carico et al., Cell Rep. 19:2157 (2017)). The same kinetics, imposed on fetal development, would put development of a more complete TCRA repertoire at or shortly after birth. In fact, Pasqual showed exactly this type of progression from F18 through D1 after birth, and could reproduce the progression by placing F16 thymic lobes in FTOC. It is not appropriate to compare a single snapshot of a synchronized process in early fetal thymocytes to the unsynchronized steady state situation in adults. In fact, the authors' own data support this contention, because when they synchronize adult thymocytes by using hydrocortisone, they can replicate the fetal distribution. Along these lines, the fact that positive selection of fetal thymocytes using 3' TRAV and 5' TRAJ segments occurs within 2 days of thymocyte entry into the DP compartment does not mean that DP development in the fetus is intrinsically rapid and restricted to 2 days. It simply means that thymocytes bearing an early rearranging TCR can be positively selected shortly after TCR expression. The expectation would be that those DP thymocytes that had not undergone early positive selection using a 3' TRAV and a 5' TRAJ would remain longer in the DP compartment and continue the progression of TCRA rearrangements, with the potential for selection several days later using more 5'TRAV and 3'TRAJ. 

      We agree with this summary provided by the reviewer, which corresponds closely to the points we made ourselves in the manuscript. Indeed, we discuss the synchronization and kinetics of the first wave of T-cell development in Results page 13 and Discussion page 17, which was the rationale for the hydrocortisone experiment. We have also discussed findings from Carico et al 2017 in this context (see pages 13, 16, 17).

      (2) The authors note 3' V and 5'J biases for TCRB in fetal thymocytes. The previously outlined concerns about interpreting TCRA repertoire development do not directly apply here. But it would be appropriate to note that by deep sequencing, Sethna (PNAS 114:2253 (2017)) identified skewed usage of some of the same TRBV gene segments in fetal versus adult.  It should also be noted that Sethna did not detect significantly skewed usage of TRBJ  segments. Regardless, one might question whether the skewed usage of TRBJ segments detected here should be characterized as relating to chromosomal location. There are two logical ways one can think about chromosomal location of TRBJ segments - one being TRBJ1 cluster vs TRBJ2 cluster, the other being 5' to 3' within each cluster. The variation reported here does not obviously fit either pattern. Is there a statistically significant difference in aggregate use of the two clusters? There is certainly no clear pattern of use 5' to 3' across each cluster. 

      We have included a statistical comparison of the aggregate TRBJ use between the J1 cluster and the J2 cluster (see SFig5 and Results, page 9). 
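      As a hedged illustration of what aggregating TRBJ use by cluster involves, the sketch below sums per-segment proportions into J1 and J2 totals for one repertoire. The function name, the input layout and the example proportions are hypothetical and are not taken from the study; the statistical test applied to the resulting totals is not specified here and is therefore not shown.

```python
from collections import defaultdict


def aggregate_by_cluster(usage):
    """Sum proportional TRBJ usage within each J cluster.

    `usage` maps segment names such as "TRBJ1-1" or "TRBJ2-7" to proportions.
    The cluster label is the first five characters ("TRBJ1" or "TRBJ2").
    """
    totals = defaultdict(float)
    for segment, proportion in usage.items():
        totals[segment[:5]] += proportion
    return dict(totals)


# Hypothetical proportions for a single repertoire (illustrative values only).
example_usage = {
    "TRBJ1-1": 0.10, "TRBJ1-2": 0.05, "TRBJ1-4": 0.15,
    "TRBJ2-1": 0.20, "TRBJ2-5": 0.20, "TRBJ2-7": 0.30,
}
cluster_totals = aggregate_by_cluster(example_usage)
print(cluster_totals)
```

      Computing one pair of totals per animal in this way would give the values that a foetal versus adult comparison of aggregate cluster use operates on.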

      (3) The authors show that biases in TCRA and TCRB V and J gene usage between fetal and adult thymocytes are mostly conserved between pre- and post-selection thymocytes (Fig 2). In striking contrast, TCRA and TCRB combinatorial repertoires show strong biases preselection that are largely erased in post-selection thymocytes (Fig 3). This apparent discrepancy is not addressed, but interpretation is challenging. 

      I think the reviewer is referring to heatmaps for individual gene segment usage shown in Figure 2 in comparison to combinatorial usage shown in Figure 4. There is not a discrepancy in the data, but rather the differences between these two figures lie in the way in which the comparisons are made and visualised. The heatmaps in Figure 2A-D show mean proportional usage of each individual gene segment for each cell type in the two life stages, clustered by Euclidean distance. This visualisation clearly shows bias in foetal 3’ TRAV usage and 5’TRAJ usage (looking at areas of red, which have higher usage), with less pronounced enrichment for TRBV and TRBJ. The heatmaps also show differences in intensity between different cell populations in each life-stage. 

      In contrast, in Figure 4 the tiles show combinations with statistically significant (P<0.05) differences in mean counts for each VJ combination in each cell type between 7 foetal and 6 adult repertoires by Student’s t-test, after correcting for false discovery rate (FDR) due to multiple combinations. It is the case that there are fewer significant differences in proportional combinatorial VxJ use between foetal and adult repertoires after selection. We find this interesting and have expanded our discussion of this aspect of the data (page 10). More than half of the significant differences persist after repertoire selection, and the reduction in each individual SP population of course in part reflects the lineage divergence.
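      To illustrate why correcting for multiple VJ combinations can reduce the number of significant tiles, here is a minimal sketch of a false discovery rate adjustment. It assumes the Benjamini-Hochberg procedure, which is one common choice; the response does not state which FDR method was used, and the combination labels and p-values below are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-combination t-test p-values (labels are placeholders).
labels = ["V1xJ1", "V2xJ1", "V3xJ2", "V4xJ2", "V5xJ3"]
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]

adjusted_p = benjamini_hochberg(raw_p)
significant = [lab for lab, p in zip(labels, adjusted_p) if p < 0.05]
print(significant)
```

      In this toy input four combinations have raw p-values below 0.05, but only two remain below 0.05 after adjustment.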

      (4) The observation that there is a higher proportion of nonproductive TCRB rearrangements in fetal thymus compared to adult is challenging to interpret, given that the results are based upon RNA sequencing so are unlikely to reflect the ratio in genomic DNA due to processes like NMD.

      We have added two sentences to explain that transcripts of non-productive rearrangements are eliminated by nonsense-mediated decay (NMD), but some non-productive transcripts are detected in many studies of TCR repertoire sequencing, and we have cited three studies from different groups that document this (see Results, pages 10-11). We have not commented on how the increase in non-productive TCR rearrangements in the foetal populations (in comparison to adult) relates to rearrangements in genomic DNA or NMD. We have likewise not commented on the possible significance or biological role of non-productive TCR transcripts, but simply reported our findings.

      (5) An intriguing and paradoxical finding is that fetal DP, CD4 and CD8 thymocytes all display greater sharing of TCRB CDR3 sequences among individuals than do adults (Fig 5DE), whereas DP and CD8 thymocytes are shown to display greater CDR3 amino acid triplet motif sharing in adults (with a similar trend in CD4). 

      As foetal DP, CD4SP and CD8SP TCRβ repertoires have fewer non-template insertions and lower mean CDR3 length, they are expected to share more CDR3 sequences than their adult counterparts. However, in the case of CDR3 amino acid triplet motifs (k-mers), what is being analysed is the sharing of each possible individual k-mer. If k-mers are shared more in the adult for some populations, but CDR3 repertoires are shared more in the foetus, we think it means that some k-mers appear in many different CDR3 sequences in the adult, so that they are over-represented in multiple different CDR3s (presumably due to selection processes, although we agree that this is just an assumption).
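      The distinction between sharing whole CDR3 sequences and sharing triplet motifs can be made concrete with a small sketch. The CDR3 strings below are invented for illustration and are not from the study, and the helper names are hypothetical; the point is only that two individuals can share no complete CDR3 while still sharing several 3-mers.

```python
def triplets(cdr3):
    """Return the set of overlapping amino acid triplets (3-mers) in a CDR3."""
    return {cdr3[i:i + 3] for i in range(len(cdr3) - 2)}


def shared_by_all(sets_per_individual):
    """Return the items present in every individual's set."""
    return set.intersection(*sets_per_individual)


# Hypothetical CDR3 amino acid sequences for two individuals.
individual_a = ["CASSLGQYF", "CASGDTQYF"]
individual_b = ["CASSPGQYF", "CASRDTQYF"]

# Sharing at the level of complete CDR3 sequences.
shared_cdr3 = shared_by_all([set(individual_a), set(individual_b)])

# Sharing at the level of individual triplets, pooled over each repertoire.
kmers_a = set().union(*(triplets(s) for s in individual_a))
kmers_b = set().union(*(triplets(s) for s in individual_b))
shared_kmers = shared_by_all([kmers_a, kmers_b])

print(sorted(shared_cdr3))
print(sorted(shared_kmers))
```

      Because a single motif can sit inside many different CDR3 sequences, the two measures need not move together, which is why the two analyses can point in opposite directions.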

      The authors attribute high amino acid triplet sharing to the result of selection of recurrent motifs by contact with pMHC during positive selection. But this interpretation seems highly problematic because the difference between fetal and adult thymocytes is dramatic even in unfractionated DP thymocytes, the vast majority of which have not yet undergone positive selection. How then to explain the differences in CDR3 sharing visualized by the different approaches? 

      The TCRβ repertoire has been selected in the adult DP population through the process of β-selection, which is believed to involve immune synapse formation and MHC-interactions (Allam et al 2021, doi:10.1083/jcb.201908108). We have now included this reference in the introduction to make this clear (page 4). However, we agree with the reviewer’s comments that it is challenging to explain the k-mer analysis and that we have not been able to actually show that increased k-mer sharing in the adult is a direct consequence of increased positive selection: it was our interpretation of this seemingly paradoxical finding. For clarity, we have therefore removed the k-mer analyses from the manuscript.

      (6) The authors conclude that there is less MHC restriction in fetal thymocytes, based on measures of repertoire divergence from DP to CD4 and CD8 populations (Fig. 6). But the authors point to no evidence of this in analysis of TRBV usage, either by PC or heatmap analyses (A,B,D). The argument seems to rest on PC analysis of TRAV usage (Fig S6), despite the fact that dramatic differences in the SP4 and SP8 repertoires are readily apparent in the fetal thymocyte heatmaps. The data do not appear to be robust enough to provide strong support for the authors' conclusion. 

      We have written the text very carefully so as not to make the claim too strong, stating in the abstract: “In foetus we identified less influence of MHC-restriction on α-chain and β-chain combinatorial VxJ usage and CDR1xCDR2 (V region) usage in SP compared to adult, indicating weaker impact of MHC-restriction on the foetal TCR repertoire.” We are not saying that MHC-restriction does not impact VJ gene usage in foetal repertoires, but rather that it has less influence (particularly when compared to life-stage). Evidence for this comes from: [1] Heatmaps in Fig2A-D which show that all repertoires cluster first by life-stage ahead of cell type; [2] Fig3A and B: PCA of adult and foetal TCRβ VxJ combinations: All repertoires cluster by life-stage on PC1. PC2 separates adult repertoires by cell type (adult SP8 are positive on PC2 while adult SP4 are negative on PC2, and DP cells are between them) but for foetal repertoires the SP8 and SP4 are highly dispersed with some SP4 cells falling on positive side of PC2. Only foetal DP repertoires cluster tightly. [3] Fig6A-C: PCA of β-chain CDR1xCDR2 (corresponding to Vβ gene segment usage) again shows the same pattern. Adult repertoires separate by cell type on PC2 (SP8 positive on PC2, SP4 negative on PC2, with DP in between), but foetal SP8 repertoires are much more dispersed. [4] SFig6J-K: PCA of α-chain CDR1xCDR2 (Vα usage) frequency distributions: adult repertoires cluster together and are separated by cell type on PC2 (SP4 positive, SP8 negative), but foetal populations are highly dispersed and fail to cluster by cell type on either axis. [5] We have additionally added new PCA analyses to explore differences in MHC-restriction between foetal and adult SP populations. This is shown in the new Figure 7. 
      We reasoned that in a PCA that included foetal and adult repertoires together, the foetal repertoires might not segregate by SP cell type (MHC-restriction) because of their overall bias towards particular VJ combinations, which would mean that effectively the PCA would be imposing adult MHC restriction on the foetal repertoires. We therefore carried out PCA in which we analysed the adult repertoires separately from the foetal repertoires. As expected for adult repertoires, PCA separated SP4 repertoires from SP8 repertoires on PC1 in each comparison (β-chain VxJ (Fig. 7B), α-chain VxJ (Fig. 7F), β-chain CDR1xCDR2 (V region) (Fig. 7H) and α-chain CDR1xCDR2 (V region) (Fig. 7L)). In contrast, for foetal TCRα repertoires (α-chain VxJ and α-chain CDR1xCDR2 (V region)), PCA failed to separate SP4 from SP8 repertoires on PC1 or PC2, so we did not detect impact of MHC-restriction on foetal TCRα repertoires (Fig. 7E and K). For foetal TCRβ repertoires, PCA separated SP4 β-chain VxJ from SP8 on PC2, accounting for only 11.1% of variance (Fig. 7A) (in contrast to the 44.2% of variance accounted for by MHC-restriction in adult β-chain VxJ PCA (Fig. 7B)). Thus, in adult repertoires ~4-fold more of the variance in β-chain VxJ usage can be accounted for by MHC-restriction than in foetal repertoires. PCA of foetal β-chain CDR1xCDR2 (V region) separated SP4 from SP8 on PC1, accounting for 28.8% of variance, whereas in PCA of adult β-chain CDR1xCDR2, MHC-restriction accounted for 56.1% (approximately 2-fold more than in foetus). Thus, even when we considered V-region usage alone, we detected a stronger influence of MHC-restriction on the TCRβ repertoire in adult compared to foetal thymus.
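      The fold differences quoted for the β-chain analyses follow directly from the percentages of variance given above, as the short calculation below shows. The dictionary layout and key names are illustrative only; the four percentages are the ones stated in the response.

```python
# Percentage of variance on the principal component that separates SP4 from SP8,
# as quoted in the response for the separate adult and foetal PCAs.
variance_explained = {
    "beta-chain VxJ": {"adult": 44.2, "foetal": 11.1},
    "beta-chain CDR1xCDR2": {"adult": 56.1, "foetal": 28.8},
}

fold_adult_over_foetal = {
    name: values["adult"] / values["foetal"]
    for name, values in variance_explained.items()
}

for name, fold in fold_adult_over_foetal.items():
    print(f"{name}: {fold:.2f}-fold")
```

      The first ratio is close to 4-fold and the second is close to 2-fold.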

      Reviewer #3 (Public Review): 

      Summary:

      This study provides a comparison of TCR gene segment usage between foetal and adult thymus.

      Strengths:

      Interesting computational analyses were performed to find differences in TCR gene usage within unpaired TCRa and TCRb chains between foetal and adult thymus.

      Weaknesses:

      This study was significantly lacking insight and interpretation into what the data analysed actually means for the biology. The dataset discussed in the paper is from only two experiments: one comparing foetal and adult thymi from 4 mice per group and another which involved hydrocortisone treatment. The paper uses TCR sequencing methodology that sequences the TCR alpha and beta chains in an unpaired way, meaning that the true identity of the TCR heterodimer is lost. This also has the added problem of overestimating clonality and underestimating diversity.

      We have discussed the limitations and benefits of our approach of sequencing TCRβ and TCRα repertoires separately in the Discussion (page 19).  This approach allows the analysis of thousands of sequences from different cell types and different individuals at relatively low cost. We have made no claims in our manuscript about overall diversity or pairing, and given that each chain’s gene locus rearranges at a different time point in development, we believe it is of interest to consider the repertoires individually within this context.

      Limited detail in the methods sections also limits the ability for readers to properly interpret the dataset. What sex of mice were used? Are there any sex differences? What were the animal ethics approvals for the study?

      We have included this information in the Methods (page 19).  Both sexes were used and we found no sex differences, although that was not the focus of our study. All animal experimentation in the UK is carried out under UK Home Office Regulations (following ethical review). This is included in the Methods (page 19).  

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors): 

      Major points: 

      - Group sizes are very small (4 foetal and 4 adult mice). Considering the spread in TCR analysis (eg fig 1 B-H, Sup figures 2-4), the study is likely underpowered as it often looks like one mouse prevents or supports a statistical difference. Authors should therefore consider increasing the group size. 

      We have sequenced more libraries and included more data, from 7 foetal and 6 young adult animals (biological replicates).  

      - The authors should include a gating strategy for their sorted cells. This is essential to verify the quality of their findings. 

      We have added this to the Methods and SFig7 and SFig8.

      - Authors should include a summary sentence at the end of each result section which interprets the main finding. Furthermore, the manuscript would greatly benefit from a schematic figure of their main findings, particularly with regards to the rearrangements and selection differences in foetal and adult thymi. 

      We have added a summary sentence to the end of each results section.

      - Authors should be more careful with their claim that MHC has less of an effect on foetal TCR selection. Authors demonstrated that there is a difference in VJ recombination between the foetal and adult TCR repertoire, skewing the foetal TCR repertoire to certain variable and junctional segments. Since both CDR1 and CDR2 are encoded by the variable gene, this is likely to affect their ability to interact with the MHC during positive selection. Have Authors considered whether the selection process is actually a bystander effect of the differences in the rearrangement process? One way to support the authors' claim is to demonstrate that mice with an alternative MHC background have similar foetal/adult gene rearrangements but a different TCR repertoire in the SP populations. 

      Time and resources have prevented us from repeating our experiments in another strain of inbred mice.  However, we note that a previous PCR study that showed 3’TRAV to 5’TRAJ bias in foetal repertoires was carried out in BALB/c mice (Pasqual JEM 2002). We have added this point to the Discussion (page 17). 

      - (supplementary) tables have not been provided. 

      Supplementary Tables were uploaded with the submission.  STables 1 and 2 show antibodies used for cell sorts and STable 3 primers used.

      Moderate points: 

      - The loading plots in Figure 3 onward are visually strong. Authors could consider including separate V and J loading plots for Figure 3 E, F and G to demonstrate preferential V and J usage. 

      We have included additional loading plots in Figure 7 for the new PCA we have added (see Fig. 7C, D,I and J).

      - "the proportion of non-productive rearrangements was higher in the foetal SP8 population than adults (Fig 5A)" Authors should explain how non-productive TCRs end up in SP populations as they need to pass positive and negative selection which both require interactions between the TCR and the MHC. 

      As we used RNA sequencing in our study, we did not comment on how the increase in non-productive TCRβ rearrangements in the foetal populations (in comparison to adult) relates to rearrangements in genomic DNA or to nonsense-mediated decay (NMD), which is believed to down-regulate transcripts of non-productively rearranged TCR. We have not commented on the possible significance or biological role of non-productive TCR transcripts, but simply reported our findings. 

      - Authors have studied CDR3 sequential amino acid triplets (k-mers). However, CDR3 regions are longer than 3 amino acids in length, hence authors should 1) provide an overview/comparison of the identified k-mers in foetal or adult thymocytes and 2) explain how different k-mers relate to each other, eg whether they are expressed in the same TCR. Have authors considered using alternative programs to identify CDR3 motifs that are based on the full CDR3 amino acid sequence? For example, TCRdist provides motifs and indicates which amino acids are germline encoded or inserted. 

      In light of this comment from this reviewer and also comments from Reviewer 2, we have removed the comparison of k-mers from the manuscript.  Please see response to point 5 of Reviewer 2.  

      - The term "innate-like" is confusing as it implies that foetal cells are not antigen specific. However, once in the circulation, foetal cells will respond in an antigen-specific manner. Hence authors should use another term. 

      We have removed the term “innate-like” from the abstract and the first time we used it in the first paragraph of the Discussion. However, the second time we used the term, we are actually taking it from the manuscript we cited (Beaudin et al 2016) and in this case we left it in. We agree that foetal cells are likely to respond in an antigen-specific manner. 

      - To support their hypothesis in the discussion "However, as TCRδ gene segments are nested.... so that 5' TRAV segments are not favoured" can authors confirm that there are indeed fewer γδ T cells in the foetal repertoire? 

      We have removed this section from the discussion, because although it is interesting, it is highly speculative, and the manuscript is already quite complicated to interpret.

      Minor points: 

      - The authors may find the publication by De Greef 2021 PNAS of interest to identify TRBD segments 

      - Authors need to clarify that they mean CDR3-beta in the sentence "The mean predicted CDR3 length.... compared to young adult" 

      We have included new data in the manuscript to show that mean CDR3 length is lower in all foetal populations of beta (Fig5C) and alpha (SFig5C) and clarified which we are referring to in the text. 

      - Authors should bring the section "During TCRb gene rearrangement, these segments.... Initiating the sequence of rearrangements" forward to Figure 2 and provide the reader with a visual schematic of the foetal vs adult recombination events.

      - Discussion: "The first wave of foetal abT-cells that leave the thymus... tolerant to both self and maternal MHC/antigens". Have Authors considered the alternative hypothesis published by Thomas 2019 in Curr Opin System Biol that the observed bias could potentially provide better protection against childhood pathogens? 

      We have indeed considered this, as stated in the first paragraph of the Discussion “The first wave of foetal αβT-cells that leave the thymus must provide early protection against infection in the neonatal animal”. We have now cited the Thomas 2019 study.

      - Discussion: Authors should rephrase the sentence "The transition from DP to SP cell in the foetus.... From DN3 to SP cell may be slower" as it is unclear what the authors mean. 

      We have rephrased this (see page 17)

      - Discussion "TRAV and TRAJ Array" do authors mean "TRAV and TRAJ area"? 

      We did indeed mean array (as in series of gene segments) but we have changed the wording for clarity (page 14).

      - Methods, Fluorescence activated cell sorting: can authors clarify whether they stained, sorted and sequenced the full thymus and /or specify how many cells were included. Can authors also explain why foetal and adult cells were treated differently (eg the volume of master mix)? 

      - Methods Fluorescence activated cell sorting authors should specify what they mean with "mastermix of either 1:50 (foetal thymus) or 1:100 (adult thymus)". Does this mean all antibodies in the foetal mastermix were 1:50 and all antibodies in the adult master mix were 1:100? If so, why were different concentrations used and why were antibodies not individually titrated before use?  

      We have clarified the methods and antibodies used are listed with clones in supplementary tables.

      Figures: 

      - Several figures did not fit on the page and therefore missed the top or side 

      - Figure 1A: missing a label on the Y axis

      This is visible

      - Figure 2A-D: please indicate the 5' and 3' terminus in each graph. The cell type legend should include two separate colours for the two DP populations. 

      We have added 5’ and 3’ labels.  The two DP populations are clearly labelled.

      - Figure 4: please indicate the 5' and 3' terminus in each graph. 

      We have added 5’ and 3’ labels.   

      - Figure 5C: y axis should read mean CDR3B length (aa), Figure 5D and E: y axis should read Jaccard Index CDR3B, Figure 5 F and G: y axis should read Jaccard index CDR3B k-mers. Same comment for Sup Fig 5 but then CDR3a. 

      We have added these labels for both Figure 5 and Supplementary Figure 6 (was SFig5 previously).

      - Figure 6C top label should read CDR1B x CDR2B with highest contribution 

      We have added this label.

      - Figure 7: please indicate the 5' and 3' terminus in each graph. 

      We have added 5’ and 3’ labels.  This is now Figure 8, as we have added new analyses (new Figure 7).

      - Supplementary Figure 1-4 are missing a colour legend next to the graphs.

      We have added the legends in.  

      Reviewer #2 (Recommendations For The Authors): 

      (1) The authors need to provide better support for the notion that the fetal thymus produces ab T cells with properties and functions that are distinct from adult T cells. There are several  ways they might provide a more meaningful assessment: (1) They could analyze the fetal repertoire at multiple time points. (2) They could compare instead the steady state distributions in early postnatal and adult thymus samples. (3) They could compare the peripheral T cell repertoires in the first week of life versus adult. This last approach would allow them to draw the most impactful conclusion. 

      We appreciate these suggestions. Sadly, these experiments are beyond our budget and beyond the scope of the current study, which we believe already provides interesting new information.

      (2) Fig S2D shows TRBJ1-4 in black lettering meant to indicate no significant difference whereas the figure shows use of this gene segment to be elevated in adult. I believe TRBJ1-4 should be in blue lettering.

      This is now coloured correctly.

      (3) The figure call out on p11 (Fig5I-J) should be H-I.

      This is now corrected.

      (4) Please indicate in the main text that Jaccard analysis in Fig 5 D-E is for TCRB.

      This is now corrected.

      (5) The analysis of usage of TCRB CDR1xCDR2 combinations in Fig6D is said to "reflect the bias observed in their TRBV gene usage (Fig 2C)". Isn't it the case that every TRBV gene presents a distinct CDR1xCDR2 combination, meaning that there is no difference between TRBV usage and TRBV CDR1xCDR2 usage? If so, please make this clearer.

      Yes, this is the case, we have made this clearer in the text.
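      For readers unfamiliar with the Jaccard analysis referred to in point (4), the index for two sets of CDR3 sequences can be sketched as below. This is only an illustration: the function name and the toy sequences are hypothetical and are not taken from the manuscript, and any sampling applied before the comparison is described in the authors' methods, not here.

```python
def jaccard_index(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Two empty repertoires: define the overlap as 0 to avoid dividing by zero.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical toy CDR3 beta sequences, for illustration only.
foetal = ["CASSLGQNTLYF", "CASSDRGNTEVFF"]
adult = ["CASSLGQNTLYF", "CASGDAGGQNTLYF"]

overlap = jaccard_index(foetal, adult)  # one shared sequence out of three distinct
```

      A value of 0 means the two repertoires share no sequences and a value of 1 means they are identical as sets.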

      Reviewer #3 (Recommendations For The Authors): 

      In general, although there are lots of interesting analyses that can be done with these large datasets, I feel as though the authors did not fully interpret the real meaning and significance of many of these results. Whilst there was some speculation in the discussion section on why a foetal repertoire might be different to those of adults, the rationale for each individual analysis was not clearly explained. I would suggest that the rationale and a thorough explanation of each analysis be added to the results section, including a finishing sentence on what it means.

      We have added short summaries to each results section to make the points we are making clearer.

      The authors did not mention how many cells were sorted from each thymus for sequencing. Was the cell number normalised between each population? If there is a sampling issue, this might have an influence on various downstream measurements of diversity, evenness and clonality.

      This is explained in the methods.  We used sampling to allow comparisons between repertoires of different sizes, and this is also explained in the methods.
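      As a generic illustration of why sampling makes repertoires of different sizes comparable, the sketch below downsamples two clonotype count tables to a common depth. It is not the authors' procedure, which is described in their methods; the function name, the seed, and the toy clonotypes are assumptions made for this example.

```python
import random
from collections import Counter


def downsample(counts, depth, seed=0):
    """Draw `depth` reads without replacement from a clonotype count table."""
    # Expand the table so that each read is one list element.
    reads = [clone for clone, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds repertoire size")
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return Counter(rng.sample(reads, depth))


# Hypothetical toy repertoires of different sizes (100 and 20 reads).
large = {"CASSLGQNTLYF": 60, "CASSDRGNTEVFF": 30, "CASGDAGGQNTLYF": 10}
small = {"CASSLGQNTLYF": 12, "CASSPGQYEQYF": 8}

# Use the size of the smaller repertoire as the common depth.
depth = min(sum(large.values()), sum(small.values()))
large_ds = downsample(large, depth)
small_ds = downsample(small, depth)
```

      After this step both tables contain the same number of reads, so diversity or overlap measures computed on them are not driven by differences in sequencing depth alone.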

      The authors should include the cell sorting profiles and example flow cytometry plots, including gating strategies and the post sort purity of each sorted population. 

      We have included sorting strategies in the methods (SFig7 and SFig8).

      I think the manuscript could also be improved if there were some basic characterisation of foetal vs. adult thymus development. How many thymocytes are in a foetal vs adult thymus at the timepoints chosen? 

      I think there were some interesting findings in this paper. Given that overall, the foetal thymus appeared to be less diverse than that of the adult, one question I thought would be interesting to discuss was the overlap between the two repertoires. Is the foetal thymus simply a sub-fraction of the adult repertoire or is it totally distinct with no overlapping sequences? 

      Our analyses indicate that the repertoires are actually different. This is evident in Fig4 and in PCA loading plots shown in Fig. 3C and new Fig. 7C, D, I and J.

      I think that some of the interpretation in the results section may be a bit vague. "When we compared by thymocyte population, each adult population clustered together, with adult SP4 separating from adult SP8 on PC2 and DP cells scoring in between, suggesting that PC2 might correspond to MHC restriction of the adult populations." - whilst I think I know what the authors mean, I do believe that this could be explained in clearer detail and more explicitly. SP4 and SP8 are known to be positively selected in the thymus on distinct MHC class I and MHC class II molecules for example.

      We have tried to clarify the text describing the PCA and have additionally added a new figure (new Fig. 7) to compare the influence of MHC-restriction on the TCR repertoire in foetal and adult thymus.

      In the methods section, the age and sex of mice used were not explained at all. What was used in the experiment? Are there any sex differences? 

      Age and sex of mice are given in the methods. We have not detected sex differences.

      This is a huge omission from the manuscript. In general, I don't believe the methods section has described the analysis in sufficient detail for replication. All analysis code and data should be publicly accessible and be in a format that allows for the reader to replicate the figures in the paper upon running the code. Perhaps even allowing them to run their own TCR datasets. Overall, I think the manuscript needs some rewriting to include additional details and deeper interpretation of each individual analysis.

      Sequencing data files will be made publicly available on UCL Research Data Repository.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors report compound heterozygous deleterious variants in the kinase domains of the non-receptor tyrosine kinases (NRTK) TNK2/ACK1 in familial SLE. They suggest that ACK1 and BRK deficiencies are associated with human SLE and impair efferocytosis.

      Strengths: 

      The identification of similar mutations in non-receptor tyrosine kinases (NRTKs) in two different families with familial SLE is a significant finding in human disease. Furthermore, the paper provides a detailed analysis of the molecular mechanisms behind the impairment of efferocytosis caused by mutations in ACK1 and BRK.

      Weaknesses: 

      A critical point in this paper is whether the loss of function of ACK1 or BRK contributes to the onset of familial SLE. The authors emphasize that inhibitors of ACK1/BRK worsened IgG deposition in the kidneys in a pristane-induced SLE model, which contributes not to the onset but to the exacerbation of SLE, thus only partially supporting their claim.

      The evidence supporting that the loss of function of ACK1 or BRK contributes to the onset of SLE in the patients from the 2 families mostly relies on the genetic analysis. As the reviewer states, the observation that inhibitors of ACK1/BRK worsened IgG deposition in the kidneys in a pristane-induced SLE model supports the genetic evidence.

      To further address the possible role of ACK1 or BRK variants in the onset of autoimmunity in vivo, we treated wild-type (WT) BALB/cByJ female mice with inhibitors in the absence of pristane.

      The results indicated that mice that had received a weekly injection of ACK1 or BRK inhibitors developed a large array of serum anti-nuclear IgG antibodies, including but not limited to autoantibodies associated with SLE such as anti-histones, anti-chromatin, anti-U1-snRNP, anti-SSA, and anti-Ku, in comparison to mice in the control group (Revised Fig 3A). However, they did not develop glomerular deposits of IgG after 12 weeks of treatment, in contrast to mice that had received pristane (Revised Fig. 3B,C, Figure 3-figure supplement 1).

      These additional data suggest that inhibition of ACK1 and BRK stimulates the production of serum autoantibodies, which strengthens the claim that ACK1 and BRK kinase deficiency contributes to autoimmunity in BALB/cByJ.

      Reviewer #2 (Public Review):

      Summary: 

      In this manuscript, the authors revealed that genetic deficiencies of ACK1 and BRK are associated with human SLE. First, the authors found compound heterozygous deleterious variants in the kinase domains of the non-receptor tyrosine kinases (NRTK) TNK2/ACK1 in one multiplex family and PTK6/BRK in another family. Then, by an experimental blockade of ACK1 or BRK in a mouse SLE model, they found an increase in glomerular IgG deposits and circulating autoantibodies. Furthermore, they reported that ACK and BRK variants from the SLE patients impaired the MERTK-mediated anti-inflammatory response to apoptotic cells in human induced pluripotent stem cells (hiPSC)-derived macrophages. This work identified new SLE-associated ACK and BRK variants and a role for the NRTK TNK2/ACK1 and PTK6/BRK in efferocytosis, providing a new molecular and cellular mechanism of SLE pathogenesis.

      Strengths: 

      This work identified new SLE-associated ACK and BRK variants and a role for the NRTK TNK2/ACK1 and PTK6/BRK in efferocytosis, providing a new molecular and cellular mechanism of SLE pathogenesis.

      Weaknesses: 

      Although the manuscript is well-organized and clearly stated, there are some points below that should be considered:

      In this study, the authors used forward genetic analyses to identify novel gene mutations that may cause SLE, combined with GWAS studies of SLE. To further explore the importance of these variants, haplotype analysis of two candidate genes could be performed, to observe the evolution and selection relationship of candidate genes in the population (UK 1000 biobank, for example). 

      To investigate whether ACK1/TNK2 or BRK/PTK6 were subject to selection, we gathered data using different metrics quantifying negative selection in the human genome. We collected the f parameter from SnIPRE (1), LoFtool (2), and EvoTol (3), as well as intraspecies metrics from RVIS (4), LOEUF (5), and pLI (6) (including pRec). We also used our in-house CoNeS metric (7). None of these indicators suggest that the genes are under strong negative selection (Revised Figure 2-figure supplement 2). This is consistent with the deficiency being recessive. We also tested the variants with a MAF greater than 0.005. We found them to be neutral. We therefore did not test whether they were associated with any phenotype in the UK Biobank.

      Although the authors focused on SLE and macrophage efferocytosis in their studies, direct evidence of how macrophage efferocytosis significantly affects SLE is lacking. This point should at least be explicitly introduced and discussed by citing appropriate literature.

      We provide a more detailed description of the role of macrophage efferocytosis in autoimmunity and SLE in the revised manuscript. Specifically, we state (in the results section, paragraph: ACK1 and BRK kinase domain variants may lose the ability to link MERTK to RAC1, AKT and STAT3 activation for efferocytosis): “NRTKs such as ACK1 8 and PTK2/FAK 9 are also downstream targets of the TAM family receptor MERTK which is expressed on macrophages and controls the anti-inflammatory engulfment of apoptotic cells, a process known as efferocytosis 10-12. Efferocytosis allows for the clearance of apoptotic cells before they undergo necrosis and release intracellular inflammatory molecules, and simultaneously leads to increased production of anti-inflammatory molecules (TGFb, IL-10, and PGE2) and a decreased secretion of proinflammatory cytokines (TNF-alpha, IL-1b, IL-6) 10-14. In line with these findings, mice deficient in molecular components used by macrophages to efficiently perform efferocytosis, such as MFG-E8, MERTK, TIM4, and C1q, develop phenotypes associated with autoimmunity 10,11,14-27. Furthermore, defects in efferocytosis are also observed in patients with SLE and glomerulonephritis 14,28-31.”

      It is still not clear how the target molecules identified in this paper may influence macrophage efferocytosis. More direct evidence should be established. 

      Our studies show that wild-type ACK1 and BRK, but not the patient variants, are activated by MERTK, a key receptor that mediates the recognition of apoptotic cells. Our studies also show that the wild-type kinases, but not the variants, activate RAC1, which is necessary for engulfment, and phosphorylate AKT and STAT3, which are involved in the anti-inflammatory response to PtdSer recognition.

      The TAM family receptor MERTK mediates recognition of PtdSer on apoptotic cells via GAS6 and Protein S 10,15,32 leading to their engulfment, which involves activation of RAC1 for actin reorganization and the formation of a phagocytic cup 9,33. Using IP kinase assays we show that MERTK and GAS6 can activate the kinase activity of wild-type ACK1 8 or BRK but not of the patient’s ACK1 or BRK variant alleles (Figure 4D). To further support the role of ACK1 and BRK downstream from PtdSer recognition and uptake of apoptotic cells, we show that reference ACK1 and BRK alleles, in contrast to the patient variant alleles, can activate RAC1 to generate RAC-GTP which is necessary for engulfment 9,33 (Figure 4C).

      PtdSer recognition also typically stimulates an anti-inflammatory process mediated in part via AKT 34 and STAT3 and their target genes such as SOCS3 35-41 and results in the inhibition of LPS-mediated production of inflammatory mediators such as TNF and IL-1b, and the production of cytokines such as IL-10, TGFb 11,25-27,42. Consistent with this literature and the findings of the paper, we show that reference ACK1 and BRK, unlike the patient’s variant alleles, can phosphorylate AKT and STAT3 (Figure 4A, B). The role of ACK1 and BRK in these signaling pathways is further supported by our transcriptomics data comparing the response of controls, patients, and inhibitor-treated iPSC-derived macrophages to apoptotic thymocytes by RNA-seq. Specifically, we show that transcriptional repressors including the AKT targets ATF3, TGIF1, NFIL3, and KLF4, the STAT3 targets SOCS3 and DUSP5, as well as CEBPD and the inhibitor of E-BOX DNA Binding ID3 were among the top ten genes whose expression is induced by apoptotic cells in WT macrophages (Figure 4F), but this regulation was lost in mutant and inhibitor-treated macrophages (Figure 4F).

      For some transcriptional repressors mentioned in their studies, the authors should check whether there is clear experimental evidence. If not, it is recommended to supplement the experimental verifications for clarity.

      Transcriptional repressors including the AKT targets ATF3, TGIF1, NFIL3, and KLF4, the STAT3 targets SOCS3 and DUSP5, as well as CEBPD and the inhibitor of E-BOX DNA Binding ID3 were among the top ten genes whose expression is induced by apoptotic cells in WT macrophages (Figure 4F), but this regulation was lost in mutant and inhibitor-treated macrophages (Figure 4F).

      In the manuscript we cited published evidence, to the best of our knowledge, for the role of these genes in the regulation of inflammatory responses. Specifically we state: “ATF3, TGIF1, NFIL3, and KLF4 are involved in the negative regulation of inflammation in macrophages 35-38, SOCS3 is an inhibitor of the macrophage inflammatory response and DUSP5 is a negative regulator of ERK activation 39,40,43. These data suggest that the kinase domain of ACK1 and BRK contribute to the macrophage anti-inflammatory gene expression program driven by apoptotic cells.”

      In Figures 4C and 4D, it is seen that the usage of inhibitors causes cytoskeletal changes, however this reviewer would not have expected such large change. Did the authors check whether the cells die after heavy treatment by the inhibitors?

      We carefully examined the viability of isogenic WT, BRK and ACK1 mutant macrophages (left panel) and of WT macrophages treated with ACK1 or BRK inhibitors, and we did not observe changes in viability (Figure 4-figure supplement 2).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A crucial step in the development of SLE is the production of autoantibodies. It is shown in Figure 2F that inhibitors of ACK1/BRK enhanced the production of autoantibodies against histones and SSA in a pristane-induced SLE model, which is a significant result that could support the authors' claim. Strangely, this autoantigen panel does not include double-stranded DNA, RNP, or Sm, which should be presented regarding antibody production.

      We thank the reviewer for this comment. In the revised manuscript (Revised Figure 3 – Supplement 1) we added the remainder of the autoantibody panel, which includes double-stranded DNA, RNP, and Sm autoantibody levels. We also added the results for serum IgG autoantibody levels in BALB/cByJ mice that were treated for three months with DMSO, ACK1, or BRK inhibitors but did not receive a pristane injection (Revised Figure 3A). These data show that mice which received ACK1 or BRK inhibitors had increased serum IgG autoantibodies in comparison to DMSO treated controls.

      Additionally, if there is information that inhibitors of ACK1/BRK promote the differentiation of follicular helper T cells, memory B cells, and plasma cells in a pristane-induced SLE model, it could be considered indirect evidence supporting the authors' claims.

      These are not available at present to the best of our knowledge.

      Reviewer #2 (Recommendations For The Authors):

      Minor points:

      * In the literature, unpaired t-tests and ordinary one-way ANOVA (Tukey's multiple comparisons test) were used for statistical analysis, which requires data to be normally distributed. This part of the proposal is reflected in the text, and the non-conforming results need to be statistically analyzed using the non-parametric test of graphpad prism.

      We would like to thank the reviewer for pointing out this oversight. In the revised manuscript, for all applicable datasets, we tested whether the data were normally distributed using a Shapiro-Wilk normality test. For datasets that were normally distributed, statistical significance was determined by a Student t test or ordinary one-way ANOVA with Tukey’s multiple comparisons test, depending on the number of conditions being compared and the experimental setup. In contrast, for datasets that were not normally distributed, statistical significance was determined using a Mann-Whitney test, a Kruskal-Wallis test with multiple comparisons, or a Wilcoxon matched-pairs signed rank test, depending on the experimental setup. P values below 0.05 were considered significant for all statistical tests.
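      The selection logic described above can be summarised as a small decision function. This is a sketch of the decision step only, not the analysis itself, which the reviewer's comment indicates was run in GraphPad Prism. The function names are hypothetical, the Shapiro-Wilk P value is assumed to be computed elsewhere, and the mapping of "experimental setup" to paired versus unpaired designs is our reading of the response. The response does not name a paired parametric test, so that case is not distinguished here.

```python
ALPHA = 0.05  # significance threshold stated in the response


def is_normal(shapiro_p, alpha=ALPHA):
    """Treat data as normally distributed when Shapiro-Wilk does not reject."""
    return shapiro_p >= alpha


def choose_test(normal, n_groups, paired=False):
    """Return the name of the test suggested by the workflow described above."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if normal:
        # Parametric branch: t test for two groups, ANOVA with Tukey for more.
        if n_groups == 2:
            return "Student t test"
        return "one-way ANOVA with Tukey"
    # Non-parametric branch.
    if n_groups > 2:
        return "Kruskal-Wallis"
    if paired:
        return "Wilcoxon matched-pairs signed rank"
    return "Mann-Whitney"


# Example: two unpaired groups whose Shapiro-Wilk P value was 0.01.
selected = choose_test(is_normal(0.01), n_groups=2)
```

      Keeping the normality check separate from the choice of test makes it easy to report which branch was taken for each dataset.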

      The authors used different methods to represent the level of significant difference. Therefore, it is suggested that the significance level should be expressed by letters. 

      As suggested by the reviewer, in the revised manuscript we have designated the significance level throughout all figures using letters (p or q values).

      For RNA-seq, more information should be provided in the paper. For example, the correlation between sample biological replicates, the total number of differentially expressed genes, and randomly selected genes for qRT-PCR results verification.

      We would like to thank the reviewer for pointing out this oversight. In the revised manuscript we provided more information regarding the RNA-seq dataset, including a Principal Component Analysis (PCA) showing correlation between sample replicates (Revised Figure 4-figure supplement 1A), as well as a table indicating the number of upregulated and downregulated genes between relevant datasets (Revised Figure 4-figure supplement 1B).

      The results of the RNA-seq analysis indicated that ACK1 and BRK contribute to the macrophage anti-inflammatory gene expression program driven by apoptotic cells. The MERTK-dependent anti-inflammatory program elicited by apoptotic cells in macrophages is best evidenced by the reduction of LPS-mediated production of inflammatory mediators such as TNF or IL1b 25-27,34,44. Therefore, to validate the RNA-seq results in a functional manner, we tested the decrease of LPS-induced production of TNF and IL1b by apoptotic cells in isogenic WT, ACK1 deficient, and BRK deficient macrophages. Consistent with the RNA-seq data, the functional assays indicated that ACK1 and BRK kinase activities are required for the decrease of TNF and IL1b production induced by LPS in response to apoptotic cells (Revised Figure 4H,I).

      The raw data files for the RNA-seq analysis have been deposited in the NCBI Gene Expression Omnibus under accession number GEO: GSE118730.

      The authors did not have the formats for some of the citations correct. This should be fixed. 

      References were reformatted.

      (1) Eilertson, K. E., Booth, J. G. & Bustamante, C. D. SnIPRE: selection inference using a Poisson random effects model. PLoS Comput Biol 8, e1002806 (2012). https://doi.org:10.1371/journal.pcbi.1002806

      (2) Fadista, J., Oskolkov, N., Hansson, O. & Groop, L. LoFtool: a gene intolerance score based on loss-of-function variants in 60 706 individuals. Bioinformatics 33, 471-474 (2017). https://doi.org:10.1093/bioinformatics/btv602

      (3) Rackham, O. J., Shihab, H. A., Johnson, M. R. & Petretto, E. EvoTol: a protein-sequence based evolutionary intolerance framework for disease-gene prioritization. Nucleic Acids Res 43, e33 (2015). https://doi.org:10.1093/nar/gku1322

      (4) Petrovski, S., Wang, Q., Heinzen, E. L., Allen, A. S. & Goldstein, D. B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet 9, e1003709 (2013). https://doi.org:10.1371/journal.pgen.1003709

      (5) Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434-443 (2020). https://doi.org:10.1038/s41586-020-2308-7

      (6) Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285-291 (2016). https://doi.org:10.1038/nature19057

      (7) Rapaport, F. et al. Negative selection on human genes underlying inborn errors depends on disease outcome and both the mode and mechanism of inheritance. Proc Natl Acad Sci U S A 118 (2021). https://doi.org:10.1073/pnas.2001248118

      (8) Mahajan, N. P., Whang, Y. E., Mohler, J. L. & Earp, H. S. Activated tyrosine kinase Ack1 promotes prostate tumorigenesis: role of Ack1 in polyubiquitination of tumor suppressor Wwox. Cancer Res 65, 10514-10523 (2005). https://doi.org:10.1158/0008-5472.CAN-05-1127

      (9) Wu, Y., Singh, S., Georgescu, M. M. & Birge, R. B. A role for Mer tyrosine kinase in alphavbeta5 integrin-mediated phagocytosis of apoptotic cells. J Cell Sci 118, 539-553 (2005). https://doi.org:10.1242/jcs.01632

      (10) Scott, R. S. et al. Phagocytosis and clearance of apoptotic cells is mediated by MER. Nature 411, 207-211 (2001). https://doi.org:10.1038/35075603

      (11) Henson, P. M. & Bratton, D. L. Antiinflammatory effects of apoptotic cells. J Clin Invest 123, 2773-2774 (2013). https://doi.org:10.1172/JCI69344

      (12) Henson, P. M. Cell Removal: Efferocytosis. Annu Rev Cell Dev Biol 33, 127-144 (2017). https://doi.org:10.1146/annurev-cellbio-111315-125315

      (13) deCathelineau, A. M. & Henson, P. M. The final step in programmed cell death: phagocytes carry apoptotic cells to the grave. Essays Biochem 39, 105-117 (2003). https://doi.org:10.1042/bse0390105

      (14) Nagata, S. Apoptosis and Clearance of Apoptotic Cells. Annu Rev Immunol 36, 489-517 (2018). https://doi.org:10.1146/annurev-immunol-042617-053010

      (15) Cohen, P. L. et al. Delayed apoptotic cell clearance and lupus-like autoimmunity in mice lacking the c-mer membrane tyrosine kinase. J Exp Med 196, 135-140 (2002). https://doi.org:10.1084/jem.20012094

      (16) Hanayama, R. et al. Autoimmune disease and impaired uptake of apoptotic cells in MFG-E8-deficient mice. Science 304, 1147-1150 (2004). https://doi.org:10.1126/science.1094359

      (17) Miyanishi, M., Segawa, K. & Nagata, S. Synergistic effect of Tim4 and MFG-E8 null mutations on the development of autoimmunity. Int Immunol 24, 551-559 (2012). https://doi.org:10.1093/intimm/dxs064

      (18) Colonna, L., Parry, G. C., Panicker, S. & Elkon, K. B. Uncoupling complement C1s activation from C1q binding in apoptotic cell phagocytosis and immunosuppressive capacity. Clin Immunol 163, 84-90 (2016). https://doi.org:10.1016/j.clim.2015.12.017

      (19) Nagata, S., Hanayama, R. & Kawane, K. Autoimmunity and the clearance of dead cells. Cell 140, 619-630 (2010). https://doi.org:10.1016/j.cell.2010.02.014

      (20) Kimani, S. G. et al. Contribution of Defective PS Recognition and Efferocytosis to Chronic Inflammation and Autoimmunity. Front Immunol 5, 566 (2014). https://doi.org:10.3389/fimmu.2014.00566

      (21) Hanayama, R., Tanaka, M., Miwa, K., Shinohara, A., Iwamatsu, A. & Nagata, S. Identification of a factor that links apoptotic cells to phagocytes. Nature 417, 182-187 (2002). https://doi.org:10.1038/417182a

      (22) Kawano, M. & Nagata, S. Lupus-like autoimmune disease caused by a lack of Xkr8, a caspase-dependent phospholipid scramblase. Proc Natl Acad Sci U S A 115, 2132-2137 (2018). https://doi.org:10.1073/pnas.1720732115

      (23) Watanabe-Fukunaga, R., Brannan, C. I., Copeland, N. G., Jenkins, N. A. & Nagata, S. Lymphoproliferation disorder in mice explained by defects in Fas antigen that mediates apoptosis. Nature 356, 314-317 (1992). https://doi.org:10.1038/356314a0

      (24) Singer, G. G., Carrera, A. C., Marshak-Rothstein, A., Martinez, C. & Abbas, A. K. Apoptosis, Fas and systemic autoimmunity: the MRL-lpr/lpr model. Current opinion in immunology 6, 913-920 (1994).

      (25) Cvetanovic, M. & Ucker, D. S. Innate immune discrimination of apoptotic cells: repression of proinflammatory macrophage transcription is coupled directly to specific recognition. J Immunol 172, 880-889 (2004). https://doi.org:10.4049/jimmunol.172.2.880

      (26) Fadok, V. A., Bratton, D. L., Konowal, A., Freed, P. W., Westcott, J. Y. & Henson, P. M. Macrophages that have ingested apoptotic cells in vitro inhibit proinflammatory cytokine production through autocrine/paracrine mechanisms involving TGF-beta, PGE2, and PAF. J Clin Invest 101, 890-898 (1998). https://doi.org:10.1172/JCI1112

      (27) Voll, R. E., Herrmann, M., Roth, E. A., Stach, C., Kalden, J. R. & Girkontaite, I. Immunosuppressive effects of apoptotic cells. Nature 390, 350-351 (1997). https://doi.org:10.1038/37022

      (28) Herrmann, M., Voll, R. E., Zoller, O. M., Hagenhofer, M., Ponner, B. B. & Kalden, J. R. Impaired phagocytosis of apoptotic cell material by monocyte-derived macrophages from patients with systemic lupus erythematosus. Arthritis Rheum 41, 1241-1250 (1998). https://doi.org:10.1002/1529-0131(199807)41:7<1241::AID-ART15>3.0.CO;2-H

      (29) Baumann, I. et al. Impaired uptake of apoptotic cells into tingible body macrophages in germinal centers of patients with systemic lupus erythematosus. Arthritis Rheum 46, 191-201 (2002). https://doi.org:10.1002/1529-0131(200201)46:1<191::AID-ART10027>3.0.CO;2-K

      (30) Schrijvers, D. M., De Meyer, G. R. Y., Kockx, M. M., Herman, A. G. & Martinet, W. Phagocytosis of apoptotic cells by macrophages is impaired in atherosclerosis. Arterioscl Throm Vas 25, 1256-1261 (2005). https://doi.org:10.1161/01.ATV.0000166517.18801.a7

      (31) Morioka, S., Maueroder, C. & Ravichandran, K. S. Living on the Edge: Efferocytosis at the Interface of Homeostasis and Pathology. Immunity 50, 1149-1162 (2019). https://doi.org:10.1016/j.immuni.2019.04.018

      (32) Seitz, H. M., Camenisch, T. D., Lemke, G., Earp, H. S. & Matsushima, G. K. Macrophages and dendritic cells use different Axl/Mertk/Tyro3 receptors in clearance of apoptotic cells. J Immunol 178, 5635-5642 (2007). https://doi.org:10.4049/jimmunol.178.9.5635

      (33) Mao, Y. & Finnemann, S. C. Regulation of phagocytosis by Rho GTPases. Small GTPases 6, 89-99 (2015). https://doi.org:10.4161/21541248.2014.989785

      (34) Sen, P. et al. Apoptotic cells induce Mer tyrosine kinase-dependent blockade of NF-kappaB activation in dendritic cells. Blood 109, 653-660 (2007). https://doi.org:10.1182/blood-2006-04-017368

      (35) Vergadi, E., Ieronymaki, E., Lyroni, K., Vaporidi, K. & Tsatsanis, C. Akt Signaling Pathway in Macrophage Activation and M1/M2 Polarization. J Immunol 198, 1006-1014 (2017). https://doi.org:10.4049/jimmunol.1601515

      (36) Byles, V. et al. The TSC-mTOR pathway regulates macrophage polarization. Nat Commun 4, 2834 (2013). https://doi.org:10.1038/ncomms3834

      (37) Liao, X. et al. Kruppel-like factor 4 regulates macrophage polarization. J Clin Invest 121, 2736-2749 (2011). https://doi.org:10.1172/JCI45444

      (38) Roberts, A. W., Lee, B. L., Deguine, J., John, S., Shlomchik, M. J. & Barton, G. M. Tissue-Resident Macrophages Are Locally Programmed for Silent Clearance of Apoptotic Cells. Immunity 47, 913-927 e916 (2017). https://doi.org:10.1016/j.immuni.2017.10.006

      (39) Matsukawa, A. et al. Stat3 in resident macrophages as a repressor protein of inflammatory response. J Immunol 175, 3354-3359 (2005).

      (40) Sica, A. & Mantovani, A. Macrophage plasticity and polarization: in vivo veritas. J Clin Invest 122, 787-795 (2012). https://doi.org:10.1172/JCI59643

      (41) Yi, Z., Li, L., Matsushima, G. K., Earp, H. S., Wang, B. & Tisch, R. A novel role for c-Src and STAT3 in apoptotic cell-mediated MerTK-dependent immunoregulation of dendritic cells. Blood 114, 3191-3198 (2009). https://doi.org:10.1182/blood-2009-03-207522

      (42) Rothlin, C. V., Carrera-Silva, E. A., Bosurgi, L. & Ghosh, S. TAM receptor signaling in immune homeostasis. Annu Rev Immunol 33, 355-391 (2015). https://doi.org:10.1146/annurev-immunol-032414-112103

      (43) Seo, H. et al. Dual-specificity phosphatase 5 acts as an anti-inflammatory regulator by inhibiting the ERK and NF-kappaB signaling pathways. Sci Rep 7, 17348 (2017). https://doi.org:10.1038/s41598-017-17591-9

      (44) Camenisch, T. D., Koller, B. H., Earp, H. S. & Matsushima, G. K. A novel receptor tyrosine kinase, Mer, inhibits TNF-alpha production and lipopolysaccharide-induced endotoxic shock. J Immunol 162, 3498-3503 (1999).

    1. Author Response:

      eLife Assessment

      This manuscript makes an important contribution to the understanding of protein-protein interaction (PPI) networks by challenging the widely held assumption that their degree distributions uniformly follow a power law. The authors present convincing evidence that biases in study design, such as data aggregation and selective research focus, may contribute to the appearance of power-law-like distributions. While the power law assumption has already been questioned in network biology, the methodological rigor and correction procedures introduced here are valuable for advancing our understanding of PPI network structure.

      Thanks for this assessment which perfectly reflects our study.

      Reviewer #1 (Public Review):

      This manuscript was previously reviewed and this earlier evaluation resulted in two conflicting assessments. I fully endorse the favourable opinion of former Reviewer 1 and find most negative comments of former Reviewer 2 inappropriate.

      This work is absolutely necessary. Even though the authors find it difficult to be fully assertive in the end, their groundwork in trying to demonstrate the existence of bias in PPI data is undeniably valuable. Other authors have tried before to show the limitation of unequivocally assigning the degree distribution to a power law but these doubts have had a weak impact. This new study is a great opportunity to discuss further a concern for a simplistic view of PPI network topology. The recent contribution of Broido & Clauset was definitely one to bounce on. The approach of this new manuscript is compelling. Dividing the study in several parts, each reflecting an attempt to bring out commonly used shortcuts in PPI network analyses, makes sense.

      Surprisingly, the authors do not refer to the endless controversy of labeling hubs as party or date, which is another manifestation of the interpretative bias of PPI data.

      This is a good point. In particular, it may be interesting if hub nodes that emerge from considering only prey interactions differ regarding party and date nodes. We now refer to this distinction in the Discussion:

      “[...] Further work will be needed to establish if true hub proteins exist in the PPI network and what their role is. For instance, it was previously claimed (Han et al., 2004) – and controversially discussed (Agarwal et al., 2010) – that the correlation of gene expression values between hub nodes with their interaction partners follows a bimodal distribution, leading to the distinction of party (high correlation) and date (low correlation) hubs. In the future, it would be interesting to study if the ratio of party and date hubs changes when considering prey degree only.”

      The only worthy point prompted by former Reviewer 2 is the effect of spoke expansion. In their response, the authors suggest that it would probably extend questioning and even if it is considered as future work, it could be mentioned in the main manuscript.

      Thank you for this comment. We agree that considering different expansion methods is an interesting research question regarding its effect on the PL property. We have added the following sentences to the Discussion to highlight the opportunity for future work:

      “[...] An additional complexity arising in AP-MS studies is that more than two interaction partners can be detected. These n-ary interactions are commonly transformed into binary interactions using either the spoke model, which reports all interactions with the bait protein (as used by IntAct, for example), or the matrix expansion model, which reports all pairwise interactions. Both expansion models can, in principle, introduce false positives and it would be interesting to consider the effect of expansion model choice on the PL property in future work.”
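
      The difference between the two expansion models can be made concrete with a minimal sketch. It is only an illustration of the general idea, not code from the manuscript or from IntAct; the function names and the bait and prey labels ("A" to "D") are hypothetical.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Spoke model: pair the bait with each co-purified prey."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_expansion(bait, preys):
    """Matrix model: pair every two members of the purified group."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical pull-down: bait "A" co-purifies with preys "B", "C" and "D".
spoke = spoke_expansion("A", ["B", "C", "D"])
matrix = matrix_expansion("A", ["B", "C", "D"])

print(len(spoke), len(matrix))  # prints "3 6"
```

      In this toy case the spoke model yields only the three bait-prey pairs, whereas the matrix model also adds the three prey-prey pairs, so the choice of model changes the degree of every prey even though the underlying experiment is the same.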

      In the end, this submission is an invitation to constructively rethink the analysis of PPI networks and it feeds the discussion on modelling degree distributions that should not be considered as a solved issue.

      Reviewer #2 (Public Review):

      Many naturally occurring networks are assumed to have a power-law (PL) degree distribution. This assumption has certainly been widely held in the field of protein interactomes (PPIs), although important studies around 2010 have conclusively shown that many of these PL distributions are either the result of data mis-handling or of sloppy statistical procedures (see e.g. Porter and Stumpf in Science around 2014, which I would advise the authors to cite). The value of the present study is to introduce a new mechanism, experiment bias, to explain the appearance of such distributions in the PPI case, and in particular to show how correcting empirically for this mechanism can lead to a reappraisal of which proteins are genuine hubs in these networks. The claims are well supported by empirical evidence and some theoretical analysis. Overall, this is a worthwhile contribution and, while its significance is somewhat dented by the fact that the PL enthusiasm of many had already been tempered by the studies mentioned above,

      Thanks a lot for your constructive feedback. We now cite the work by Porter and Stumpf and have addressed your specific recommendations as detailed below.

      Reviewer #3 (Public Review):

      I would like to congratulate the authors on an impressive piece of work highlighting important real and potential biases, which may lead to power-law distributed node degrees in protein-protein interaction networks. This manuscript is easy to follow and very well written. I truly enjoyed the concise and convincing scientific presentation. Even if some of the concerns have already been discussed or raised in the past, the manuscript assesses potential biases in PPIs in a rigorous manner.

      I deem the following observations highly relevant to be communicated to the community again:

      (1) PL-like distributions emerge by aggregation of data sets alone.

      (2) Research interest in itself is PL-distributed and drives PL-like properties in PPI networks.

      (3) Bait usage is a major driver of PL-like behaviour.

      (4) Accounting for biases changes the biological interpretation of the networks.

      (5) Simulation studies further corroborate these findings.

      Thank you for this positive assessment of our work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewing editor:

      The biological significance of the results presented in this manuscript is the potential absence of active sequestration mechanisms in certain species, leading to variation in their ability to transport and store specific compounds, such as alkaloids. The concept of passive accumulation is introduced as an evolutionary intermediate between toxin consumption and sequestration.

      I agree with the reviewers' comments on the limitations of the current manuscript. Additionally, I'd like to raise a point about combining data from LC/MS and GC/MS as these techniques have different sensitivities. GC-MS excels in annotation, allowing for confident identification of detected compounds. However, it may have limitations in the number of extractable substances. Conversely, LC-MS/MS offers a broader range of detectable substances, but annotation can be more challenging. While methods to bridge this gap exist, the current approach might not fully account for the potential influence of the analysis equipment on the observed differences in alkaloid numbers between the Texas and Panama samples analyzed by LC-MS/MS. To address this, consider including data from both methods (if possible) to gain a more comprehensive understanding of the alkaloid profiles. Alternatively, analyzing the Texas and Panama samples with GC-MS could be considered for a more focused comparison with the other samples.

      Thank you for the suggestion. Unfortunately, we do not have GC-MS data for the Texas and Panama samples. While the strength of these two datasets is that they present two independent lines of data corroborating that “undefended” frogs have detectable alkaloid levels, we have more explicitly made clear for readers that the datasets should not be compared directly. We reviewed the text to check that we carefully acknowledge in the manuscript the higher sensitivity of our LC-MS assay, and we added more detail about the differences between the two assay types (section 4d): “The UHPLC-HESI-MSMS pipeline used on the samples from Panama and Texas allows for higher sensitivity to detect a broader array of compounds compared to our GC-MS methods, but has lower retention-time resolution and produces less reliable structural predictions. Furthermore, due to the lack of liquid-chromatography-derived references for poison-frog alkaloids, precise alkaloid annotations from the UHPLC-HESI-MSMS dataset could not be obtained. Therefore, the UHPLC-HESI-MSMS and GC-MS datasets are not directly comparable, and UHPLC-HESI-MSMS data are not included in Fig. 2”. We have also revised the asterisk accompanying the table to further reinforce that alkaloid numbers between the two assay types should not be compared. It now states: “Note that the UHPLC-HESI-MS/MS and GC-MS assays differed in both instrument and analytical pipeline, so “Alkaloid Number” values from the two assay types should not be compared to each other directly”. We further point out differences between the two assay types in section 2b: “Similarly, the analysis of UHPLC-HESI-MS/MS data was untargeted, and thus enables a broader survey of chemistry compared to that from prior GC-MS studies.”

      Finally, we point out that the output from the analytical pipeline for UHPLC-HESI-MSMS annotates compounds as “alkaloids,” using broader criteria than the targeted GC-MS component of our study. In an effort to make the datasets more comparable, at least conceptually, we now include an assessment of which alkaloids identified by UHPLC-HESI-MSMS match known molecular formulae and structural classes in frogs (see Table S6 and revised text on lines 335-343 and 410-415).

      Reviewer #1 (Public Review):

      This is a very relevant study, clearly with the potential of having a high impact on future research on the evolution of chemical defense mechanisms in animals. The authors present a substantial number of new and surprising experimental results, i.e., the presence in low quantities of alkaloids in amphibians previously deemed to lack these toxins. These data are then combined with literature data to weave the importance of passive accumulation mechanisms into a 4-phases scenario of the evolution of chemical defense in alkaloid-containing poison frogs.

      In general, the new data presented in the manuscript are of high quality and high scientific interest, the suggested scenario compelling, and the discussion thorough. Also, the manuscript has been carefully prepared with a high quality of illustrations and very few typos in the text. Understanding that the majority of dendrobatid frogs, including species considered undefended, can contain low quantities of alkaloids in their skin provides an entirely new perspective to our understanding of how the amazing specializations of poison frogs evolved. Although only a few non-dendrobatids were included in the GCMS alkaloid screening, some of these also included minor quantities of alkaloids, and the capacity of passive alkaloid accumulation may therefore characterize numerous other frog clades, or even amphibians in general.

      Thank you for the kind evaluation.

      While the overall quality of the work is exceptional, major changes in the structure of the submitted manuscript are necessary to make it easier for readers to disentangle scope, hypotheses, evidence and newly developed theories.

      Based on reviewer comments, we revised the manuscript structure substantially to make the different aspects of the paper more readily identifiable to readers. Specifically, we moved the content of Figure 2 into a new section in the introduction. We also added more introductory text to better introduce the main ideas of the new model and to summarize the scope and aim of the paper. We reorganized the Results section headings and moved Figure 1 (now Fig. 3) down into section 2c.

      Reviewer #2 (Public Review):

      Summary:

      This was a well-executed and well-written paper. The authors have provided important new datasets that expand on previous investigations substantially. The discovery that changes in diet are not so closely correlated with the presence of alkaloids (based on the expanded sampling of non-defended species) is important, in my opinion.

      Strengths:

      Provision of several new expanded datasets using cutting edge technology and sampling a wide range of species that had not been sampled previously. A conceptually important paper that provides evidence for the importance of intermediate stages in the evolution of chemical defense and aposematism.

      Thank you for the kind comments.

      Weaknesses:

      There were some aspects of the paper that I thought could be revised. One thing I was struck by is the lack of discussion of the potentially negative effects of toxin accumulation, and how this might play out in terms of different levels of toxicity in different species.

      Thank you for the suggestion. We now explicitly address the possible negative effects of toxin accumulation and how costs may play out with respect to varying levels of chemical defense among different organisms, including poison frogs. We note early on that, “short-term alkaloid feeding experiments (e.g., Daly et al., 1994; Sanchez et al., 2019) demonstrate that both defended and undefended dendrobatids can survive the immediate effects of alkaloid intake, although the degree of resistance and the alkaloids that different species can resist vary” (section 2c), and we address the sparse literature suggesting some species-level variation in alkaloid resistance in frogs. Later, we make the point that, “origins of chemical defenses are also shaped by the cost of resisting and accumulating toxins, which can change over evolutionary time as animals adapt to novel relationships with toxins” (section 2d). We broadly discuss costs of target-site resistance, a common mode of molecular resistance in poison frogs and other animals, and compensatory molecular adaptations that offset the costs. We also discuss examples from the literature of negative effects of high levels of resistance and toxin accumulation that are not completely offset. We also note that to the best of our knowledge, potential lifetime fitness costs to alkaloid consumption by dendrobatids have not been evaluated.

      Further, are there aspects of ecology or evolutionary history that might make some species less vulnerable to the accumulation of toxins than others? This could be another factor that strongly influences the ultimate trajectory of a species in terms of being well-defended. I think the authors did a good job in terms of describing mechanistic factors that could affect toxicity (e.g. potential molecular mechanisms) but did not make much of an attempt to describe potential ecological factors that could impact trajectories of the evolution of toxicity. This may have been done on purpose (to avoid being too speculative), but I think it would be worth some consideration.

      We agree that other factors can influence the trajectory of chemical defense. We incorporated these ideas into the new section 2d, which provides a somewhat brief overview of ecological factors that could influence the origins of chemical defense, the physiological costs of toxin resistance and accumulation, and some of the possible eco-evo factors that shape chemical defense once it evolves.

      In the discussion, the authors make the claim that poison frogs don't (seem to) suffer from eating alkaloids. I don't think this claim has been properly tested (the cited references don't adequately address it). To do so would require an experimental approach, ideally obtaining data on both lifespan and lifetime reproductive success.

      We agree with the reviewer that more data are necessary to make this broad claim, which we have removed. We revised this to state: “regardless, it is clear that all or nearly all dendrobatid poison frogs consume alkaloid-containing arthropods as part of their regular diet” (section 2c). We then expand on this statement with data from short-term experimental work that support the notion that at least some dendrobatids are resistant to (i.e., can survive) the immediate effects of alkaloids. We also point out later in the manuscript that, “as far as we are aware, the possible lifetime fitness costs (e.g., in reproductive success) of alkaloid consumption in dendrobatids have not been measured” (section 2d).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      While in general I am very open to "unorthodox" ways to write a manuscript (i.e., differing from the standard structure intro-methods-results-discussion) I feel there is much room for improvement in this case. When reading the manuscript line by line, I was several times totally uncertain about the scope and content of the original data in the manuscript. It is too often unclear which of the outlined theories are new and why they are presented, which hypotheses were tested and why, which data were newly obtained, which technological improvements led to the novel and surprising results, and why no alternative hypotheses are tested. I feel the authors need to fundamentally reconsider the structure of the manuscript - which does not mean everything needs to be rewritten, but some major reshuffling of paragraphs from one section to the other may already lead to substantial improvement. I will in the following list (not ordered by priority) different issues that I encountered, without always providing a specific suggestion for improvement - please come up with an improved structure that removes these issues in one way or the other!

      Thank you for the suggestions. We did our best to improve the structure of the paper. Specifically, we substantially revised the introduction to provide a clearer background of the ideas leading up to the new evolutionary model. We moved most of what was previously figure 2 (now Fig. 1) into an earlier part of the introduction in the main text. We moved what was previously figure 1 (now Fig. 3) to much later in the discussion (section 2c). We attempted to clarify and separate throughout the text the new data from existing data. Please see our responses below for additional details.

      Line 42-45: Please provide a reference on this statement on traversing adaptive landscapes.

      We added the following reference: Martin, CH and PC Wainwright. 2013. Multiple fitness peaks on the adaptive landscape drive adaptive radiation in the wild. Science 339: 208-211. https://doi.org/10.1126/science.1227710

      Line 50: Why are these phases "likely" to occur? - no evidence is presented for this hypothesized high likelihood. Presenting this scenario already in the second paragraph of the intro is very weird. Are these really the only possible phases? Wouldn't it be possible to come up with totally different scenarios? In my opinion, this specific four-phase scenario should be more clearly labelled as a novel theory presented in this paper, and perhaps it should come much later in the introduction.

      Thank you for the suggestion. We moved this paragraph down into a new subsection of the introduction. We also revised the language to clarify that the model is a new evolutionary theory based on new and existing ideas.

      Line 51: Here you use for the first time the term "elimination". While it is intuitively clear what is meant by it, there still could be different meanings. The alkaloids could simply be passively excreted, or they could be actively biochemically decomposed. Later in the Discussion the authors imply that elimination requires some kind of metabolic process, but this perhaps should be made clearer already in the introduction.

      We now spend more time in the introduction describing pharmacokinetics as well as the terms we used (including elimination), which are slightly modified from terms in pharmacokinetics.

      Figure 1. I have major concerns about this figure. I found the figure very confusing, and the authors really need to reconsider and modify (simplify) it. The figure caption starts with "Major processes involved..." as if this was established textbook knowledge rather than a totally hypothetical illustration of how different factors (sequestration, elimination....) can lead to defended or undefended phenotypes. Only later on in the caption it becomes clear this is just a suggestion/hypothesis/model: "we hypothesize...".

      We revised the figure (now Fig. 3) and its legend. It now starts with the following text: “Hypothesized physiological processes that interact to determine the defense phenotype.” We also simplify the figure by removing two lines and recoding the table (see comment below).

      Secondly, the way the graph is drawn suggests some kind of experimental result where specific evolutionary pathways lead to very specific degrees of "defendedness", recognizable by the points on the right axis stacked very precisely one above the other. Do you really want to imply that you want to suggest such a specific model, where particular accumulation/intake/elimination rates lead to exactly these outcomes? Also, wouldn't it be possible to somewhat simplify the categories in the table? Again, why so specific, is there any experimental evidence for it? Why sometimes 1 plus, 2 plus, 3 plus? Wouldn't it be better to just suggest categories such as strong, weak and absent?

      We simplified the figure by removing the secondary (dashed) passive accumulation and active sequestration lines. We also changed the + signs to “low,” “med,” or “high” and tried to simplify the text in the figure and in the legend.

      Line 101-103: "We propose ..." Here, as the concluding statement of the introduction, the authors suggest a very general hypothesis which seems rather disconnected from the four-phase model and from the experimental results. Here, at the latest, I would have expected to learn (1) what the overall scope of the paper is, (2) which kind of approaches were followed and which novel experimental results will be presented in the following, and (3) how the experimental results will be used to derive a new theory / novel. Again, it is obvious that the scope of the paper is broader than testing just a single and narrow hypothesis, but rather to support and develop a broader theory and evolutionary model, but this should be clear to readers once they arrive at this line.

      Thank you for the suggestion. We added a paragraph to the end of the first section of the introduction that outlines the content of the rest of the paper. We also reorganized some of the subheadings to make the flow of ideas and the source of data in each subsection clearer. We split up and moved what was previously in section 2a into parts of the introduction and discussion. We moved the results text about diet and the discussion about resistance to section 2a, to better provide data and discussion of phases 1 and 2.

      Figure 2. My opinion on this figure is much less strong than on Fig. 1. However, the authors may want to reconsider whether it really makes sense to show here all the historical trees and theories (which are not really systematically reviewed in the text) or if they maybe wish to go on with panel D only (the most recent tree and scenario, which is also used consistently for further discussion in the manuscript).

      We moved the content from Fig. 2A–C to the main text (now section 1b) and narrowed the focus of Fig. 2 (now Fig. 1) to what was previously panel 2D.

      Results and Discussion: The whole section on phases 1 to 2 is not based on any new results. This is OK (as I said, I have no problems with "unorthodox" manuscript structure) but it should be clearer to readers why this is presented here and what it represents. A new theory? A recapitulation of textbook knowledge? Something necessary to later understand the experimental results?

      We split up and moved what was previously in section 2a into parts of the introduction and discussion. Now, section 2a still focuses on phases 1 and 2 but presents the diet data from our study (phase 1) and a review of known resistance mechanisms (phase 2; previously in the discussion section).

      Line 168. Here we have arrived at the "core" of the paper, that is, the actual experimental results. Surprisingly, you find alkaloids in dendrobatids usually considered "undefended". This is great, surprising and of high importance. However, I am missing at least some technical/methodological discussion about this finding, except for the statement that it was based on GCMS. Why have previous studies not detected these alkaloids? Did you use particularly sensitive GCMS instruments? Did you look more in depth than it was done in previous studies? Can you totally exclude these contaminations/artefacts?

      We added the following paragraph to section 2b: “The large number of structures that we identified is in part due to the way we reviewed GC-MS data: in addition to searching for alkaloids with known fragmentation patterns, we also searched for anything that could qualify as an alkaloid mass spectrometrically but that may not match a previously known structure in a reference database. Similarly, the analysis of UHPLC-HESI-MS/MS data was untargeted, and thus enables a broader survey of chemistry compared to that from prior GC-MS studies. Structural annotations in our UHPLC-HESI-MS/MS analysis were made using CANOPUS, a deep neural network that is able to classify unknown metabolites based on MS/MS fragmentation patterns, with 99.7% accuracy in cross-validation (Dührkop et al., 2021).” We also moved the paragraph on contamination from the methods section into section 2b.

      Line 169. This sentence (and several others in the subsequent paragraphs) do a poor job in explaining the taxon and specimen sampling. The particular sentence in this line is unclear: Did you include 27 species of dendrobatids AND IN ADDITION representatives of the main undefended clades, or did these 27 species INCLUDE representatives of the main undefended clades?

      We now present a brief overview of sampling in the last paragraph of the introduction (section 1c). We clarified sampling of the species: “In total we surveyed 104 animals representing 32 species of Neotropical frogs including 28 dendrobatid species, two bufonids, one leptodactylid, and one eleutherodactylid (see Methods). Each of the major undefended clades in Dendrobatidae (Fig. 1, Table 1) is represented in our dataset, with a total of 14 undefended dendrobatid species surveyed.” We also reviewed and clarified similar language in other places in the text (e.g., section 2b).

      Line 177. "undefended lineages" - of dendrobatids or of frogs in general? Given that you also include non-dendrobatids.

      Dendrobatids. The sentence now reads “Overall, we detected alkaloids in skins from 13 of 14 undefended dendrobatid species included in our study, although often with less diversity and relatively lower quantities than in defended lineages (Fig. 2, Table 1, Table S3, Table S4).”

      Line 188: "defe" should probably be changed to "defended"?

      Corrected.

      Table 1. The taxon sampling clearly focuses on dendrobatids, with only a few other taxa. This is fine, however, it does not allow to test the hypothesis that something "special" predisposes dendrobatids to passive accumulation and alkaloid resistance. For this, a wider taxon sampling of other frog families would have been necessary to have a larger number of "control" data. Again, this is fine for the purpose of the study and is discussed later (line 399) but only very briefly. I feel it should be mentioned earlier on.

      Thank you for the suggestion. We now address this point earlier in the manuscript so that readers will not have the impression that there are sufficient data to infer that dendrobatids are predisposed to passive accumulation. We propose several phylogenetic alternatives, making it clear that determining the number and timing of origins of passive accumulation is not possible with our data (section 2c), ultimately noting that “discriminating a single origin [of passive accumulation] – no matter the timing – from multiple ones would require better phylogenetic resolution and more extensive alkaloid surveys, as we only assessed four non-dendrobatid species”.

      Reviewer #2 (Recommendations For The Authors):

      P2L60 - The description of figure 1 is somewhat confusing, as it first focuses on the graph in the bottom panel, then moves to describing aspects of the table (top panel), then back to the graph. I think it might make more sense to describe these two panels separately and in order.

      Thank you for the suggestion. We revised the figure (now Fig. 3) and its legend for clarity.

      P3L94 - Saying that three transitions makes this group "ideal" for studying complex phenotypic transitions is a bit hyperbolic, in my opinion. I suggest toning down this description.

      Thank you for the suggestion. We changed “ideal” to “suitable.”

      P3L101 - "We propose that changes in toxin metabolism through selection on mechanisms of toxin resistance likely play a major role in the evolution of acquired chemical defenses." This hypothesis appears to be a combination of earlier ideas, with a somewhat different emphasis. The authors acknowledge this and go through some of the earlier ideas, in the legend of figure 2. I would have preferred to see more discussion of this (particularly with reference to the history of the idea in reference to poison frogs) in the main body of the text.

      Thank you for the suggestion. We now more extensively discuss these prior studies in the introduction (section 1b and 1c). We also revised this figure (now Fig. 1) to focus on what was previously figure 2 panel D.

      P3L102 - Figure 2 - the phrase "Resistance to consuming some alkaloids" seems inappropriate - perhaps "Resistance to alkaloid poisoning after consumption" (or something similar) would be more accurate?

      We changed this to “Low alkaloid resistance”.

      P4L153 - "Accumulation of alkaloids in skin glands could help to prevent alkaloids from reaching their targets". This could be true, but why would skin glands be a preferred location of sequestration to avoid toxicity? The authors should explain why such glands would be particularly likely to serve as places of sequestration.

      Thank you for pointing out this ambiguity. We decided to remove our discussion of sequestration into skin glands, because it is challenging to discuss this process in toxin resistance without too much speculation.

      P4L154 - "Although direct evidence is lacking, some poison frogs may biotransform alkaloids into less toxic forms until they can be eliminated from the body, e.g., using cytochrome p450s". This would seem to contradict the argument of this process being a precursor to accumulating effective toxins.

      We agree that these processes seem contradictory. However, a few papers are starting to suggest that metabolic detoxification may be initially useful for lineages that eventually evolve toxin sequestration. This is because detoxification or elimination (clearance) of toxins allows increased intake of toxins. Because there is some delay in the removal of toxins from an animal’s body, increased consumption ultimately leads to higher toxin exposure and possible toxin diffusion into various body cavities, which can increase selective pressure to evolve other kinds of resistance mechanisms. This pattern was shown in an experiment with toxin-resistant fruit flies (Douglas et al., 2022). Many toxin-sequestering species still metabolize some toxins even if they sequester the majority – as we argue, the defense phenotype is the result of a balance among intake, elimination, and accumulation, all of which can interact simultaneously. In poison frogs specifically there is some evidence that p450s are upregulated after toxin consumption (Caty et al. 2019). One possible prediction is that the type of resistance that an animal has changes as toxin sequestration evolves. We talk a bit more about these patterns in section 2e.

      P5L186 - Table 1 legend - change "defe" to "defended"

      Corrected.

      P12L414 - "do not appear to suffer substantially from doing so as it is part of their regular diet". I don't think this claim has been properly tested, as of yet. It would require looking at the effects of a diet with and without toxins over the lifespan of the frogs, and the impact of that difference on both survival and fertility.

      Reviewer 1 also made this important observation, which we address above.

      P12L432 - "for toxin-resistant organisms, there is little cost to accumulating a toxin, yet there may be benefits in doing so." Yet toxin resistance may itself be a continuous trait, so there may be a cost that depends on the degree of toxin resistance. I don't see why the authors are proposing toxin resistance as a discrete trait when their main point is that toxin accumulation is not.

      We agree and removed this statement.

    1. Author response:

      ANALYTICAL

      (1) Figure 3 shows that the relationship between learning rate and informativeness for our rats was very similar to that shown with pigeons by Gibbon and Balsam (1981). We used multiple criteria to establish the number of trials to learn in our data, with the goal of demonstrating that the correspondence between the data sets was robust. To establish that they are effectively the same does require using an equivalent decision criterion for our data as was used for Gibbon and Balsam’s data. However, the criterion they used (at least one peck at the response key on at least 3 out of 4 consecutive trials) cannot be sensibly applied to our magazine entry data because rats make magazine entries during the inter-trial interval (whereas pigeons do not peck at the response key in the inter-trial interval). Therefore, evidence for conditioning in our paradigm must involve comparison between the response rate during the CS and the baseline response rate. There are two ways one could adapt the Gibbon and Balsam criterion to our data. One way is to use a non-parametric signed rank test for evidence that the CS response rate exceeds the pre-CS response rate, and to adopt a statistical criterion equivalent to Gibbon and Balsam’s 3-out-of-4 consecutive trials (p<.3125). The second way is to estimate the nDkl for the criterion used by Gibbon and Balsam. This could be done by assuming there are no responses in the inter-trial interval and a response probability of at least 0.75 during the CS (their criterion). This would correspond to an nDkl of 2.2 (odds ratio 27:1). The obtained nDkl could then be applied to our data to identify when the distribution of CS response rates has diverged by an equivalent amount from the distribution of pre-CS response rates.
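
      The value .3125 quoted above appears to be the chance probability of meeting a 3-out-of-4 criterion when each trial is equally likely to go either way. The following is a minimal sketch of that arithmetic, assuming a binomial model with p = 0.5 per trial; the function name is ours and is not taken from the authors' analysis code.

```python
from math import comb

def binomial_tail(successes, trials, p=0.5):
    """Probability of at least `successes` hits in `trials` independent trials."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# At least 3 of 4 trials by chance: (4 + 1) / 16
print(binomial_tail(3, 4))  # 0.3125
```

      Under this reading, a signed rank test with alpha set to .3125 is about as lenient as the original pigeon criterion.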

      (2) A single regression line, as shown in Figure 6, is the simplest possible model of the relationship between response rate and reinforcement rate and it explains approximately 80% of the variance in response rate. Fixing the log-log slope at 1 yields the maximally simple model. (This regression is done in the logarithmic domain to satisfy the homoscedasticity assumption.) When transformed into the linear domain, this model assumes a truly scalar relation (linear, intercept at the origin) and assumes the same scale factor and the same scalar variability in response rates for both sets of data (ITI and CS). Our plot supports such a model. Its simplicity is its own motivation (Occam’s razor).
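
      To make the one-parameter model concrete, the sketch below fits log(response rate) = log(reinforcement rate) + b with the slope fixed at 1, so that the only free parameter is the scale factor exp(b). It is an illustration under our own assumptions (hypothetical function name, made-up numbers in the usage lines), not the authors' fitting procedure.

```python
import math

def fit_unit_slope(reinforcement_rates, response_rates):
    """Fit log(y) = log(x) + b with slope fixed at 1; return (scale, r_squared)."""
    log_x = [math.log(x) for x in reinforcement_rates]
    log_y = [math.log(y) for y in response_rates]
    # With the slope fixed at 1, the least-squares intercept is the mean log ratio.
    b = sum(ly - lx for lx, ly in zip(log_x, log_y)) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    ss_res = sum((ly - (lx + b)) ** 2 for lx, ly in zip(log_x, log_y))
    ss_tot = sum((ly - mean_y) ** 2 for ly in log_y)
    return math.exp(b), 1.0 - ss_res / ss_tot

# Hypothetical data in which responding is exactly 5 times the reinforcement rate.
scale, r_squared = fit_unit_slope([0.01, 0.1, 1.0], [0.05, 0.5, 5.0])
print(scale, r_squared)
```

      Back in the linear domain the fitted model is response rate = scale × reinforcement rate, a line through the origin.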

      If regression lines are fitted to the CS and ITI data separately, there is a small increase in explained variance (R² = 0.82). We leave it to further research to determine whether such a complex model, with 4 parameters, is required. However, we do not think the present data warrant comparing the simplest possible model, with one parameter, to any more complex model for the following reasons:

      · When a brain—or any other machine—maps an observed (input) rate to a rate it produces (output rate), there is always an implicit scalar. In the special case where the produced rate equals the observed rate, the implicit scalar has value 1. Thus, there cannot be a simpler model than the one we propose, which is, in and of itself, interesting.

      · The present case is an intuitively accessible example of why the MDL (Minimum Description Length) approach to model complexity (Barron, Rissanen, & Yu, 1998; Grünwald, Myung, & Pitt, 2005; Rissanen, 1999) can yield a very different conclusion from the conclusion reached using the Bayesian Information Criterion (BIC) approach. The MDL approach measures the complexity of a model when given N data specified with a precision of B bits per datum by computing (or approximating) the sum of the maximum likelihoods of the model’s fits to all possible sets of N data with B bits of precision per datum. The greater the sum over the maximum likelihoods, the more complex the model, that is, the greater its measured wiggle room, its capacity to fit data. Recall that von Neumann remarked to Fermi that with 4 parameters he could fit an elephant. His deeper point was that multi-parameter models bring neither insight nor predictive power; they explain only post hoc, after one has adjusted their parameters in the light of the data. For realistic data sets like ours, the sums of maximum likelihoods are finite but astronomical. However, just as the Stirling approximation allows one to work with astronomical factorials, it has proved possible to develop readily computable approximations to these sums, which can be used to take model complexity into account when comparing models. Proponents of the MDL approach point out that the BIC is inadequate because models with the same number of parameters can have very different amounts of wiggle room. A standard illustration of this point is the contrast between a logarithmic model and a power-function model. Log regressions must be concave, whereas power-function regressions can be concave, linear, or convex; yet they have the same number of parameters (one or two, depending on whether one counts the scale parameter that is always implicit). The MDL approach captures this difference in complexity because it measures wiggle room; the BIC approach does not, because it only counts parameters.

      · In the present case, one is comparing a model with no pivot and no vertical displacement at the boundary between the black dots and the red dots (the 1-parameter unilinear model) to a bilinear model that allows both a change in slope and a vertical displacement for both lines. The 4-parameter model is superior if we use the BIC to take model complexity into account. However, the 4-parameter model has ludicrously more wiggle room. It will provide excellent fits (high maximum likelihood) to data sets in which the red points have slope > 1, slope 0, or slope < 0 and in which it is also true that the intercept for the red points lies well below or well above the black points (non-overlap in the marginal distribution of the red and black data). The 1-parameter model, on the other hand, will provide terrible fits to all such data (very low maximum likelihoods). Thus, we believe the BIC does not properly capture the immense actual difference in complexity between the 1-parameter model (unilinear with slope 1) and the 4-parameter model (bilinear with neither the slope nor the intercept fixed in the linear domain).

      · In any event, because the pivot (change in slope between black and red data sets), if any, is small and likewise for the displacement (vertical change), it suffices for now to know that the variance captured by the 1-parameter model is only marginally improved by adding three more parameters. Researchers using the properly corrected measured rate of head poking to measure the rate of reinforcement a subject expects can therefore assume that they have an approximately scalar measure of the subject’s expectation. Given our data, they won’t be far wrong even near the extremes of the values commonly used for rates of reinforcement. That is a major advance in current thinking, with strong implications for formal models of associative learning. It implies that the performance function that maps from the neurobiological realization of the subject’s expectation is not an unknown function. On the contrary, it’s the simplest possible function, the scalar function. That is a powerful constraint on brain-behavior linkage hypotheses, such as the many hypothesized relations between mesolimbic dopamine activity and the expectation that drives responding in Pavlovian conditioning (Berridge, 2012; Jeong et al., 2022; Y.  Niv, Daw, Joel, & Dayan, 2007; Y. Niv & Schoenbaum, 2008).
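
      The notion of wiggle room used in the bullets above can be illustrated with a toy case. The sketch below sums the maximised likelihood over every possible binary data set of length n, once for a Bernoulli model whose probability is free to be fitted and once for a model whose probability is fixed in advance. The Bernoulli setting and the function names are our own simplifying assumptions; this is not the computation that would be applied to the regression models discussed here.

```python
from math import comb

def free_model_wiggle_room(n):
    """Sum of maximised likelihoods over all binary data sets of length n."""
    total = 0.0
    for k in range(n + 1):
        p_hat = k / n  # maximum-likelihood estimate for a data set with k ones
        total += comb(n, k) * p_hat**k * (1 - p_hat) ** (n - k)
    return total

def fixed_model_wiggle_room(n, p=0.5):
    """Same sum for a model with no free parameter: likelihoods sum to 1."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))

for n in (2, 10, 100):
    print(n, free_model_wiggle_room(n), fixed_model_wiggle_room(n))
```

      The fixed model always sums to 1 because it cannot adapt to the data, whereas the sum for the free model grows with n. That growth is the quantity that the MDL approach treats as complexity.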

      The data in Figure 6 are taken from the last 5 sessions of training. The exact number of sessions was somewhat arbitrary but was chosen to meet two goals: (1) to capture asymptotic responding, which is why we restricted this to the end of the training, and (2) to obtain a sufficiently large sample of data to estimate reliably each rat’s response rate. We have checked what the data look like using the last 10 sessions, and can confirm it makes very little difference to the results.

      Finally, as noted in the reviews, the relationship between the contextual rate of reinforcement and ITI responding should also be evident if we had measured context responding prior to introducing the CS. However, there was no period in our experiment when rats were given unsignalled reinforcement (such as is done during “magazine training” in some experiments). Therefore, we could not measure responding based on contextual conditioning prior to the introduction of the CS. This is a question for future experiments that use an extended period of magazine training or “poor positive” protocols in which there are reinforcements during the ITIs as well as during the CSs. The learning rate equation has been shown to predict reinforcements to acquisition in the poor-positive case (Balsam, Fairhurst, & Gallistel, 2006).

      (3) One of us (CRG) has earlier suggested that responding appears abruptly when the accumulated evidence that the CS reinforcement rate is greater than the contextual rate exceeds a decision threshold (C.R. Gallistel, Balsam, & Fairhurst, 2004). The new more extensive data require a more nuanced view. Evidence about the manner in which responding changes over the course of training is to some extent dependent on the analytic method used to track those changes. We presented two different approaches. The approach shown in Figures 7 and 8, extending that developed by Harris (2022), assumes a monotonic increase in response rate and uses the slope of the cumulative response rate to identify when responding exceeds particular milestones (percentiles of the asymptotic response rate). This analysis suggests a steady rise in responding over trials. Within our theoretical model, this might reflect an increase in the animal’s certainty about the CS reinforcement rate with accumulated evidence from each trial. While this method should be able to distinguish between a gradual change and a single abrupt change in responding (Harris, 2022), it may not distinguish between a gradual change and multiple step-like changes in responding and cannot account for decreases in response rate.

      The other analytic method we used relies on the information-theoretic measure of divergence, the nDkl (Gallistel & Latham, 2023), to identify each point of change (up or down) in the response record. With that method, we discern three trends. First, the onset tends to be abrupt in that the initial step up is often large (an increase in response rate by 50% or more of the difference between its initial value and its terminal value is common and there are instances where the initial step is to the terminal rate or higher). Second, there is marked within-subject variability in the response rate, characterised by large steps up and down in the parsed response rates following the initial step up, but this variability tends to decrease with further training (there tend to be fewer and smaller steps in both the ITI response rates and the CS response rate as training progresses). Third, the overall trend, seen most clearly when one averages across subjects within groups, is toward a moderately higher rate of responding later in training than after the initial rise. We think that the first tendency reflects an underlying decision process whose latency is controlled by diminishing uncertainty about the two reinforcement rates and hence about their ratio. We think that decreasing uncertainty about the true values of the estimated rates of reinforcement is also likely to be an important part of the explanation for the second tendency (decreasing within-subject variation in response rates). It is less clear whether diminishing uncertainty can explain the trend toward a somewhat greater difference in the two response rates as conditioning progresses. It is perhaps worth noting that the distribution of the estimates of the informativeness ratio is likely to be heavy tailed and have peculiar properties (as witness, for example, the distribution of the ratio of two gamma distributions with arbitrary shape and scale parameters) but we are unable at this time to propound an explanation of the third trend.

      (4) There is an error in the description provided in the text. The pre-CS period used to measure the ITI responding was 10 s rather than 20 s. There was always at least a 5-s gap between the end of the previous trial and the start of the pre-CS period.

      (5) Details about model fitting will be added in a revision. The question about fitting a single model or multiple models to the data in Figure 6 is addressed in response 2 above. In Figure 6, each rat provides 2 behavioural data points (ITI response rate and CS response rate) and 2 values for reinforcement rate (1/C and 1/T). There is a weak but significant correlation between the ITI and CS response rates (r = 0.28, p < 0.01; log transformed to correct for heteroscedasticity). By design, there is no correlation between the log reinforcement rates (r = 0.06, p = .404).

      CONCEPTUAL

      (1) It is important for the field to realize that the RW model cannot be used to explain the results of Rescorla’s (Rescorla, 1966; Rescorla, 1968, 1969) contingency-not-pairing experiments, despite what was claimed by Rescorla and Wagner (Rescorla & Wagner, 1972; Wagner & Rescorla, 1972) and has subsequently been claimed in many modelling papers and in most textbooks and reviews (Dayan & Niv, 2008; Y. Niv & Montague, 2008). Rescorla programmed reinforcements with a Poisson process. The defining property of a Poisson process is its flat hazard function; the reinforcements were equally likely at every moment in time when the process was running. This makes it impossible to say when non-reinforcements occurred and, a fortiori, to count them. Non-reinforcements are causal events in the RW algorithm and in subsequent versions of it. Their effects on associative strength are essential to the explanations proffered by these models. Non-reinforcements (failures to occur, updates when reinforcement is set to 0, hence also the lambda parameter) can have causal efficacy only when the successes may be predicted to occur at specified times (during “trials”). When reinforcements are programmed by a Poisson process, there are no such times. Attempts to apply the RW formula to reinforcement learning soon foundered on this problem (Gibbon, 1981; Gibbon, Berryman, & Thompson, 1974; Hallam, Grahame, & Miller, 1992; L.J. Hammond, 1980; L. J. Hammond & Paynter, 1983; Scott & Platt, 1985). The enduring popularity of the delta-rule updating equation in reinforcement learning depends on “big-concept” papers that don’t fit models to real data and discretize time into states while claiming to be real-time models (Y. Niv, 2009; Y. Niv, Daw, & Dayan, 2005).
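
      The flat hazard function mentioned above follows in one line from the exponential distribution of waiting times in a Poisson process. Here the symbol \lambda for the programmed reinforcement rate is our notation for this worked step (it is not the lambda parameter of the RW model); f is the density of the waiting time to the next reinforcement and S is its survival function.

```latex
h(t) \;=\; \frac{f(t)}{S(t)}
      \;=\; \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}}
      \;=\; \lambda
      \qquad \text{for all } t \ge 0 .
```

      Because h(t) does not depend on t, no moment is more likely than any other to contain a reinforcement, so there is no moment at which a reinforcement can be said to have been omitted.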

      The information-theoretic approach to associative learning, which sometimes historically travels as RET (rate estimation theory), is unabashedly and inescapably representational. It assumes a temporal map and arithmetic machinery capable in principle of implementing any implementable computation. In short, it assumes a Turing-complete brain. It assumes that whatever the material basis of memory may be, it must make sense to ask of it how many bits can be stored in a given volume of material. This question is seldom posed in associative models of learning, nor by neurobiologists committed to the hypothesis that the Hebbian synapse is the material basis of memory. Many, including the new Nobelist, Geoffrey Hinton, would agree that the question makes no sense. When you assume that brains learn by rewiring themselves rather than by acquiring and storing information, it makes no sense.

      When a subject learns a rate of reinforcement, it bases its behavior on that expectation, and it alters its behavior when that expectation is disappointed. Subjects also learn probabilities when they are defined. They base some aspects of their behavior on those expectations, making computationally sophisticated use of their representation of the uncertainties (Balci, Freestone, & Gallistel, 2009; Chan & Harris, 2019; J. A. Harris, 2019; J.A. Harris & Andrew, 2017; J. A. Harris & Bouton, 2020; J. A. Harris, Kwok, & Gottlieb, 2019; Kheifets, Freestone, & Gallistel, 2017; Kheifets & Gallistel, 2012; Mallea, Schulhof, Gallistel, & Balsam, 2024 in press).

      (2) Rate estimation theory is oblivious to the temporal order in which experience with different predictors occurs. The matrix computation finds the additive solution, if it exists, to the data so far observed, on the assumption that predicted rates have remained the same. This is the stationarity assumption, which is implicit in a rate computation and was made explicit in the formulation of RET (C.R. Gallistel, 1990). When the additive solution does not exist, the RET algorithm treats the compound of two predictors as a third predictor, and computes the additive solution to the 3-predictor problem. Because it is oblivious to the order in which the data have been acquired, it predicts one-trial overshadowing and retroactive blocking and unblocking (C.R. Gallistel, 1990 pp 439 & 452-455).
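
      As an illustration of what an additive solution looks like, the sketch below treats the two-predictor case as a pair of linear equations: the reinforcements counted while a predictor is present equal the sum, over predictors, of each predictor's rate times the time it was present together with the first. This is our reading of the additive partition for a simple case, with hypothetical names and numbers; it is not the published RET implementation.

```python
def additive_rates(t_a, t_b, t_ab, n_a, n_b):
    """Solve t_a*r_a + t_ab*r_b = n_a and t_ab*r_a + t_b*r_b = n_b.

    t_a, t_b : total time each predictor was present
    t_ab     : time both predictors were present together
    n_a, n_b : reinforcements counted while each predictor was present
    """
    det = t_a * t_b - t_ab * t_ab
    if det == 0:
        # The predictors never occur apart, so their rates cannot be separated.
        raise ValueError("no unique additive solution")
    r_a = (n_a * t_b - t_ab * n_b) / det
    r_b = (t_a * n_b - t_ab * n_a) / det
    return r_a, r_b

# Context present for 100 s; CS present for 20 s of that time;
# all 10 reinforcements arrive while the CS is on.
context_rate, cs_rate = additive_rates(t_a=100, t_b=20, t_ab=20, n_a=10, n_b=10)
print(context_rate, cs_rate)
```

      The solution credits the CS with the full rate of 0.5 reinforcements per second and the context with none. Only cumulative times and counts enter the equations, which is why the order in which the experience was acquired does not matter.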

      The RET algorithm is but one component of the information-theoretic model of associative learning (aka TATAL, The Analytic Theory of Associative Learning; Wilkes & Gallistel, 2016). It solves the assignment-of-credit problem, not the change-detection problem. Because rates of reinforcement do sometimes change, the stationarity assumption, which is essential to the RET algorithm, must be tested when each new reinforcement occurs and when the interval since the last reinforcement has become longer than would be expected or the number of reinforcements has become significantly fewer than would be expected given the current estimate of the probability of reinforcement (C. R. Gallistel, Krishan, Liu, Miller, & Latham, 2014). In the information-theoretic approach to associative learning, detecting non-stationarity is done by an information-theoretic change-detecting algorithm. The algorithm correctly predicts that omitted reinforcements to extinction will be a constant (C.R. Gallistel, 2024 under review; Gibbon, Farrell, Locurto, Duncan, & Terrace, 1980). To put the prediction another way, unreinforced trials to extinction will increase in proportion to the trials/reinforcement during training (C.R. Gallistel, 2012; Wilkes & Gallistel, 2016). In other words, it predicts the best and most systematic data on the partial reinforcement extinction effect (PREE) known to us. The profound challenge to neo-Hullian delta-rule updating models that is posed by the PREE has been recognized for the better part of a century. To the best of our knowledge, no other formalized model of associative learning has overcome this challenge (Dayan & Niv, 2008; Mellgren, 2012). Explaining extinction algorithmically is straightforward when one adopts an information-theoretic perspective, because computing, reinforcement by reinforcement, the Kullback-Leibler divergence of a sequence of earlier rate (or probability!) estimates from the most recent estimate and multiplying the vector of divergences by the vector of effective sample sizes (C. R. Gallistel & Latham, 2022) detects and localizes changes in rates and probabilities of reinforcement (C.R. Gallistel, 2024 under review). The computation presupposes the existence of a temporal map, a time-stamped record of past events. This supposition is strongly resisted by neuroscience-oriented reinforcement-learning modelers, who try to substitute the assumption of decaying eligibility traces.
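
      The change-detecting computation described above can be sketched for the case of reinforcement rates. The example assumes exponentially distributed inter-reinforcement intervals, for which the Kullback-Leibler divergence has a closed form, and it weights each divergence by the number of reinforcements behind the earlier estimate. That weighting is a simple stand-in of ours for the effective sample size and may differ from the published definition, and the function names are hypothetical.

```python
import math

def kl_exponential(rate_p, rate_q):
    """D(p || q) in nats for exponential distributions with the given rates."""
    return math.log(rate_p / rate_q) + rate_q / rate_p - 1.0

def ndkl_profile(reinforcement_times):
    """Weighted divergence of each earlier rate estimate from the latest one."""
    n_total = len(reinforcement_times)
    latest_rate = n_total / reinforcement_times[-1]
    profile = []
    for k in range(1, n_total):
        # Rate estimate based on the first k reinforcements only.
        earlier_rate = k / reinforcement_times[k - 1]
        profile.append(k * kl_exponential(earlier_rate, latest_rate))
    return profile

# One reinforcement per second for 5 s, then one every 10 s.
times = [1, 2, 3, 4, 5, 15, 25, 35, 45, 55]
profile = ndkl_profile(times)
print(profile.index(max(profile)) + 1)  # reinforcement after which the rate changed
```

      In this made-up record the weighted divergence peaks at the fifth reinforcement, which is where the rate dropped. A record with a constant rate yields a profile of zeros.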

      The very interesting Pearce-Ganesan findings (Ganesan & Pearce, 1988) are not predicted by RET, but nor do they run counter to its predictions. RET has nothing to say about how subjects categorize appetitive reinforcements; nor, at this time, does the information-theoretic approach to an understanding of associative learning have anything to say about that.

      The same is not true for the Betts, Brandon & Wagner results (Betts, Brandon, & Wagner, 1996). They pretrained a blocking cue that predicted a painful paraorbital shock to one eye of a rabbit. This cue elicited an anticipatory blink in the threatened eye. It also potentiated the startle reflex made to a loud noise in one ear. A new cue was then introduced, which always occurred in compound with the pretrained blocking cue. In one group, the painful shock continued to be delivered to the same eye as before; in another group, it was delivered to the skin around the other eye. In the group that continued to receive the shock to the same eye, the old cue effectively blocked conditioning of the new cue for both the eyeblink and the potentiated startle response. However, in the group for which the location of the shock changed to the other eye, the old cue did not block conditioning of the eyeblink response to the new cue but did block conditioning of the startle response to the new cue. The information-theoretic analysis of associative learning focusses on the encoding of measurable predictive temporal relationships, rather than on general and, to our mind, vague notions like CS processing and US processing. A painful shock elicits fear in a rabbit no matter where on the body surface it is experienced, because fear is a reaction to a very broad category of dangers, and fear potentiates the startle reflex regardless of the threat that causes fear. Once the prediction of such a threat is encoded, redundant cues will not be encoded that same way because the RET algorithm blocks the encoding of redundant predictions. A painful shock near an eye elicits a blink of the threatened eye as well as the fear that potentiates the startle. An appropriate encoding for the eye blink must specify the location of the threat. RET will attribute prediction of the threat to the new eye to the new cue (and not to the old cue, the pretrained blocker) while continuing to attribute to the old cue the prediction of a fear-causing threat, because the change in location does not alter that prediction. Therefore, the new cue will be encoded as predicting the new location of the threat to the eye, but not as predicting the large category of non-specific threats that elicit fear and the potentiation of the startle, because that prediction remains valid. Changing that prediction would violate the stationarity assumption; predictive relations do not change unless the data imply that they must have changed. Unless we have made a slip in our logic, this would seem to explain Betts et al.’s (1996) results. It does so with no free parameters, unlike AESOP, which has a notoriously large number of free parameters.

      Balci, F., Freestone, D., & Gallistel, C. R. (2009). Risk assessment in man and mouse. Proceedings of the National Academy of Sciences, 106(7), 2459-2463. doi:10.1073/pnas.0812709106

      Balsam, P. D., Fairhurst, S., & Gallistel, C. R. (2006). Pavlovian contingencies and temporal information. Journal of Experimental Psychology: Animal Behavior Processes, 32, 284-294.

      Barron, A., Rissanen, J., & Yu, B. (1998). The minimum description length principle in coding and modeling. IEEE Transactions on Information Theory, 44(6), 2743-2760.

      Berridge, K. C. (2012). From prediction error to incentive salience: Mesolimbic computation of reward motivation. European Journal of Neuroscience.

      Betts, S. L., Brandon, S. E., & Wagner, A. R. (1996). Dissociation of the blocking of conditioned eyeblink and conditioned fear following a shift in US locus. Animal Learning and Behavior, 24(4), 459-470.

      Chan, C. K. J., & Harris, J. A. (2019). The partial reinforcement extinction effect: The proportion of trials reinforced during conditioning predicts the number of trials to extinction. Journal of Experimental Psychology: Animal Learning and Cognition, 45(1). doi:10.1037/xan0000190

      Dayan, P., & Niv, Y. (2008). Reinforcement learning: The good, the bad and the ugly. Current Opinion in Neurobiology, 18(2), 185-196.

      Gallistel, C. R. (1990). The organization of learning. Cambridge, MA: Bradford Books/MIT Press.

      Gallistel, C. R. (2012). Extinction from a rationalist perspective. Behav Processes, 90, 66-88. doi:10.1016/j.beproc.2012.02.008

      Gallistel, C. R. (2024 under review). Reconceptualized associative learning. Perspectives on Behavioral Science (Special Issue for SQAB 2024).

      Gallistel, C. R., Balsam, P. D., & Fairhurst, S. (2004). The learning curve: Implications of a quantitative analysis. Proceedings of the National Academy of Sciences, 101(36), 13124-13131.

      Gallistel, C. R., Krishan, M., Liu, Y., Miller, R. R., & Latham, P. E. (2014). The perception of probability. Psychological Review, 121, 96-123. doi:10.1037/a0035232

      Gallistel, C. R., & Latham, P. E. (2022). Bringing Bayes and Shannon to the Study of Behavioral and Neurobiological Timing. Timing & Time Perception, 1-61. doi:10.1163/22134468-bja10069

      Ganesan, R., & Pearce, J. M. (1988). Effect of changing the unconditioned stimulus on appetitive blocking. Journal of Experimental Psychology: Animal Behavior Processes, 14, 280-291.

      Gibbon, J. (1981). The contingency problem in autoshaping. In C. M. Locurto, H. S. Terrace, & J. Gibbon (Eds.), Autoshaping and conditioning theory (pp. 285-308). New York: Academic.

      Gibbon, J., & Balsam, P. (1981). Spreading association in time. In C. M. Locurto, H. S. Terrace, & J. Gibbon (Eds.), Autoshaping and conditioning theory (pp. 219-253). New York: Academic Press.

      Gibbon, J., Berryman, R., & Thompson, R. L. (1974). Contingency spaces and measures in classical and instrumental conditioning. Journal of the Experimental Analysis of Behavior, 21(3), 585-605. doi: 10.1901/jeab.1974.21-585

      Gibbon, J., Farrell, L., Locurto, C. M., Duncan, H. J., & Terrace, H. S. (1980). Partial reinforcement in autoshaping with pigeons. Animal Learning and Behavior, 8, 45–59. doi:10.3758/BF03209729

      Grünwald, P. D., Myung, I. J., & Pitt, M. A. (2005). Advances in minimum description length: theory and applications. Cambridge, MA: MIT Press.

      Hallam, S. C., Grahame, N. J., & Miller, R. R. (1992). Exploring the edges of Pavlovian contingency space: An assessment of contingency theory and its various metrics. Learning and Motivation, 23, 225-249.

      Hammond, L. J. (1980). The effect of contingency upon the appetitive conditioning of free operant behavior. Journal of the Experimental Analysis of Behavior, 34, 297-304. doi:10.1901/jeab.1980.34-297

      Hammond, L. J., & Paynter, W. E. (1983). Probabilistic contingency theories of animal conditioning: A critical analysis. Learning and Motivation, 14, 527-550. doi:10.1016/0023-9690(83)90031-0

      Harris, J. A. (2019). The importance of trials. Journal of Experimental Psychology: Animal Learning and Cognition, 45(4).

      Harris, J. A. (2022). The learning curve, revisited. Journal of Experimental Psychology: Animal Learning and Cognition, 48, 265-280.

      Harris, J. A., & Andrew, B. J. (2017). Time, Trials and Extinction. Journal of Experimental Psychology: Animal Learning and Cognition, 43(1), 15-29.

      Harris, J. A., & Bouton, M. E. (2020). Pavlovian conditioning under partial reinforcement: The effects of non-reinforced trials versus cumulative CS duration. Journal of Experimental Psychology: Animal Learning and Cognition, 46, 256-272.

      Harris, J. A., Kwok, D. W. S., & Gottlieb, D. A. (2019). The partial reinforcement extinction effect depends on learning about nonreinforced trials rather than reinforcement rate. Journal of Experimental Psychology: Animal Learning and Cognition, 45(4). doi:10.1037/xan0000220

      Jeong, H., Taylor, A., Floeder, J. R., Lohmann, M., Mihalas, S., Wu, B., . . . Namboodiri, V. M. K. (2022). Mesolimbic dopamine release conveys causal associations. Science. doi:10.1126/science.abq6740

      Kheifets, A., Freestone, D., & Gallistel, C. R. (2017). Theoretical Implications of Quantitative Properties of Interval Timing and Probability Estimation in Mouse and Rat. Journal of the Experimental Analysis of Behavior, 108(1), 39-72. doi:10.1002/jeab.261

      Kheifets, A., & Gallistel, C. R. (2012). Mice take calculated risks. Proceedings of the National Academy of Sciences, 109, 8776-8779. doi:10.1073/pnas.1205131109

      Mallea, J., Schulhof, A., Gallistel, C. R., & Balsam, P. D. (2024 in press). Both probability and rate of reinforcement can affect the acquisition and maintenance of conditioned responses. Journal of Experimental Psychology: Animal Learning and Cognition.

      Mellgren, R. (2012). Partial reinforcement extinction effect. In N. M. Seel (Ed.), Encyclopedia of the Sciences of Learning. Boston, MA: Springer.

      Niv, Y. (2009). Reinforcement learning in the brain. Journal of Mathematical Psychology, 53, 139-154.

      Niv, Y., Daw, N. D., & Dayan, P. (2005). How fast to work: response vigor, motivation and tonic dopamine. In Y. Weiss, B. Schölkopf, & J. R. Platt (Eds.), NIPS 18 (pp. 1019–1026). Cambridge, MA: MIT Press.

      Niv, Y., Daw, N. D., Joel, D., & Dayan, P. (2007). Tonic dopamine: Opportunity costs and the control of response vigor. Psychopharmacology, 191(3), 507-520.

      Niv, Y., & Montague, P. R. (2008). Theoretical and empirical studies of learning. In P. W. Glimcher et al. (Eds.), Neuroeconomics: Decision-Making and the Brain (pp. 329–349). New York: Academic Press.

      Niv, Y., & Schoenbaum, G. (2008). Dialogues on prediction errors. Trends in Cognitive Sciences, 12(7), 265-272. doi:10.1016/j.tics.2008.03.006

      Rescorla, R. A. (1966). Predictability and the number of pairings in Pavlovian fear conditioning. Psychonomic Science, 4, 383-384.

      Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66(1), 1-5. doi:10.1037/h0025984

      Rescorla, R. A. (1969). Conditioned inhibition of fear resulting from negative CS-US contingencies. Journal of Comparative and Physiological Psychology, 67, 504-509.

      Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II (pp. 64-99). New York: Appleton-Century-Crofts.

      Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle. The Computer Journal, 42, 260–269. doi:10.1093/comjnl/42.4.260

      Scott, G. K., & Platt, J. R. (1985). Model of response-reinforcement contingency. Journal of Experimental Psychology: Animal Behavior Processes, 11(2), 152-171.

      Wagner, A. R., & Rescorla, R. A. (1972). Inhibition in Pavlovian conditioning: Application of a theory. In R. A. Boakes & S. Halliday (Eds.), Inhibition and learning. New York: Academic.

      Wilkes, J. T., & Gallistel, C. R. (2016). Information Theory, Memory, Prediction, and Timing in Associative Learning (original long version).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important study addresses how 3' splice site choice is modulated by the conserved spliceosome-associated protein Fyv6. The authors provide compelling evidence that Fyv6 functions to enable selection of 3' splice sites distal to a branch point and in doing so antagonizes more proximal, suboptimal 3' splice sites. The study would be improved through a more nuanced discussion of alternative possibilities and models, for instance in discussing the phenotypic impact of Fyv6 deletion.

      We thank the editors and reviewers for their supportive comments and assessment of this manuscript. We have improved the discussion at several points as suggested by the reviewers to include discussion of alternative possibilities.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      A key challenge at the second chemical step of splicing is the identification of the 3' splice site of an intron. This requires recruitment of factors dedicated to the second chemical step of splicing and exclusion of factors dedicated to the first chemical step of splicing. Through the highest resolution cryoEM structure of the spliceosome to date, the authors show that the binding site for Fyv6, a factor dedicated to the second chemical step of splicing, is mutually exclusive with the binding site for a distinct factor dedicated to the first chemical step of splicing, highlighting that splicing factors bind to the spliceosome at a specific stage not only by recognizing features specific to that stage but also by competing with factors that bind at other stages. The authors further reveal that Fyv6 functions at the second chemical step to promote selection of 3' splice sites distal to a branch point and thereby discriminate against proximal, suboptimal 3' splice sites. Lastly, the authors show by cryoEM that Fyv6 physically interacts with the RNA helicase Prp22 and by genetics that Fyv6 functionally interacts with this factor, implicating Fyv6 in 3'SS proofreading and mRNA release from the spliceosome. The evidence for this study is robust, with the inclusion of genomics, reporter assays, genetics, and cryoEM. Further, the data overall justify the conclusions, which will be of broad interest.

      Strengths:

      (1) The resolution of the cryoEM structure of Fyv6-bound spliceosomes at the second chemical step of splicing is exceptional (2.3 Angstroms at the catalytic core; 3.0-3.7 Angstroms at the periphery), providing the best view of this spliceosomal intermediate in particular and the core of the spliceosome in general.

      (2) The authors observe by cryoEM three distinct states of this spliceosome, each distinguished from the next by progressive loss of protein factors and/or RNA residues. The authors appropriately refrain from overinterpreting these states as reflecting distinct states in the splicing cycle, as too many cryoEM studies are prone to do, and instead interpret these observations to suggest interdependencies of binding. For example, when Fyv6, Slu7, and Prp18 are not observed, neither are the first and second residues of the intron, which otherwise interact, suggesting an interdependence between 3' splice site docking on the 5' splice site and binding of these second step factors to the spliceosome.

      (3) Conclusions are supported from multiple angles.

      (4) The interaction between Fyv6 and Syf1, revealed by the cryoEM structure, was shown to account for the temperature-sensitive phenotypes of a fyv6 deletion, through a truncation analysis.

      (5) Splicing changes were observed in vivo both by indirect copper reporter assays and directly by RT-PCR.

      (6) Changes observed by RNA-seq are validated by RT-PCR.

      (7) The authors go beyond simply observing a general shift to proximal 3'SS usage in the fyv6 deletion by RNA-seq by experimentally varying branch point to 3' splice site distance in a reporter and demonstrating in a controlled system that Fyv6 promotes distal 3' splice sites.

      (8) The importance of the Fyv6-Syf1 interaction for 3'SS recognition is demonstrated by truncations of both Fyv6 and of Syf1.

      (9) In general, the study was executed thoroughly and presented clearly.

      We thank the reviewer for their recognition of the strengths of our multi-faceted approach that led to highly supported conclusions.

      Weaknesses:

      (1) Despite the authors' restraint in interpreting the three states of the spliceosome observed by cryoEM as sequential intermediates along the splicing pathway, it would be helpful to the general reader to explicitly acknowledge the alternative possibility that the different states simply reflect decomposition from one intermediate during isolation of the complex (i.e., the loss of protein is an in vitro artifact, if an informative one).

      We thank the reviewer for noticing our restraint in interpreting these structures, and we agree that the scenario described by the reviewer is a possibility. We have now explicitly mentioned this in the Discussion on lines 755-757.

      (2) The authors acknowledge that for prp8 suppressors of the fyv6 deletion, suppression may be indirect, as originally proposed by the Query and Konarska labs - that is, that defects in the second step conformation of the spliceosome can be indirectly suppressed by compensating, destabilizing mutations in the first step spliceosome. Whereas some of the other suppressors of the fyv6 deletion can be interpreted as impacting directly the second step spliceosome (e.g., because the gene product is only present in the second step conformation), it seems that many more suppressors beyond prp8 mutants, especially those corresponding to bulky substitutions, which would more likely destabilize than stabilize, could similarly act indirectly by destabilization of first step conformation. The authors should acknowledge this where appropriate (e.g., for factors like Prp8 that are present in both first and second step conformations).

      We agree that this is also a possibility and have now included this on lines 480-486.

      Reviewer #2 (Public Review):

      In this manuscript, Senn, Lipinski, and colleagues report on the structure and function of the conserved spliceosomal protein Fyv6. Pre-mRNA splicing is a critical gene expression step that occurs in two steps, branching and exon ligation. Fyv6 had been recently identified by the Hoskins lab as a factor that aids exon ligation (Lipinski et al., 2023), yet the mechanistic basis for Fyv6 function was less clear. Here, the authors combine yeast genetics, transcriptomics, biochemical assays, and structural biology to reveal the function of Fyv6. Specifically, they describe that Fyv6 promotes the usage of distal 3'SSs by stabilizing a network of interactions that include the RNA helicase PRP22 and the spliceosome subunit SYF1. They discuss a generalizable mechanism for splice site proofreading by spliceosomal RNA helicases that could be modulated by other, regulatory splicing factors.

      This is a very high-quality study, which expertly combines various approaches to provide new insights into the regulation of 3'SS choice, docking, and undocking. The cryo-EM data are also of excellent quality and substantially extend previous yeast P complex structures. This is also supported by the authors' use of the latest data analysis tools (Relion-5, AlphaFold2 multimer predictions, ModelAngelo). The authors re-evaluate published EM densities of yeast spliceosome complexes (B*, C, C*, P) for the presence or absence of Fyv6, substantiate Fyv6 as a 2nd step specific factor, confirm it as the homolog of the human protein FAM192A, and provide a model for how Fyv6 may fit into the splicing pathway. The biochemical experiments on probing the splicing effects of BP to 3'SS distances after Fyv6 KO, genetic experiments to probe Fyv6 and Syf1 domains, and the suppressor screening add substantially to the study and are well executed. The manuscript is clearly written and we particularly appreciated the nuanced discussions, for example of an alternative model by which Prp22 influences 3'SS undocking. The research findings will be of great interest to the pre-mRNA splicing community.

      We thank the reviewer for their positive comments on our manuscript.

      We have only few comments to improve an already strong manuscript.

      Comments:

      (1) Can the authors comment on how they justify K+ ion positions in their models (e.g. the K+ ion bridging G-1 and G+1 nucleotides)? How do they discriminate e.g. in the 'G-1 and G+1' case K+ from water?

      The assignment of K+ at this position is justified by both longer coordination distances and relatively high cryo-EM density compared to structured water molecules in the same vicinity. We have added a panel (Figure 3-figure supplement 4C) to show the density for the G-1/G+1 bridging K+ ion and the adjacent density for putative water molecules that coordinate the ion. The K+ ion density is larger and has stronger signal than that of the adjacent water molecules. The coordination distances are also longer than would be expected for Mg2+. For these reasons and because K+ was present in the purification buffer, we modelled the density as K+.

      (2) The authors comment on Yju2 and Fyv6 assignments in all yeast structures except for the ILS. Can the authors comment on if they have also looked into the assignment of Yju2 in the yeast ILS structure in the same manner? While it is possible that Fyv6 could dissociate and Yju2 reassociate at the P to ILS transition, this would merit a closer look given that in the yeast P complex Yju2 had been misassigned previously.

      We thank the reviewer for pointing out this very interesting topic! We have used ModelAngelo to analyze the S. cerevisiae ILS structure for support of density assignment as Yju2 (and not Fyv6). This analysis supports the assignment as Yju2 in this structure and we have no evidence to doubt its presence in those particular purified spliceosomes. We have updated Figure 4- figure supplement 1B accordingly.

      That being said, we do think that this issue should be studied more carefully in the future. The S. cerevisiae ILS structure (5Y88) was determined by purifying spliceosome complexes with a TAP-tag on Yju2. So the conclusion that Yju2 is part of the ILS spliceosome involves some circular logic: Yju2 is part of ILS spliceosome complexes because it is present in ILS complexes purified with Yju2. We also note that Yju2 was absent in ILS complexes recently determined from metazoans by the Plaschka group. We have added some additional nuance to the Discussion to raise this important mechanistic point at lines 711-718.

      (3) For accessibility to a general reader, figures 1c, d, e, 2a, b would benefit from additional headings or labels, to immediately convey what is being displayed. It is also not clear to us if Fig 1e might fit better in the supplement and be instead replaced by Supplementary Figure 1a (wt), b (delta upf1), and a new c (delta fyv6) and new d (delta upf1, delta fyv6). This may allow the reader to better follow the rationale of the authors' use of the Fyv6/Upf1 double deletion.

      We thank the reviewer for the suggestion and have updated Figures 1 C-E to include additional information in the headings and labels. We have not changed the labels in Figures 2A, B but have added additional clarifying language to the legend.

      In terms of rearranging the figures, we thank the reviewer for the suggestion but have decided that the figures are best left in their current ordering.

      (4) The authors carefully interpret the various suppressor mutants, yet to a general reader the authors may wish to focus this section on only the most critical mutants for a better flow of the text.

      We thank the reviewer for this suggestion. While this section of the manuscript does contain (to quote Reviewer #3) “extensive new information regarding functional interactions”, it was a bit long. We have reduced this section of the manuscript by ~200 words for a more focused presentation for general readers.

      Reviewer #3 (Public Review):

      In this manuscript the authors expand their initial identification of Fyv6 as a protein involved in the second step of pre-mRNA splicing to investigate the transcriptome-wide impact of Fyv6 on splicing and gain a deeper understanding of the mechanism of Fyv6 action.

      They first use deep sequencing of transcripts in cells depleted of Fyv6 together with Upf1 (to limit loss of mis-spliced transcripts) to identify broad changes in the transcriptome due to loss of Fyv6. This includes both changes in overall gene expression that are not deeply discussed, as well as alterations in choice of 3' splice sites, which is the focus of the rest of the manuscript.

      They next provide the highest resolution structure of the post-catalytic spliceosome to date; providing unparalleled insight into details of the active site and peripheral components that haven't been well characterized previously.

      Using this structure they identify functionally critical interactions of Fyv6 with Syf1 but not Prp22, Prp8 and Slu7. Finally, a suppressor screen additionally provides extensive new information regarding functional interactions between these second step factors.

      Overall this manuscript reports new and essential information regarding molecular interactions within the spliceosome that determine the use of the 3' splice site. It would be helpful, especially to the non-expert, to summarize these in a table, figure or schematic in the discussion.

      We thank the reviewer for the positive comments and suggestions. We did include a summary figure in panel 7H. However, it was a bit buried. To highlight the summary figure more clearly, we have moved panel 7H to its own figure (Fig. 8).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The resolution of some panels is poor, nearly illegible (e.g., Supp Fig 1A, B).

      The resolution of panels in supplemental figure 1 has been increased. However, this may be an artifact of the PDF conversion process. We will pay attention to this during the publication process.

      (2) Panel S6B: 6HYU is a structure of DHX8, not DDX8

      We have corrected DDX8 to DHX8 in Supplemental Fig. S6D and associated figure legend.

      (3) The result that Syf1 truncations can suppress the Fyv6 deletion is impressive. The subsequent discussion seems muddled. A discussion of Fyv6 binding at the first step, instead of Yju2, doesn't seem relevant here (though worthy of consideration in the discussion), given that the starting mutation is the Fyv6 deletion. Further, conjuring rebinding of Yju2 based on the data in the paper seems unnecessarily speculative (assumes that biochemical state III is on pathway), unless I am unaware of some other evidence for such rebinding. Instead, a simpler explanation would seem to be that in the absence of Fyv6, Syf1 inappropriately binds Yju2 instead at the second step and that deletion of the common Fyv6/Yju2 binding site on Syf1 suppresses this defect. In this case, the ts phenotype of the Fyv6 deletion would result from inappropriate binding of Yju2, and the splicing defect would be due to loss of Fyv6 activity. Alternatively, especially considering the work of the labs of Query and Konarska, the authors should consider the possibility that i) the Fyv6 deletion destabilizes the second step conformation, shifting an equilibrium to the first step conformation, and that ii) the Syf1 truncation destabilizes binding of Yju2, thereby restoring the equilibrium. In this case the ts phenotype of the Fyv6 deletion is due to a disturbed equilibrium and the splicing defect is due to the failure of Fyv6 to function at the second step.

      We believe the reviewer is specifically referencing the final paragraph of this Results section (the paragraph that comes just before the section “Mutations in many different splicing factors…”). In retrospect, we agree that our discussion was convoluted. In particular, we emphasized rebinding of Yju2 based on its presence in the cryo-EM structure of the yeast ILS complex. However, given some uncertainties about whether or not Yju2 is a bona fide ILS component (as discussed above), we don’t think it is appropriate to over-emphasize rebinding of Yju2 and have decided to incorporate the elegant mechanisms proposed by the reviewer. This paragraph has now been edited accordingly (lines 386-395).

      (4) The authors imply they have performed biochemical studies, which I think is misleading. Of course, RT-PCR and primer extension assays for example are performed in vitro, but these are an analysis of RNA events that occurred in vivo. In my view a higher threshold should be used for defining "biochemistry". To me "biochemistry" would imply that the authors have, for example, investigated 3' splice site usage in splicing extracts of the fyv6 deletion or engaged in an analysis of the Syf1-Fyv6 interaction involving the expression of the interacting domains in bacteria followed by a binding analysis in the test tube.

      We disagree with the reviewer on this point. Biochemistry is defined as the “branch of sciences concerned with the chemical substances, reactions, and physico-chemical processes which occur within living organisms; biological or physical chemistry.” (Oxford English Dictionary). Biochemical studies are not defined by whether or not they take place in vitro, in vivo, or even in silico. Indeed, much of the history of biochemistry (especially in studies of metabolism, for example) involved experiments occurring in vivo that reported on the molecular properties and mechanisms of biological processes. We think many of our experiments fall into this category, including our structure/function analysis of splicing factors and the use of the ACT1-CUP1 reporter substrate.

      (5) The monovalents are shown; inositol phosphate is shown; is the binding of Prp22 to RNA shown?

      We have added a panel to Figure 3-figure supplement 4D showing density for the 3' exon within Prp22.

      (6) The authors invoke undocking of the 3'SS in the P complex. Where is the 3'SS in the ILS? The author's model predicts: undocked.

      In all ILS structures to date, the 3′ SS is undocked, in agreement with this prediction. We have now noted this observation in line 760.

      (7) Would be helpful to show fyv6 deletion in Fig 1b.

      We have included growth data for an additional fyv6 deletion strain (in a cup1Δ background) in Figure 1b. The results are quite similar to the upf1Δ background except with slightly worse growth at 23°C.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments

      (1) Fig.3b is the arrow indicating the right rotation?

      This typo has been fixed.

      (2) Fig.4b, panel H is annotated, which should read 'F'.

      This typo has been fixed.

      (3) Line 178: "Finally, we analyzed the sequence features of the alternative 3ʹ SS activated by loss of Fyv6." We would suggest 'used after' instead of 'activated by'.

      We have replaced ‘activated by’ with ‘with increased use after’.

      (4) In Line 544, the authors speculate on a Slu7 requirement for 3'SS docking and on 3'SS docking maintenance. In the results section (Line 265) they however only mention the latter possibility. These statements should be consistent.

      We thank the reviewer for pointing this out. We have added a reference to docking maintenance to the results section at line 325.

      (5) Line 476: "Unexpectedly, Prp22 I1133R was actually deleterious when Fyv6 was present for this reporter." We suggest removing "actually".

      We have removed ‘actually’.

      (6) The authors describe the observed changes in splicing events in absolute numbers (e.g. in Fig 1c). To better assess for the reader whether these numbers reflect large or small effects of Fyv6 in defining mRNA isoforms, it would be more useful to state these as percent changes of total events or to provide a reference number for how many introns are spliced in S.c. See for example the statements in Lines 132 and 145.

      We have added a percentage at line 138 that indicates ~20% of introns in yeast showed splicing changes.

      Reviewer #3 (Recommendations For The Authors):

      Do the authors have a proposed explanation for the observed DGE in non-intron containing genes in the Fyv6 depleted cells?

      The simplest explanation is that this is an indirect effect due to splicing changes occurring in other genes (such as transcription factors, ribosomal protein genes, etc.). It is possible that this can be further dissected in the future using shorter-term depletion of Fyv6 by Anchor Away or AID-tagging. However, that is beyond the scope of the current manuscript, and we do not wish to comment on these non-intron-containing genes further at present.

      Figure 2A - What is going on with the events that show no FAnS value under one condition (i.e. are up against the X or Y axis)? These are of interest as most on the Y- axis are blue.

      The events along one of the axes denote alternative splice sites that are only detected under one condition (either when Fyv6 is present or when it is absent). At this stage, we do not wish to interpret these events further since most have a relatively low number of reads overall.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study reports single-cell RNA sequencing results of lung adenocarcinoma, comparing 4 treatment-naive and 5 post-neoadjuvant chemotherapy tumor samples. The authors claim that there is metabolic reprogramming in tumor cells as well as stromal and immune cells after chemotherapy.

      The most significant findings concern the macrophages: there are more pro-tumorigenic cells, i.e. CD45+CD11b+ARG+ cells, after chemotherapy. In the treatment-naive samples, more anti-tumorigenic CD45+CD11b+CD86+ macrophages are found. They sorted each population and performed functional analyses.

      Strengths:

      Comparison of the treatment-naive and post-chemotherapy samples of lung adenocarcinoma.

      Weaknesses:

      (1) Lengthy descriptive clustering analysis, with indistinct direct comparisons between the treatment-naive and the post-chemotherapy samples.

      Thank you for your detailed review and valuable feedback. We have simplified the descriptive clustering analysis by removing redundant parts and retaining only the key content relevant to our findings. This should help readers to more easily grasp and focus on the main results.

      (2) No statistical analysis was performed for the comparison.

      We appreciate your constructive feedback and are committed to improving our research methodology and reporting to enhance the scientific rigor of our studies.

      (3) Difficult to match data to the text.

      Thank you for your feedback. We understand that there were difficulties in matching the data to the text. We have reviewed the manuscript carefully to ensure that all data points are clearly linked to the corresponding sections in the text.

      (4) ARG1 is a cytosolic enzyme that can be detected by intracellular staining after fixation. It is unclear how the staining and sorting was performed to measure function of sorted cells.

      We apologize for the error caused by miscommunication within our research team. We are currently using both ARG1 and CD206 antibodies in our studies. Due to a communication error, the technician mistakenly assumed ARG1 was another name for CD206 (MRC1), resulting in the incorrect labeling of CD206 as ARG1 in our experimental records. In reality, we used the CD206 antibody, which is consistent with the same surface marker shown in figure 6e. We have made corrections in the manuscript and experimental figures. Thank you for pointing this out, and we regret any misunderstanding this may have caused.

      Reviewer #2 (Public Review):

      In this study, Huang et al. performed a scRNA-seq analysis of lung adenocarcinoma (LUAD) specimens from 9 human patients, including 5 who received neoadjuvant chemotherapy (NCT), and 4 without treatment (control). The new data were produced using 10x Genomics technology and comprise 83622 cells, of which 50055 and 33567 cells were derived from the NCT and control groups, respectively. Data were processed with the R Seurat package, and various downstream analyses were conducted, including CNV, GSVA, functional enrichment, cell-cell interaction, and pseudotime trajectory analyses. Additionally, the authors performed several experiments for in vitro and in vivo validation of their findings, such as immunohistochemistry, immunofluorescence, flow cytometry, and animal experiments.

      The study extensively discusses the heterogeneity of cell populations in LUAD, comparing the samples with and without chemotherapy. However, there are several shortcomings that diminish the quality of this paper:

      • The number of cells included in the dataset is limited, and the number of patients from different groups is low, which may reduce the attractiveness of the dataset for other researchers to reuse. Additionally, there is no metadata on patients' clinical characteristics, such as age, sex, history of smoking, etc., which would be valuable for future studies.

      Thank you for your insightful feedback. We recognize that the limited number of cells and the small number of patients from different groups in our dataset may affect its appeal for reuse by other researchers. Additionally, we acknowledge the absence of metadata on patients' clinical characteristics, such as age, sex, and smoking history, which would indeed be valuable for future studies. We have compiled the patients' metadata and other information in Supplementary Table 2.

      We appreciate your suggestions and will consider incorporating these aspects in future research to enhance the dataset's utility and attractiveness.

      • Several crucial details about the data analysis are missing: How many PCs were used for reduction? Which versions of Seurat/inferCNV/other packages were used? Why monocle2 was used and not monocle3 or other packages? Also, the authors use R version 3.6.1, and the current version is 4.3.2.

      Thank you for your detailed review and valuable suggestions. Below are our responses to the points you raised:

      Principal Components (PCs) Used for Reduction: We used the first 20 principal components (PCs) for dimensionality reduction. This choice was based on preliminary tests showing that 20 PCs captured the major variation in our data effectively.

      Versions of Packages: The versions of the packages used are as follows:

      Seurat: Version 4.0.1

      inferCNV: Version 1.18.1

      monocle2: Version 2.14.0

      Choice of monocle2 over monocle3 or Other Packages: We chose monocle2 because it performed better on our specific dataset, and its algorithms suited our research needs. Additionally, we are more familiar with the functionalities and outputs of monocle2, which allowed us to better interpret and apply the results.

      R Version: We used R version 3.6.1 at the beginning of our study to ensure consistency and reproducibility throughout the analysis. Although the current version of R is 4.3.2, we maintained the same version throughout our research. We will consider upgrading to the latest version of R and re-testing for compatibility and performance in future studies.

      We appreciate your attention to these details and will include this information in the revised manuscript.

      • It seems that the authors may lack a fundamental understanding of scRNA-seq data processing and the functions of Seurat. For instance, they state, 'Next, we classified cell types through dimensional reduction and unsupervised clustering via the Seurat package.' However, dimensional reduction and unsupervised clustering are not methods for cell classification. Typically, cell types are classified using marker genes or other established methods.

      Thank you for your insightful comments. We appreciate your guidance on the proper understanding and application of scRNA-seq data processing and the functions of Seurat.

      You are correct in noting that dimensional reduction and unsupervised clustering are not methods for cell classification. We apologize for the confusion in our original statement. What we intended to convey was that we performed dimensional reduction and unsupervised clustering using the Seurat package as preliminary steps in our analysis. Following these steps, we classified cell types based on established marker genes.

      "Therefore, to identify subclusters within each of these nine major cell types, we performed principal component analysis" (Line 127). Principal component analysis is a method for dimensionality reduction, not cell clustering.

      The authors did not mention the normalization or scaling of the data, which are crucial steps in scRNA-seq data preprocessing.

      Thank you for your insightful comments. We apologize for any confusion caused by our description in the manuscript. You are correct that principal component analysis (PCA) is primarily a method for dimensionality reduction rather than cell clustering. To clarify, we used PCA to reduce the dimensionality of our single-cell RNA-seq (scRNA-seq) data, which is a preliminary step before clustering the cells.

      In the revised manuscript, we have provided a more detailed description of our data preprocessing pipeline, including the normalization and scaling steps that are indeed crucial for scRNA-seq data analysis. Specifically, we performed the following steps:

      Normalization: We normalized the gene expression data to account for differences in sequencing depth and other technical variations.

      Scaling: We scaled the normalized data so that each gene contributes equally to the PCA, which prevents highly expressed genes from dominating the analysis.

      Following these preprocessing steps, we conducted PCA to reduce the dimensionality of the data, which facilitated the subsequent clustering of cells into subclusters.
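      The normalization and scaling steps described above can be sketched as follows. This is a minimal, illustrative Python version and not the authors' Seurat code: it assumes log-normalization with a scale factor of 10,000 (the default behaviour of Seurat's `NormalizeData`) followed by per-gene z-scoring (as `ScaleData` does by default). The response does not state which settings were actually used, and the toy count matrix is hypothetical.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Log-normalize a cells x genes matrix of raw counts.

    Each count is divided by the cell's total count, multiplied by
    scale_factor, and transformed with log1p. The scale factor is an
    assumption, not a value reported by the authors.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell stays all-zero instead of dividing by zero.
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c / total * scale_factor) for c in cell])
    return normalized


def scale_genes(matrix):
    """Z-score every gene (column) across cells: mean 0, sample sd 1."""
    n_cells = len(matrix)
    n_genes = len(matrix[0])
    scaled = [[0.0] * n_genes for _ in range(n_cells)]
    for g in range(n_genes):
        column = [row[g] for row in matrix]
        mean = sum(column) / n_cells
        if n_cells > 1:
            var = sum((x - mean) ** 2 for x in column) / (n_cells - 1)
        else:
            var = 0.0
        sd = math.sqrt(var)
        for i in range(n_cells):
            # Genes with zero variance are left at 0 after centering.
            scaled[i][g] = (column[i] - mean) / sd if sd > 0 else 0.0
    return scaled


# Hypothetical toy data: 3 cells x 3 genes of raw counts.
toy_counts = [[10, 0, 5], [2, 8, 0], [0, 3, 7]]
scaled = scale_genes(log_normalize(toy_counts))
```

      After scaling, every gene has unit variance, so PCA is driven by the pattern of variation across cells rather than by the absolute expression level of a few abundant genes.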

      We hope this addresses your concerns, and we appreciate your valuable feedback that helped us improve the clarity and accuracy of our manuscript.

      • Numerous style and grammar mistakes are present in the main text. For instance, certain sections of the methods are written in the present tense, suggesting that parts of a protocol were copied without text editing. Furthermore, some sections of the introduction are written in the past tense when the present tense would be more suitable. Clusters are inconsistently referred to by numbers or cell types, leading to confusion. Additionally, the authors frequently use the term "evolution" when describing trajectory analysis, which may not be appropriate. Overall, significant revisions to the main text are required.

      Thank you for your detailed review and valuable feedback on our manuscript. We highly appreciate your suggestions and have made the following revisions to address the issues you pointed out:

      Tense Consistency: We have thoroughly reviewed and corrected the use of tenses throughout the manuscript. The Methods section now consistently uses the past tense, while the Introduction section uses the present tense where appropriate, ensuring coherence and consistency.

      Cluster Naming Consistency: We have standardized the naming conventions for clusters, consistently using either numbers or cell types to avoid any confusion.

      Appropriate Terminology: We have reviewed our use of the term "evolution" in the context of trajectory analysis. Where necessary, we have replaced it with more accurate terms such as "trajectory progression" or "developmental pathway" to better convey the intended meaning.

      • Some figures are not mentioned in order or are not referenced in the text at all, such as Figure 5l (where it is also unclear how the authors selected the root cells). Additionally, many figures have text that is too small to be read without zooming in. Overall, the quality of the figures is inconsistent and sometimes very poor.

      Thank you for your detailed review and valuable feedback on our manuscript. We have addressed the issues you raised as follows:

      Unreferenced Figures in the Text:

      We acknowledge the oversight regarding Figure 5l not being mentioned in the text. In the revised version, we will ensure that all figures are properly referenced and discussed within the relevant sections of the manuscript.

      Text Size in Figures:

      We understand the difficulty in reading small text within the figures. We will redesign all figures to ensure that text and annotations are legible at normal viewing sizes. This will involve increasing the resolution and text size in all figures to enhance readability.

      Inconsistent Quality of Figures:

      To address the inconsistency in figure quality, we will standardize the formatting of all figures and ensure they meet a high standard of clarity and presentation. This will improve the overall visual quality and professionalism of the manuscript.

      The results section lacks clarity on several points:

      • The authors state that "myofibroblasts exclusively originated from the control group". However, pathways up-regulated in myofibroblasts (such as glycolysis) were enhanced after chemotherapy, as indicated by GSVA score. Similarly, why are some clusters of TAMs from the control group associated with pathways enriched in chemotherapy group?

      Thank you for your insightful comments and questions regarding our manuscript. We appreciate the opportunity to clarify these points.

      Regarding the statement that "myofibroblasts exclusively originated from the control group," we acknowledge the confusion and would like to provide a more detailed explanation. While the initial identification indicated that myofibroblasts were predominantly found in the control group, subsequent analyses, including the Gene Set Variation Analysis (GSVA), revealed that certain pathways up-regulated in myofibroblasts, such as glycolysis, were indeed enhanced following chemotherapy. This suggests that chemotherapy may induce or enhance specific functional states in these cells that are not initially apparent from their origin alone.

      Similarly, the observation that some clusters of Tumor-Associated Macrophages (TAMs) from the control group are associated with pathways enriched in the chemotherapy group can be explained by the dynamic nature of cellular responses to treatment. TAMs, like other immune cells, can exhibit plasticity and adapt to the tumor microenvironment altered by chemotherapy. This plasticity may result in the activation of pathways typically associated with a chemotherapy response, even in cells originating from the control group.

      We will revise the manuscript to better articulate these findings and include additional data to support our explanations. This will help clarify the observed discrepancies and provide a more comprehensive understanding of the cellular dynamics in response to chemotherapy.

      • Further explanation is necessary regarding the distinctions between malignant and non-malignant cells, as well as regarding the upregulation of metabolism-related pathways in fibroblasts from the NCT group. Additionally, clarification is needed regarding why certain TAMs from the control group are associated with pathways enriched in the chemotherapy group.

      Thank you for your detailed review and for highlighting the areas that require further clarification. We appreciate the opportunity to provide additional explanations and improve our manuscript.

      We recognize the need to more clearly differentiate between malignant and non-malignant cells in our manuscript. We will include additional details on the criteria and markers used to distinguish these cell types. Specifically, we will elaborate on the molecular and phenotypic characteristics that were used to identify malignant cells, such as specific genetic mutations, aberrant signaling pathways, and distinct cell surface markers, as opposed to those used for identifying non-malignant cells.

      As mentioned above, the association of certain TAMs from the control group with pathways enriched in the chemotherapy group can be attributed to the inherent plasticity and adaptability of TAMs. We will provide a more detailed explanation of how TAMs can exhibit different functional states based on microenvironmental cues. This will include a discussion on the potential pre-existing heterogeneity within TAM populations and how even in the absence of direct chemotherapy exposure, some TAMs may display pathway activities similar to those seen in the chemotherapy group due to microenvironmental influences or intrinsic properties.

      • In the section titled 'Chemo-driven Pro-mac and Anti-mac Metabolic Reprogramming Exerted Diametrically Opposite Effects on Tumor Cells': The markers selected to characterize the anti- and pro-macrophages are commonly employed for describing M1 or M2 polarization. It is uncertain whether this new classification into anti- and pro-macrophages is necessary. Additionally, it should be noted that pro-macrophages are anti-inflammatory, while anti-macrophages are pro-inflammatory, which could lead to confusion. M2 macrophages are already recognized for their role in stimulating tumor relapse after chemotherapy.

      Thank you for your feedback. We appreciate the opportunity to clarify the rationale behind our terminology and the focus on functional phenotypic changes in macrophages before and after chemotherapy.

      Our intention in introducing the terms "pro-macrophages" and "anti-macrophages" was to highlight the distinct functional phenotypic changes in macrophages observed before and after chemotherapy. These terms were chosen to emphasize the functional roles these macrophages play in the tumor microenvironment in response to chemotherapy, rather than strictly adhering to the conventional M1/M2 polarization paradigm.

      We acknowledge that M2 macrophages are well-documented in stimulating tumor relapse after chemotherapy. Our use of "pro-macrophages" is intended to build on this established knowledge by providing a more nuanced understanding of their role in the post-chemotherapy tumor microenvironment. Similarly, "anti-macrophages" highlight the macrophages' role in mounting an anti-tumor response.

      • The authors suggest that there is "reprogramming of CD8+ cytotoxic cells" following chemotherapy (Line 409). It remains unclear whether they imply the reprogramming of other CD8+ T cells into cytotoxic cells. While it is indicated that cytotoxic cells from the control group differ from those in the NCT group and that NCT cytotoxic T cells exhibit higher cytotoxicity, the authors did not assess the expression of NK and NK-like T cell markers (aside from NKG7), which may possess greater cytotoxic potential than CD8+ cytotoxic cells. This could also elucidate why cytotoxic cells from the NCT and control groups are positioned on separate branches in trajectory analysis. Overall, with 22.5k T cells in the dataset, only 3 subtypes were identified, suggesting a need for improved cell annotations by the authors.

      Thank you for your valuable feedback regarding the classification and characterization of CD8+ cytotoxic cells following chemotherapy, and the need for improved cell annotations.

      We appreciate your point on the potential ambiguity around the "reprogramming of CD8+ cytotoxic cells" post-chemotherapy. In our study, we observed that CD8+ T cells from the control and NCT groups differ significantly in their cytotoxic profiles, with the NCT group's cytotoxic T cells displaying enhanced cytotoxicity. However, we did not imply the reprogramming of other CD8+ T cells into cytotoxic cells. Instead, our findings suggest a shift in the functional state of existing CD8+ cytotoxic cells, driven by chemotherapy, which aligns with the upregulation of genes associated with cytotoxic functions.

      We acknowledge that the expression of NK and NK-like T cell markers (apart from NKG7) was not comprehensively assessed. We agree that these markers may possess greater cytotoxic potential and could elucidate the separation observed in the trajectory analysis between cytotoxic cells from the NCT and control groups. This distinction may be attributed to differential cytotoxic potentials and functional states induced by chemotherapy.

      Furthermore, with 22,530 T cells in the dataset, only three subtypes were initially identified. We recognize the need for more refined cell annotations to capture the full spectrum of T cell diversity. This could involve a deeper analysis of additional markers to distinguish between various cytotoxic populations, including NK and NK-like T cells, and their respective roles in the tumor microenvironment post-chemotherapy.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I would recommend simplifying the manuscript and focusing on the differences between the treatment-naive and post-chemotherapy samples.

      Thank you for your valuable feedback on our manuscript. We greatly appreciate your suggestions and have carefully considered the proposed modifications.

      Upon re-evaluating our manuscript, we believe that the current structure and content most effectively convey our research findings. Our study aims to not only compare the treatment-naive and post-chemotherapy samples but also to highlight several important secondary findings that are integral to the overall research.

      Nevertheless, we understand your recommendation to simplify the manuscript. To address this, we have made some subtle adjustments to improve the readability and conciseness of the text. Additionally, we have included a section in the discussion that more explicitly highlights the differences between the treatment-naive and post-chemotherapy samples.

      IRB number for the human sample collection as well as animal experiments need to be provided.

      Thank you for your thorough review and for highlighting the need for the inclusion of the IRB number for the human sample collection and animal experiments.

      We apologize for this oversight and appreciate your attention to this important detail. The Institutional Review Board (IRB) approval number for the human sample collection is [B2019-436].

      This number has been added to the Methods section of our revised manuscript to ensure compliance with ethical standards and to provide transparency for our research.

      I put a question on the macrophage sorting experiment in the public review. Please clarify how the ARG1 staining was achieved with the preservation of cell viability.

      We apologize for the error caused by miscommunication within our research team. We are currently using both ARG1 and CD206 antibodies in our studies. Due to a communication error, the technician mistakenly assumed ARG1 was another name for CD206 (MRC1), resulting in the incorrect labeling of CD206 as ARG1 in our experimental records. In reality, we used the CD206 antibody, which is consistent with the surface marker shown in Figure 6e. We have made corrections in the manuscript and experimental figures. Thank you for pointing this out, and we regret any misunderstanding this may have caused.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments:

      • Line 65- "Chemotherapy drugs, however, are very toxic and are prone to invalid". Line 75-77: "This heterogeneity in the TME includes the differences between tumor cells and tumor cells and the differences between various stromal cells and immune cells. Actively exploring the changes of multiple cells in the TME of LUAD after chemotherapy may finally find an excellent way to overcome chemotherapy resistance for LUAD." Please rewrite these parts.

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion:

      Original (Line 65): "Chemotherapy drugs, however, are very toxic and are prone to invalid." Revised: "However, chemotherapy drugs are highly toxic and can often become ineffective."

      Original (Line 75-77): "This heterogeneity in the TME includes the differences between tumor cells and tumor cells and the differences between various stromal cells and immune cells. Actively exploring the changes of multiple cells in the TME of LUAD after chemotherapy may finally find an excellent way to overcome chemotherapy resistance for LUAD."

      Revised: "The heterogeneity within the tumor microenvironment (TME) encompasses not only the variations between different tumor cells but also among various stromal and immune cell types. Investigating the dynamic changes in multiple cell populations within the TME of LUAD following chemotherapy may provide crucial insights into overcoming chemotherapy resistance in LUAD."

      • Line 87: "The internal processes of the cells respectively drive immune cells and cancer cells to obtain glucose and glutamine preferentially."-> The internal metabolic changes in the cells drive...

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion:

      Original (Line 87): "The internal processes of the cells respectively drive immune cells and cancer cells to obtain glucose and glutamine preferentially."

      Revised: "The internal metabolic changes in the cells drive immune cells and cancer cells to preferentially obtain glucose and glutamine."

      • Line 93: "an essential feature that affects the effect of chemotherapy"-> an essential feature that affects chemotherapy.

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion:

      Original (Line 93): "Metabolic reprogramming in various cell types in the tumor microenvironment after undergoing chemotherapy may be an essential feature that affects the effect of chemotherapy."

      Revised: "Metabolic reprogramming in various cell types in the tumor microenvironment after undergoing chemotherapy may be an essential feature that affects chemotherapy."

      • Line 84: What do the immune cells depend on glucose for?

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion:

      Original (Line 84): "However, recent studies have shown that tumor-infiltrating immune cells depend on glucose and immune cells especially macrophages consume more glucose than malignant cells."

      Revised: "However, recent studies have shown that tumor-infiltrating immune cells rely on glucose for their energy needs and functionality, with immune cells, particularly macrophages, consuming more glucose than malignant cells."

      • Line 223: "According to previous research, myofibroblast has been described"-> myofibroblasts have been described.

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion:

      Original (Line 223): "According to previous research, myofibroblast has been described as a cancer-associated fibroblast that participated in extensive tissue remodeling, angiogenesis, and tumor progression."

      Revised: "According to previous research, myofibroblasts have been described as cancer-associated fibroblasts that participate in extensive tissue remodeling, angiogenesis, and tumor progression."

      • Line 239: "Considering the essential fibroblasts"-> Considering the essential role of fibroblasts.

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion:

      Original (Line 239): "Considering the essential fibroblasts and their complicated function in shaping the tumor microenvironment..."

      Revised: "Considering the essential role of fibroblasts and their complicated function in shaping the tumor microenvironment..."

      • Line 251: "Further in vitro studies were required to elucidate these notable fibroblasts' potential function..." -> are required.

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 251): "Further in vitro studies were required to elucidate these notable fibroblasts' potential function..."

      Revised: "Further in vitro studies are required to elucidate these notable fibroblasts' potential function..."

      • Line 309: "Interestingly, we found that two subtypes, Anti-mac and Mix, can be converted to Pro-mac through pseudotime time analysis." -> via trajectory analysis we found that two subtypes...

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 309): "Interestingly, we found that two subtypes, Anti-mac and Mix, can be converted to Pro-mac through pseudotime time analysis."

      Revised: "Interestingly, via trajectory analysis we found that two subtypes, Anti-mac and Mix, can be converted to Pro-mac."

      • Line 458: "the interactions between malignant and macrophages"-> the interactions between malignant cells and macrophages.

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 458): "the interactions between malignant and macrophages"

      Revised: "the interactions between malignant cells and macrophages."

      • Line 486: "The 5-year survival rate is still gloomy" -> The 5-year survival rate is still low.

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 486): "The 5-year survival rate is still gloomy."

      Revised: "The 5-year survival rate is still low."

      • Line 491: "More and more efforts are devoted to targeted metabolism to overcome chemoresistance" -> More efforts are devoted to target cell metabolism...

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 491): "More and more efforts are devoted to targeted metabolism to overcome chemoresistance."

      Revised: "More efforts are devoted to targeting cell metabolism to overcome chemoresistance."

      • Line 594: "Repeat the above steps twice" -> This procedure was repeated twice.

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 594): "Repeat the above steps twice."

      Revised: "This procedure was repeated twice."

      • Line 620: How were the new potential markers verified? List the exact genes and experiments or a reference to a Figure.

      Thank you for your valuable comments. We have provided detailed information on how the new potential markers were verified, including the exact genes involved and the specific experiments conducted. A reference to the relevant Figure has also been added to the manuscript.

      • Line 637: Which immune cells were used as a background in CNV analysis? All immune cells or just T cells?

      Thank you for your valuable comments. In this study, all immune cells were used as background control cells.

      • Line 658: in a single cell

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions.

      • Line 672: "a variety of environmental factors potentially affect" -> potentially affects/ may potentially affect.

      Thank you for your valuable comments. We have revised the manuscript according to your suggestions:

      Original (Line 672): "a variety of environmental factors potentially affect"

      Revised: "A variety of environmental factors may potentially affect"

      • Line 683: Which metabolites were tested?

      The metabolites tested included those related to glycolysis and oxidative phosphorylation (OXPHOS), such as glucose and various metabolites indicative of mitochondrial activity. The contents of these metabolites were analyzed to verify consistency with gene expression levels as mentioned in the analysis of metabolic pathways section.

      • Line 718: Required or acquired?

      The correct term should be "acquired" in the context of discussing drug resistance in tumor cells. The sentence likely refers to the "acquired drug resistance" of tumor cells, which is a common challenge in chemotherapy.

      • Line 726: What are the A549 cells?

      A549 cells are a human lung adenocarcinoma cell line commonly used in cancer research, particularly for studying lung cancer. In this study, A549 cells were used in animal experiments, mixed with tumor-associated macrophages (TAMs), and implanted into nude mice to study tumor formation and progression.

      • Line 631: "we set the following cut-off thresholds to reveal the marker genes of each cluster: adjusted P-value <0.01 and multiple changes >0.5." What metric is "multiple changes"? Commonly used measures are adjusted P-value and average Log2FC.

      Thank you for your valuable comment. We have revised the manuscript according to your suggestion. The term "multiple changes" was indeed a misstatement. The correct metric should be "log2 fold change (Log2FC)," which is a commonly used measure in gene expression studies. We have updated the manuscript to reflect this, using "adjusted P-value <0.01 and average Log2FC > 0.5" instead of "multiple changes > 0.5."
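      The corrected cut-offs can be expressed as a simple filter. The sketch below is illustrative Python, not the authors' code: the field names `p_val_adj` and `avg_log2FC` follow the column names of Seurat v4 marker tables, while the gene names and values are hypothetical placeholders.

```python
def select_markers(results, max_adj_p=0.01, min_log2fc=0.5):
    """Return genes passing both thresholds stated in the response:
    adjusted P-value < 0.01 and average Log2FC > 0.5."""
    return [
        row["gene"]
        for row in results
        if row["p_val_adj"] < max_adj_p and row["avg_log2FC"] > min_log2fc
    ]


# Hypothetical differential-expression results for one cluster.
toy_results = [
    {"gene": "GENE_A", "p_val_adj": 1e-5, "avg_log2FC": 1.20},  # passes both
    {"gene": "GENE_B", "p_val_adj": 0.20, "avg_log2FC": 2.00},  # fails P-value
    {"gene": "GENE_C", "p_val_adj": 1e-8, "avg_log2FC": 0.30},  # fails Log2FC
    {"gene": "GENE_D", "p_val_adj": 0.009, "avg_log2FC": 0.51},  # passes both
]

markers = select_markers(toy_results)
```

      Both conditions are strict inequalities and must hold together, so a gene with a highly significant adjusted P-value but a small fold change is not reported as a marker.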

      • Figure 1f: "Samplied" -> Samples. What do the numbers on the left side of each column mean?

      Thank you for your valuable comment. The term "Samplied" was indeed a typographical error and has been corrected to "Samples". The numbers on the left side of each column likely represent cluster IDs or sample identifiers corresponding to the different patient samples or clusters analyzed in the study. We have clearly labeled these numbers in the figure to avoid any confusion.

      • Figure 2b: Please add a scale.

      Thank you for your valuable comment. We agree that adding a scale bar is crucial for accurately interpreting the size of the cells or structures shown in the figure. We have now included an appropriate scale bar during the figure preparation stage to provide this reference.

      • Figure 3d/4c: What is the matrix_27/3 metric? Is it average expression?

      Thank you for your valuable comment. The term "matrix_27/3" refers to a specific metric used in our analysis. This metric indeed represents the average expression levels of genes within a particular subset of the dataset. We will clarify this in the figure legend and the methods section to ensure that readers have a clear understanding of what the metric represents. Additionally, we will make sure that all such metrics are consistently and accurately described throughout the manuscript.

      • Figure 6e: Why CD206 staining is shown instead of ARG if ARG was chosen as the main gene for classification of Pro-macrophages?

      We apologize for the confusion regarding the use of CD206 staining in Figure 6e. This issue arose due to a miscommunication within our research team. While ARG1 was initially intended as the primary marker for Pro-macrophages, the technician mistakenly assumed ARG1 was another name for CD206 (MRC1), leading to the incorrect labeling of CD206 as ARG1 in our experimental records. In actuality, CD206 was used for the staining, which is consistent with the surface marker shown in Figure 6e. We have corrected this error in the manuscript and updated the experimental figures accordingly. We sincerely apologize for any misunderstanding this may have caused and appreciate the reviewer for bringing this to our attention.

      • Figures 6h and k: Please explain why do NCT Anti-macrophages show higher glucose and lactate uptake than the Anti-macrophages from the control group, while the size of tumors is the lowest in NCT Anti-macrophages in vivo?

      Thank you for your insightful comment. The observation that NCT Anti-macrophages exhibit higher glucose and lactate uptake while the tumor size is lowest could be attributed to the metabolic reprogramming induced by chemotherapy. It is possible that the enhanced metabolic activity in Anti-macrophages, characterized by increased glucose and lactate uptake, is linked to a more aggressive anti-tumor response in the NCT group. This heightened metabolic activity could reflect an increased energy demand necessary for sustaining enhanced immune functions, ultimately contributing to the reduction in tumor size. We will expand upon this explanation in the revised manuscript to provide a clearer interpretation of these findings.

      • The supplementary Table 1 needs a better legend/more explanation.

      Thank you for your valuable feedback. We have revised the legend for Supplementary Table 1 to provide a more detailed explanation of its contents.

      • No tSNE plot showing epithelial cells colored by patient, which may be important for observation of cell heterogeneity, especially in the epithelial cell population.

      Thank you for pointing this out. We agree that a tSNE plot showing epithelial cells colored by patient would be valuable for observing cell heterogeneity within the epithelial population.

      • Several acronyms not explained in the text (for example GSVA, NMF).

      Thank you for bringing this to our attention. We have ensured that all acronyms, including GSVA (Gene Set Variation Analysis) and NMF (Non-negative Matrix Factorization), are clearly defined in the text at their first mention.

      • Availability of data and material section: Please describe "other experimental data" in more detail.

      Thank you for your suggestion. We have expanded the "Availability of Data and Material" section to provide a more detailed description of the "other experimental data" referenced. This will include specific types of data generated, their formats, and how they can be accessed by other researchers. This clarification will enhance transparency and facilitate the reuse of our data by the research community.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1:

      (1) Given that this is one of the first studies to report the mapping of longitudinal intactness of proviral genomes in the globally dominant subtype C, the manuscript would benefit from placing these findings in the context of what has been reported in other populations, for example, how decay rates of intact and defective genomes compare with that of other subtypes where known.  

      Most published studies are of men living with HIV-1 subtype B and do not cover the hyperacute infection phase, so a direct head-to-head comparison with the FRESH study is difficult. However, we can cite and contrast our study with a few examples from acute infection studies, as follows.

      a. Peluso et al., JCI, 2020, showed that in Caucasian men (SCOPE study) with subtype B infection who initiated ART during chronic infection, intact viral genomes decayed at a rate of 15.7% per year, while defective genomes decayed at a rate of 4% per year. In our study, we showed that in chronic treated participants, genomes decreased at a rate of 25% (intact) and 3% (defective) per month for the first 6 months of treatment.

      b. White et al., PNAS, 2021, demonstrated that in a cohort of African, white and mixed-race American men treated during acute infection, the decay of intact viral genomes in the first phase was <0.3 log copies in the first 2-3 weeks following ART initiation. In the FRESH cohort, our data from acute treated participants show a comparable decay rate of 0.31 log copies per month for intact viral genomes.

      c. A study in Thailand (Leyre et. al., 2020, Science Translational Medicine), of predominantly HIV-1 CRF01-AE subtype compared HIV-reservoir levels in participants starting ART at the earliest stages of acute HIV infection (in the RV254/SEARCH 010 cohort) and participants initiating ART during chronic infection (in SEARCH 011 and RV304/SEARCH 013 cohorts). In keeping with our study, they showed that the frequency of infected cells with integrated HIV DNA remained stable in participants who initiated ART during chronic infection, while there was a sharp decay in these infected cells in all acutely treated individuals during the first 12 weeks of therapy.  Rates of decay were not provided and therefore a direct comparison with our data from the FRESH cohort is not possible.

      d. A study by Bruner et al., Nat. Med. 2016, described the composition of proviral populations in a predominantly male subtype B cohort comprising acute treated (within 100 days) and chronic treated (>180 days) participants. In comparison to the FRESH chronic treated group, they showed that in chronic treated infection 98% (87% in FRESH) of viral genomes were defective, 80% (60% in FRESH) had large internal deletions and 14% (31% in FRESH) were hypermutated. In acute treated participants 93% (48% in FRESH) were defective and 35% (7% in FRESH) were hypermutated. The differences in the frequency of hypermutation could be explained by differences in the timing of treatment relative to infection, specifically in the acute treated groups, where FRESH participants initiate ART at a median of 1 day after infection. It is also possible that sex- or race-based differences in immunological factors that impact the reservoir may play a role.

      This study also showed that large deletions are non-random and occur at hotspots in the HIV-1 genome. The design of the subtype B IPDA assay (Bruner et al., Nature, 2019) is based on optimal discrimination between intact and deleted sequences, obtained with a 5′ amplicon in the Ψ region and a 3′ amplicon in Envelope. This suggests that Envelope is a hotspot for large deletions, while Ψ is the site of frequent small deletions and is included in larger 5′ deletions. In the FRESH cohort of HIV-1 subtype C, genome deletions were most frequently observed between Integrase and Envelope relative to Gag (p<0.0001–0.001).

      e. In 2017, Hiener et al., in Cell Rep, also described genetic characteristics of the latent HIV-1 reservoir in 3 acute treated and 3 chronic treated male study participants with subtype B HIV. Their data were similar to those of Bruner et al. above, showing proportions of intact proviruses of 6% (94% defective) in participants who initiated therapy during acute/early infection and 3% (97% defective) in those who initiated therapy during chronic infection. In contrast, the frequencies in FRESH were 52% intact and 48% defective in acute treated participants, and 13% intact and 87% defective in chronic infection. These differences could be attributed to the timing of treatment initiation, since early treatment in the aforementioned study ranged from 0.6-3.4 months after infection.
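
      The decay rates quoted in points a and b are reported in different units (percent per year, percent per month, log copies per month), which makes them hard to compare by eye. The sketch below shows one way to put them on a common scale. It assumes a single-phase, constant exponential decay, which is an assumption made purely for illustration and is not a model fitted in the manuscript; the function names are hypothetical. Elsewhere in this response the authors state that they do not have enough data points to estimate half-lives reliably, so the outputs are unit conversions rather than estimates.

```python
import math


def annual_to_monthly(fraction_lost_per_year: float) -> float:
    """Convert a fractional loss per year into the equivalent fractional
    loss per month, assuming the same constant exponential decay."""
    return 1.0 - (1.0 - fraction_lost_per_year) ** (1.0 / 12.0)


def half_life_months(fraction_lost_per_month: float) -> float:
    """Half-life (in months) implied by a constant fractional loss per month."""
    if not 0.0 < fraction_lost_per_month < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return math.log(2) / -math.log(1.0 - fraction_lost_per_month)


def log10_drop_to_fraction(log10_drop: float) -> float:
    """Fraction lost for a given log10 decrease in copies."""
    return 1.0 - 10.0 ** (-log10_drop)


# Rates quoted in the response text above (illustrative conversion only).
fresh_chronic_intact = half_life_months(0.25)                  # 25% per month
scope_intact = half_life_months(annual_to_monthly(0.157))      # 15.7% per year
fresh_acute_intact = half_life_months(log10_drop_to_fraction(0.31))  # 0.31 log/month
```

      Under this simplifying assumption, a loss of 25% per month corresponds to a half-life of roughly 2.4 months, whereas 15.7% per year corresponds to roughly four years. The two figures describe different treatment windows, so the contrast mainly illustrates why the reporting unit matters.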

      (2) Indeed, in the abstract, the authors indicate that treatment was initiated before the peak. The use of the term 'peak' viremia in the hyperacute-treated group could perhaps be replaced with 'highest recorded viral load'. The statistical comparison of this measure in the two groups is perhaps more relevant with regards to viral burden over time or area under the curve viral load as these are previously reported as correlates of reservoir size.

      We have edited the manuscript text to describe the term peak viraemia in hyperacute treated participants more clearly (lines 443-444). We have now performed an analysis of area under the curve to compare viral burden in the two study groups and found associations with proviral DNA levels after one year. This has been added to the results section (lines 162-163).

      Reviewer #2:

      (1) Other factors also deserve consideration and include age, and environment (e.g. other comorbidities and coinfections.)

      We agree that these factors could play a role; however, participants in this study were of similar age (18-23), and information on comorbidities and coinfections is not available.

      Reviewer #3:

      (1) The word reservoir should not be used to describe proviral DNA soon after ART initiation. It is generally agreed upon that there is still HIV DNA from actively infected cells (phase 1 & 2 decay of RNA) during the first 6-12 months of ART. Only after a full year of uninterrupted ART is it really safe to label intact proviral HIV DNA as an approximation of the reservoir. This should be amended throughout.

      We agree and where appropriate have amended the use of the word reservoir to only refer to the proviral load after full viral suppression, i.e., undetectable viral load.

      (2) All raw, individualized data should be made available for modelers and statisticians. It would be very nice to see the RNA and DNA data presented in a supplementary figure by an individual to get a better grasp of intra-host kinetics.

      We will make all relevant data available and accessible to interested parties on request. We have now added a section on data availability (lines 489-491).

      (3) The legend of Supplementary Figure 2 should list when samples were taken.

      The data in this figure represent an overall analysis of all sequences available for each participant at all time points. This has now been explained more clearly in the figure legend.

      Recommendations for The Authors:

      Reviewer #1:

      (1) It is recommended that the introduction includes information to set the scene regarding what is currently reported on the composition of the reservoir for those not in the immediate field of study i.e., the reported percentage of defective genomes and in which settings/populations genome intactness has been mapped, as this remains an area of limited information.

      We have now included a summary of other reported findings in the field in the introduction (lines 89-92, 94-98) and discussion (lines 345-350). A more detailed overview has been provided in the response to public reviews.

      (2) It may be beneficial to state in the main text of the paper what the purpose of the Raltegravir was and that it was only administered post-suppression. Looking at Table 1, only the hyperacute treatment group received Raltegravir and this could be seen as a confounder as it is an integrase inhibitor. Therefore, this should be explained.

      Once Raltegravir became available in South Africa, all new acute infections in the study cohort had an intensified 4-drug regimen that included Raltegravir.  A more detailed explanation has now been included in the methods section (lines 435-437).

      (3) Can the authors explain why the viral measures at 6 months post-ART are not shown for chronic-treated individuals in Figure 1 or reported on in the text?

      The 6 months post-ART time point has been added to Figure 1.

      (4) Can the authors indicate in the discussion, how the breakdown of proviral composition compares to subtype B as reported in the literature, for example, are the common sites of deletion similar, or is the frequency of hypermutation similar?

      Added to discussion (lines 345-350).

      (5) Do the numbers above the bars in Figure 3 represent the number of sampled genomes? If so, this should be stated.

      Yes, the numbers above the bars represent the number of sampled genomes. This has been added to the Figure 3 legend.

      (6) In the section starting on line 141, the introduction implies a comparison with immunological features, yet what is being compared are markers of clinical disease progression rather than immune responses. This should be clarified/corrected.

      This has been corrected (line 153).

      (7) Line 170 uses the term 'immediately' following infection, however, was this not 1-3 days after?

      We have changed the word “immediately” to “1-3 days post-detection” (line 181).

      (8) Can the sampling time-points for the two groups be given for the longitudinal sequencing analysis?

      The sequencing time points for each group are depicted in Figure 2.

      (9) Line 183 indicates that intact genomes contributed 65% of the total sequence pool, yet it's given as 35% in the paragraph above. Should this be defective genomes?

      Yes, this was a typographical error.  Now corrected to read “defective genomes” (line 193).

      (10) The section on decay kinetics of intact and defective genomes seems to overlap with the section above and would flow better if merged.

      Well noted; however, we have chosen to keep these sections separate.

      (11) Some references in the text are given in writing instead of numbering.

      This has been corrected.

      (12) In the clonal expansion results section, can it be indicated between which two time-points expansion was measured?

      This analysis was performed with all sequences available for each participant at all time points.  We have added this explanation to the respective Figure legend.

      Reviewer #2:

      (1) The statement on line 384 "Our data showed that early ART...preserves innate immune factors" - what innate immune factors are being referred to?

      We have removed this statement.

      (2) HLA genotyping methods are not included in the Methods section

      Now included and referenced (lines 481-483).

      (3) Are CD4:CD8 ratios available for the cohorts? This could be another informative clinical parameter to analyse in relation to HIV-1 proviral load after 1 year of ART – as done for the other variables (peak VL, and the CD4 measures).

      Yes, CD4:CD8 ratios are available. We performed the recommended analysis but found no associations with HIV-1 proviral load after 1 year of ART. We have added this to the results section (lines 163-164).

      (4) Reference formatting: Paragraph starting at line 247 (Contribution of clonal expansion...) - the two references in this paragraph are not cited according to the numbering system as for the rest of the manuscript. The Lui et al, 2020 reference is missing from the reference list - so will change all the numbering throughout.

      This has been corrected.

      Reviewer #3:

      (1) To allow comparison to past work. I suggest changing decay using % to half-life. I would also mention the multiple studies looking at total and intact HIV DNA decay rates in the intro.

      We do not have enough data points to get a good estimate of the half-life and therefore report decay as percentage per month for the first 6 months.

      (2) Line 73: variability is the wrong word as inter-individual variability is remarkably low. I think the authors mean "difference" between intact and total.

      We have changed the word variability to difference as suggested.

      (3) Line 297: I am personally not convinced that there is data that definitively shows total HIV DNA impacting the pathophysiology of infection. All of this work is deeply confounded by the impact of past viremia. The authors should talk about this in more detail or eliminate this sentence.

      We have reworded the statement to read “Total HIV-1 DNA is an important biomarker of clinical outcomes.” (Lines 308-309).

      (4) Line 317; There is no target cell limitation for reservoir cells. The vast majority of CD4+ T cells during suppressive ART are uninfected. The mechanism listing the number of reservoir cells is necessarily not target cell limitation.

      We agree. The statement this refers to has been reworded as follows: “Considering, that the majority of CD4 T cells remain uninfected it is likely that this does not represent a higher number of target cells, and this warrants further investigation.” (lines 325-326).

      (5) Line 322: Some people in the field bristle at the concept of total HIV DNA being part of the reservoir as defective viruses do not contribute to viremia. Please consider rephrasing. 

      We acknowledge that there are differing opinions regarding total HIV DNA being part of the reservoir, as defective viruses do not contribute to viremia. However, defective HIV proviruses may contribute to persistent immune dysfunction and T cell exhaustion, which are associated with comorbidities and adverse clinical outcomes in people living with HIV. We have explained in the text that total HIV-DNA does not distinguish between replication-competent and -defective viruses that contribute to the viral reservoir.

      (6) Line 339: The under-sampling statement is an understatement. The degree of under-sampling is massive and biases estimates of clonality and sensitivity for intact HIV. Please see and consider citing work by Dan Reeves on this subject.

      We agree and have cited work by Dan Reeves (line 358).

      (7) Line 351: This is not a head-to-head comparison of biphasic decay as the Siliciano group's work (and others) does not start to consider HIV decay until one year after ART. I think it is important to not consider what happens during the first year of ART to be reservoir decay necessarily.

      Well noted.

      (8) Line 366-371: This section is underwritten. In nearly all PWH studies to date, observed reservoirs are highly clonal.

      We agree that observed reservoirs are highly clonal but have not added anything further to this section.

      (9) It would be nice to have some background in the intro & discussion about whether there is any a priori reason that clade C reservoirs, or reservoirs in South African women, might differ (or not) from clade B reservoirs observed in different study participants.

      We have now added this to the introduction (lines 94-103).

      (10) Line 248: This sentence is likely not accurate. It is probable that most of the reservoir is sustained by the proliferation of infected CD4+ T cells. 50% is a low estimate due to under-sampling leading to false singleton samples. Moreover, singletons can also be part of former clones that have contracted, which is a natural outcome for CD4+ T cells responding to antigens &/or exhibiting homeostasis. The data as reported is fine but more complex ecologic methods are needed to truly probe the clonal structure of the reservoir given severe under sampling.

      Well noted.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their time and thoughtful comments on our manuscript. 

      We realised that a preliminary version of Figure 2 was initially submitted, which we are now replacing with a new version. The differences between the two figures are: 1) the schematic in Figure 2a was replaced with a new one in line with that of Figure 3a; 2) in Figure 2c, details about the statistical analysis were removed from the legend, and one datapoint that was erroneously removed at day 5 for the ΔMYR1-Luc condition was included. These changes do not affect the results or the conclusions initially drawn.

      Public Reviews:

      Reviewer #1 (Public review): 

      Previous studies have highlighted some of these paracrine activities of Toxoplasma - and Rastogi et al (mBio, 2020) used a single cell sequencing approach of cells infected in vitro with the WT or MYR KO parasites - and one of their conclusions was that MYR-1 dependent paracrine activities counteract ROP-dependent processes.

      Similarly, Chen et al (JEM 2020) highlighted that a particular rhoptry protein (ROP16) could be injected into uninfected macrophages and move them to an anti-inflammatory state that might benefit the parasite. 

      We are aware of both these studies, where the injection of rhoptry proteins into cells that the parasite does not invade alters the host transcriptional profile establishing a permissive environment. However, here we propose a different paracrine effect that goes beyond the injected/uninfected cell. Specifically, we propose that one or more MYR1-dependent effectors alter the cytokine secretion profile of infected cells, which leads to overall changes in the immune response such as cell types recruited to the site of infection, or the activation state. 

      There are caveats around immunity and as yet no insight into how this works. In Figure 2 there is a marked defect in the ability of the parasites to expand at day 2 and day 5. Together, these data sets suggest that this paracrine effect mediated by MYR-1 works early - well before the development of adaptive responses. 

      Yes, we also hypothesise an early effect based on the data. Growth continues until day 5 at least, and then plateaus towards day 7, which makes us believe that the effect takes place within the first 5 days. We agree with the reviewer that the MYR1-mediated rescue acts before the involvement of the adaptive immune response, which is supported by our results obtained in Rag2-/- mice shown in Figure 3e. 

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript by Torelli et al., the authors propose that the major function of MYR1 and MYR1-dependent secreted proteins is to contribute to parasite survival in a paracrine manner rather than to protect parasites from cell-autonomous immune response. The authors conclude that these paracrine effects rescue ∆MYR1 or knockouts of MYR1-dependent effectors within pooled in vivo CRISPR screens. 

      Strengths: 

      The authors raised a more general concern that pooled CRISPR screens (not only in Toxoplasma but also other microbes or cancers) would miss important genes by "paracrine masking effect". Although there is no doubt that pooled CRISPR screens (especially in vivo CRISPR screens) are powerful techniques, I think this topic could be of interest to those fields and researchers. 

      Weaknesses: 

      In this version, the reviewer is not entirely convinced of the 'paracrine masking effect' because the in vivo experiments should include appropriate controls (see major point 2). 

      (1) It is convincing that co-infection of WT and ∆MYR1 parasites could rescue the growth of ∆MYR1 in mice shown by in vivo luciferase imaging. Also, this is consistent with ∆MYR1 parasites showing no in vivo fitness defect in the in vivo CRISPR screens conducted by several groups. Meanwhile, it has been reported previously and shown in this manuscript that ∆MYR1 parasites have an in vitro growth defect; however, ∆MYR1 parasites show no in vitro fitness defect in the in vitro pooled CRISPR screen. The authors show that the competition defect of ∆MYR1 parasites cannot be rescued by co-infection with WT parasites in Figure 1c, which might indicate that no paracrine rescue occurred in an in vitro environment. The authors seem not to mention these discrepancies between in vitro CRISPR screens and in vitro competition assays. Why do ∆MYR1 parasites possess neutral in vitro fitness scores in in vitro CRISPR screens? Could the authors describe a reasonable hypothesis? 

      The reviewer raises a very interesting point which, at this stage, we cannot fully explain. One technical explanation could be that the relatively small growth defect detected for clean KOs is not well represented in the CRISPR screens because of the variability between guides: smaller differences in growth are not reliably captured and are hidden within the noise of the assays. Another technical explanation may be median-centering: if the majority of KOs in the pool have a small growth defect, median-centering would push their scores towards zero. We have observed and reported this phenomenon in Young et al., 2019 for libraries containing a larger fraction of genes with a negative fitness score. In the library used here, which focuses on secreted proteins, we have not observed a strong trend towards negative fitness scores, but we cannot exclude smaller shifts. Because we have no solid basis to favour any of the above-mentioned explanations, we have decided not to speculate too much on this in the manuscript. However, we wanted to show all the data, as the difference between these results may not be technical but biological, which could inform future studies or results by us and others.
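
      The median-centering argument can be illustrated with a toy calculation. The sketch below uses made-up log2 fold-change values (not data from the screens discussed here) and a hypothetical helper, `median_center`, to show how a small defect shared by most knockouts in a pool is shifted to a neutral score once the pool median is subtracted. It is a simplified illustration and does not reproduce the scoring pipeline used in the manuscript.

```python
from statistics import median


def median_center(scores):
    """Subtract the pool median from every score (hypothetical helper)."""
    pool_median = median(scores)
    return [score - pool_median for score in scores]


# Made-up log2 fold changes for a pool of seven knockouts:
# five share a small growth defect, one is truly neutral,
# and one has a strong defect.
raw_scores = [-0.5, -0.5, -0.5, -0.5, -0.5, 0.0, -3.0]

centered_scores = median_center(raw_scores)
# After centering, the five mildly defective mutants score 0.0 (apparently
# neutral), the truly neutral mutant appears mildly beneficial, and only
# the strong defect remains clearly negative.
```

      In this toy pool the mild defect is invisible after centering, which is the kind of shift the response describes as possible when a library contains many genes with negative fitness scores.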

      (2) The authors developed a mixed infection assay with an inoculum containing a 20:80 ratio of ΔMYR1-Luc parasites with either WT parasites or ΔMYR1 mutants not expressing luciferase, showing that the in vivo growth defect of ∆MYR1 parasites is rescued by the presence of WT parasites. Since this experiment lacks appropriate controls, interpretation could be difficult. Is this phenomenon specific to MYR1? If a co-inoculum of ∆GRA12-Luc with either WT parasites or GRA12 parasites not expressing luciferase is included, the data could be appropriately interpreted. 

      We are not quite sure what appropriate controls the reviewer refers to. We show here in Figures 3c and 3f that increasing parasite load by co-infecting mice with ∆MYR1 parasites is not sufficient to rescue ∆MYR1-Luc parasite growth. Co-infection with WT parasites, however, does result in increased ∆MYR1-Luc parasitaemia at day 7 p.i., indicating that MYR1 competence is required for the in vivo trans-rescue we describe. As ∆GRA12 parasites have a very strong cell-autonomous restriction in vitro and severe growth defect in vivo (Torelli et al., BioRxiv), these parasites would be rapidly depleted, which is also observed in all CRISPR screens from various laboratories. Therefore we do not think that co-infection with GRA12-deficient parasites would be an informative experiment here. We do speculate that mutant parasites for other proteins required for export (i.e. MYR 2, 3, 4, ROP17) could also be trans-rescued in addition to mutants for other MYR-dependent proteins such as GRA24 and GRA28, which remodel cytokine secretion and could individually, or synergistically, affect host cell immunity. Dissecting which Toxoplasma factor/s and host cytokine signalling pathways drive this trans-rescue effect is highly interesting, but beyond the scope of this manuscript. Here, we focused on the basic concept that an individual mutant can be rescued in trans in vivo, which we think is of importance beyond the field of Toxoplasma research. 

      (3) In the Discussion part, the authors argue that the rescue phenotype of mixed infection is not due to co-infection of host cells (lines 307-310). This data is important to support the authors' paracrine hypothesis and should be shown in the main figure.

      We understand the reviewer’s concern about rescue by co-infection of the same cell, but we largely exclude this hypothesis because Toxoplasma cell-autonomous effectors, such as GRA12 and ROP18, would also be rescued if that were to happen on a larger scale. We previously performed an in vivo experiment to assess co-infection rates of peritoneal exudate cells (PECs) by imaging, using infection doses comparable to those used in the trans-rescue experiments. The total infection rate of PECs was 2.3%, so the overall number of infected cells per image was low and not suitable for publication purposes. We tried to capture more cells using FACS analysis; however, PECs are highly autofluorescent in the yellow/green channels, which prevented us from drawing adequate conclusions using our GFP and mCherry strains. Because we see no rescue of GRA12 or ROP18 in CRISPR screens, and the overall in vivo co-infection rates were very low as observed by imaging, we did not think that generating strains expressing different fluorochromes compatible with standard FACS analysis, and then performing more in vivo experiments, was the best use of resources at the time.

      (4) In the Discussion part, the authors assume that the rescue phenotype is the result of multiple MYR1-dependent effectors. I admit that this hypothesis could be possible since a recently published paper described the concerted action of numerous MYR1-dependent or independent effectors contributing to the hypermigration of infected cells (Ten Hoeve et al., mBio, 2024). I think this paragraph would be kind of overstated since the authors did not test any of the candidate effectors. Since the authors possess ∆IST parasites, they can test whether IST is involved in the "paracrine masking effect" or not to support their claim. 

      MYR1 deletion impairs the export of multiple Toxoplasma effectors into the host cell, including GRA16, GRA24, GRA28, HCE1/TEEGR etc, many of which can influence cytokine levels. As such, we speculate that it is a combination of multiple effector proteins that are responsible for the trans-rescue. As stated above, which parasite effectors, host cell types and cytokines are involved in the phenotype we describe are part of ongoing and future studies. Here, we wanted to focus on the key message, that in in vivo CRISPR screens, paracrine rescue of individual mutants can occur. While we will test IST mutants, it is probably not the top candidate as it only prevents upregulation of ISGs after exposure to IFN-γ, but has probably no role in already stimulated cells. As we still observe strong rescue past day 3, when IFN-γ levels are already elevated (Nishiyama 2020 Parasitol Int), IST probably plays no dominant role. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors): 

      (1) Figure 1 - it's not obvious what concentration of IFN-gamma is being used in these assays (sorry if this is stated somewhere else). 

      All in vitro experiments were performed with 100 U/ml IFN-γ, as stated in the Material & Methods section; we have now also added this information to the legend of Figure 1.

      (2) Figure 3 This reviewer wonders if earlier differences are buried in the data sets. In Figure 3b it looks like there are early differences but this is lost in the collated data analysis in 3c. An early difference is quite apparent in Figure 2. 

      We agree with the reviewer that a difference is visible at day 3 and 5 in Figure 3b, however differences between experimental groups became statistically significant only at day 7 in Figure 3c (N = 4 biological replicates). We cannot compare results between Figure 3c and Figure 2c as the latter reports 100% WT or ΔMYR1 infections and not 20:80 mixes.

      (3) The authors conclude from their in vitro studies that MYR-1 is not required for in vitro growth in IFN-g activated macrophages. Given that the WT parasites still rescue MYR KO parasites in RAG mice it does imply that this paracrine effect would impact early innate responses. Since RAG mice do have a strong ILC/NK cell response that leads to the local production of IFN-g it would seem like a reasonable candidate. Do the authors know if the MYR KO have improved growth in the absence of IFN-g in vivo? This could be done using KO mice or with IFN-g neutralization. 

      MYR1 displayed a neutral score in CRISPR screens in IFN-γ KO mice (Tachibana et al Cell Reports 2023), suggesting that lack of IFN-γ does not specifically improve MYR1 mutant growth compared to other mutants in a pool. We believe that the rescue is rather driven by other cytokines that have been shown to be altered in a MYR1-dependent manner (e.g., CCL2, IL-6, IL-12). But as laid out before, this is the subject of future studies.

      This is a submission that might benefit from a graphical model of how the authors view this system working. 

      We agree with the reviewer and we added a graphical model to the manuscript. 

      Reviewer #2 (Recommendations for the authors): 

      The authors previously published a study that combines CRISPR screens in Toxoplasma and host transcriptome by scRNA-seq (Butterworth et al., Cell Host Microbe 2023). I think the authors possess transcriptome of ∆MYR1-infected HFFs. Although I understand this screen is conducted in in-vitro culture and human fibroblasts, are there any differentially expressed genes or pathways that could explain the paracrine rescue phenomenon described in this manuscript?

      We thank the reviewer for this insightful comment, which is however hard to address.  Thousands of host cell genes within multiple pathways are affected by MYR1 deletion (Naor et al. mBio 2018; Butterworth et al. Cell Host Microbe 2023). Therefore the PerturbSeq dataset is not helpful to pinpoint specific immune mechanisms of rescue, and is speculative without any experimentation to back it up. However, we added a sentence in line 350 of the discussion to highlight known MYR1-related effects on immune-related pathways. “Individual MYR-related effectors that may be responsible for the paracrine rescue have not been investigated here and we hypothesise that the phenotype is likely the concerted result of multiple effectors that affect cytokine secretion. For example, previous studies showed that both GRA18 and GRA28 can induce release of CCL22 from infected cells (He 2018 eLife; Rudzki 2021 mBio), while GRA16 and HCE1/TEEGR impair NF-kB signalling and the potential release of pro-inflammatory cytokines such as IL-6, IL-1β and TNF (Seo 2020 Int J Mol Sci; Braun 2019 Nat Microbiol). Regardless of the effector(s), our results highlight an important novel function of MYR1-dependent effectors by establishing a supportive environment in trans for Toxoplasma growth within the peritoneum.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths and weaknesses:

      Although the revised manuscript has significantly improved in the quality of pictures, there still seems to be a discrepancy in Figure 2A: the quantification result suggested that NIC (1um) treatment increased the number of colonies from 300 to around 450 (1.5-fold), whereas the representative picture showed a difference of 3 to 12 living organoids (4-fold).

      As the reviewer points out, the selected picture was not a representative image of the “control” group in Figure 2A. We have replaced it with a new representative image in this revised version.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      A minor point to be corrected:

      Please consider removing "In consistent with this notion", which is repetitive with "Similarly".

      " NIC is supposed to activate Wnt signaling via Hippo-YAP/TAZ and Notch signaling. In consistent with this notion. Similarly, the expression of target proteins (Sox9, TCF4 and, C-myc)..."

      We corrected it according to the reviewer’s suggestion.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In their manuscript, Gomez-Frittelli and colleagues characterize the expression of cadherin6 (and -8) in colonic IPANs of mice. Moreover, they found that these cdh6-expressing IPANs are capable of initiating colonic motor complexes in the distal colon, but not proximal and midcolon. They support their claim by morphological, electrophysiological, optogenetic, and pharmacological experiments.

      Strengths:

      The work is very impressive and involves several genetic models and state-of-the-art physiological setups including respective controls. It is a very well-written manuscript that truly contributes to our understanding of GI-motility and its anatomical and physiological basis. The authors were able to convincingly answer their research questions with a wide range of methods without overselling their results.

      We greatly appreciate the reviewer’s time, careful reading and support of our study.

      Weaknesses:

      The authors put quite some emphasis on stating that cdh6 is a synaptic protein (in the title and throughout the text), which interacts in a homophilic fashion. They deduce that cdh6 might be involved in IPAN-IPAN synapses (line 247ff.). However, Cdh6 does not only interact in synapses and is expressed by non-neuronal cells as well (see e.g., expression in the proximal tubuli of the kidney). Moreover, cdh6 does not only build homodimers, but also heterodimers with Cdh9 as well as Cdh7, -10, and -14 (see e.g., Shimoyama et al. 2000, DOI: 10.1042/0264-6021:3490159). It would therefore be interesting to assess the expression pattern of cdh6-proteins using immunostainings in combination with synaptic markers to substantiate the authors' claim or at least add the possibility of cell-cell-interactions other than synapses to the discussion. Additionally, an immunostaining of cdh6 would confirm if the expression of tdTomato in smooth muscle cells of the cdh6-creERT model is valid or a leaky expression (false positive).

      We agree with the reviewer that Cdh6 could be mediating some other cell-cell interaction besides synapses between IPANs, and will include more on this in the discussion. Cdh6 primarily forms homodimers but, as the reviewer points out, has been known to also form heterodimers with some other cadherins. We performed RNAscope in the colonic myenteric plexus with Cdh7 and found no expression (data not shown). Cdh10 is suggested to have very low expression (Drokhlyansky et al., 2020), possibly in putative secretomotor vasodilator neurons, and Cdh14 has not been assayed in any RNAseq screens. We attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018) but our efforts did not result in sufficient signal or resolution to identify synapses in the ENS, which remain broadly challenging to assay. Similarly, neither immunostaining with the Cdh6 antibody nor RNAscope was able to confirm Cdh6 expression in tdT-expressing muscle cells. We will address these caveats in the discussion section.

      (1) E. Drokhlyansky, C. S. Smillie, N. V. Wittenberghe, M. Ericsson, G. K. Griffin, G. Eraslan, D. Dionne, M. S. Cuoco, M. N. Goder-Reiser, T. Sharova, O. Kuksenko, A. J. Aguirre, G. M. Boland, D. Graham, O. Rozenblatt-Rosen, R. J. Xavier, A. Regev, The Human and Mouse Enteric Nervous System at Single-Cell Resolution. Cell 182, 1606-1622.e23 (2020).

      (2) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).

      Reviewer #2 (Public review):

      Summary:

      Intrinsic primary afferent neurons are an interesting population of enteric neurons that transduce stimuli from the mucosa, initiate reflexive neurocircuitry involved in motor and secretory functions, and modulate gut immune responses. The morphology, neurochemical coding, and electrophysiological properties of these cells have been relatively well described in a long literature dating back to the late 1800's but questions remain regarding their roles in enteric neurocircuitry, potential subsets with unique functions, and contributions to disease. Here, the authors provide RNAscope, immunolabeling, electrophysiological, and organ function data characterizing IPANs in mice and suggest that Cdh6 is an additional marker of these cells.

      Strengths:

      This paper would likely be of interest to a focused enteric neuroscience audience and increase information regarding the properties of IPANs in mice. These data are useful and suggest that prior data from studies of IPANs in other species are likely translatable to mice.

      We appreciate the reviewer’s support of our study and insightful critiques for its improvement.

      Weaknesses:

      The advance presented here beyond what is already known is minimal. Some of the core conclusions are overstated and there are multiple other major issues that limit enthusiasm. Key control experiments are lacking and data do not specifically address the properties of the proposed Cdh6+ population.

      Major weaknesses:

      (1) The novelty of this study is relatively low. The main point of novelty suggests an additional marker of IPANs (Cdh6) that would add to the known list of markers for these cells. How useful this would be is unclear. Other main findings basically confirm that IPANs in mice display the same classical characteristics that have been known for many years from studies in guinea pigs, rats, mice and humans.

      We appreciate the already existing markers for IPANs in the ENS and the existing literature characterizing these neurons. The primary intent of this study was to use these well established characteristics of IPANs in both mice and other species to characterize Cdh6-expressing neurons in the mouse myenteric plexus and confirm their classification as IPANs.

      (2) Some of the main conclusions of this study are overstated and claims of priority are made that are not true. For example, the authors state in lines 27-28 of the abstract that their findings provide the "first demonstration of selective activation of a single neurochemical and functional class of enteric neurons". This is certainly not true since Gould et al (AJP-GIL 2019) expressed ChR2 in nitrergic enteric neurons and showed that activating those cells disrupted CMC activity. In fact, prior work by the authors themselves (Hibberd et al., Gastro 2018) showed that activating calretinin neurons with ChR2 evoked motor responses. Work by other groups has used chemogenetics and optogenetics to show the effects of activating multiple other classes of neurons in the gut.

      We believe our phrasing in this sentence was misleading. Whilst single neurochemical classes of enteric neurons have been manipulated to alter gut functions, none of these instances to date represents manipulation of a single functional class of enteric neurons. In the given examples, NOS and calretinin are each expressed to varying degrees across putative motor neurons, interneurons and IPANs. In contrast, Cdh6 is restricted to IPANs and therefore this study is the first optogenetic investigation of enteric neurons from a single putative functional class. We will alter this segment in the revised manuscript to emphasize this point and differentiate this study from previous ones.

      (3) Critical controls are needed to support the optogenetic experiments. Control experiments are needed to show that ChR2 expression a) does not change the baseline properties of the neurons, b) that stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, and c) that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions focused on here.

      We completely agree that controls are essential. However, our paper is not the first to express ChR2 in enteric neurons. Authors of our paper have shown in Hibberd et al. 2018 that expression of ChR2 in a heterogeneous population of myenteric neurons did not change network properties of the myenteric plexus. This was demonstrated by the lack of change in control CMC characteristics in mice expressing ChR2 under basal conditions (without blue light exposure). Regarding question (b), whether stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons: we show the restricted expression of ChR2 in IPANs and that motor responses (to blue light) are blocked by selective nerve conduction blockade.

      Regarding question (c), whether stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions: we would not expect each region of the gut to behave comparably. This is because the different gut regions (i.e. proximal, mid, distal) are very different anatomically, as is the anatomy of the myenteric plexus and myenteric ganglia in each region, including the density of IPANs within each ganglion, in addition to the presence of different patterns of electrical and mechanical activity [Spencer et al., 2020]. Hence, it is difficult to expect that stimulation of ChR2 should induce similar physiological responses across regions. The motor output we record in our study (CMCs) is a unified motor program that involves the temporal coordination of hundreds of thousands of enteric neurons and a complex neural circuit that we have previously characterized [Spencer et al., 2018]. However, no study until now has been able to selectively stimulate a single functional class of enteric neurons (with light) and so avoid indiscriminate activation of other classes of neurons.

      (1) T. J. Hibberd, J. Feng, J. Luo, P. Yang, V. K. Samineni, R. W. Gereau, N. Kelley, H. Hu, N. J. Spencer, Optogenetic Induction of Colonic Motility in Mice. Gastroenterology 155, 514-528.e6 (2018).

      (2) N. J. Spencer, L. Travis, L. Wiklendt, T. J. Hibberd, M. Costa, P. Dinning, H. Hu, Diversity of neurogenic smooth muscle electrical rhythmicity in mouse proximal colon. American Journal of Physiology-Gastrointestinal and Liver Physiology 318, G244–G253 (2020).

      (3) N. J. Spencer, T. J. Hibberd, L. Travis, L. Wiklendt, M. Costa, H. Hu, S. J. Brookes, D. A. Wattchow, P. G. Dinning, D. J. Keating, J. Sorensen, Identification of a Rhythmic Firing Pattern in the Enteric Nervous System That Generates Rhythmic Electrical Activity in Smooth Muscle. J. Neurosci. 38, 5507–5522 (2018).

      (4) The electrophysiological characterization of mouse IPANs is useful but this is a basic characterization of any IPAN and really says nothing specifically about Cdh6+ neurons. The electrophysiological characterization was also only done in a small fraction of colonic IPANs, and it is not clear if these represent cell properties in the distal colon or proximal colon, and whether these properties might be extrapolated to IPANs in the different regions. Similarly, blocking IH with ZD7288 affects all IPANs and does not add specific information regarding the role of the proposed Cdh6+ subtype.

      Our electrophysiological characterization was targeted to a subset of Cdh6+ neurons using Hb9:GFP expression. As in our response to comment (1) above, we used these experiments to confirm classification of Cdh6+ (Hb9:GFP+) neurons in the distal colon as IPANs. We will clarify that these experiments were performed in the distal colon and agree that we cannot extrapolate that these properties are also representative of IPANs in the proximal colon. We apologize that this was confusing. Finally, we agree with the reviewer that ZD7288 affects all IPANs in the ENS and will clarify this in the text.

      (5) Why SMP IPANs were not included in the analysis of Cdh6 expression is a little puzzling. IPANs are present in the SMP of the small intestine and colon, and it would be useful to know if this proposed marker is also present in these cells.

      We agree with the reviewer. In addition to characterizing Cdh6 in the myenteric plexus, it would be interesting to query if sensory neurons located within the SMP also express Cdh6. Our preliminary data (n=2) show ~6-12% tdT/Hu neurons in Cdh6-tdT ileum and colon (data not shown). We will add a sentence to the discussion.

      (6) The emphasis on IH being a rhythmicity indicator seems a bit premature. There is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS.

      Regarding the statement that there is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS: we agree with the reviewer that rhythm generation by IH and IT in the ENS has not been explicitly confirmed. We are confident the reviewer agrees that an absence of evidence is not evidence of absence, and we note that the presence of IH has been well described in enteric neurons. We will modify the text in the results to indicate more clearly that IH and IT are known to participate in rhythm generation in thalamocortical circuits, though their roles in the ENS remain unknown. Our consideration of the potential role of IH or IT in rhythm generation or oscillatory firing of the ENS is confined to speculation in the discussion section of the text.

      (7) As the authors point out in the introduction and discuss later on, Type II Cadherins such as Cdh6 bind homophilically to the same cadherin at both pre- and post-synapse. The apparent enrichment of Cdh6 in IPANs would suggest extensive expression in synaptic terminals that would also suggest extensive IPAN-IPAN connections unless other subtypes of neurons express this protein. Such synaptic connections are not typical of IPANs and raise the question of whether or not IPANs actually express the functional protein and if so, what might be its role. Not having this information limits the usefulness of this as a proposed marker.

      We agree with the reviewer that the proposed IPAN-IPAN connection is novel although it has been proposed before (Kunze et al., 1993). As detailed in our response to Reviewer #1, we attempted to confirm Cdh6 protein expression, but were unsuccessful, due to insufficient signal and resolution. We therefore discuss potential IPAN interconnectivity in the discussion, in the context of contrasting literature.

      (1) W. A. A. Kunze, J. B. Furness, J. C. Bornstein, Simultaneous intracellular recordings from enteric neurons reveal that myenteric AH neurons transmit via slow excitatory postsynaptic potentials. Neuroscience 55, 685–694 (1993).

      (8) Experiments shown in Figures 6J and K use a tethered pellet to drive motor responses. By definition, these are not CMCs as stated by the authors.

      The reviewer makes a valid criticism as to the terminology, since tethered pellet experiments do not record propagation. We believe the periodic bouts of propulsive force on the pellet are triggered by the same activity underlying the CMC. In our experience, these activities have similar periodicity and force, and identical pharmacological properties. Consistent with this, we also tested full colons (n = 2) set up for typical CMC recordings by multiple force transducers, finding that CMCs were abolished by ZD7288, similar to fixed pellet recordings (data not shown).

      (9) The data from the optogenetic experiments are difficult to understand. How would stimulating IPANs in the distal colon generate retrograde CMCs and stimulating IPANs in the proximal colon do nothing? Additional characterization of the Cdh6+ population of cells is needed to understand the mechanisms underlying these effects.

      We agree that the different optogenetic responses in the proximal and distal colon are challenging to interpret, but perhaps not surprising in the wider context. It is possible that the different optogenetic responses in this study reflect not only regional differences in the Cdh6+ neuronal populations, but also differences in neural circuits within these gut regions. A study some time ago by the authors showed that electrical stimulation of the proximal mouse colon was unable to evoke a retrograde (aborally) propagating CMC (Spencer, Bywater, 2002), but stimulation of the distal colon was readily able to. We concluded that at the oral lesion site there is a preferential bias of descending inhibitory nerve projections, since the ascending excitatory pathways have been cut off. In contrast, stimulation of the distal colon was readily able to activate an ascending excitatory neural pathway, and hence induce the complex CMC circuits required to generate an orally propagating CMC. Indeed, other recent studies have added to a growing body of evidence for significant differences in the behaviors and neural circuits of the two regions (Li et al., 2019, Costa et al., 2021a, Costa et al., 2021b, Nestor-Kalinoski et al., 2022). We will expand this discussion.

      (1) N. J. Spencer, R. A. Bywater, Enteric nerve stimulation evokes a premature colonic migrating motor complex in mouse. Neurogastroenterology & Motility 14, 657–665 (2002).

      (2) Li Z, Hao MM, Van den Haute C, Baekelandt V, Boesmans W, Vanden Berghe P (2019) Regional complexity in enteric neuron wiring reflects diversity of motility patterns in the mouse large intestine. Elife 8.

      (3) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Dinning PG, Brookes SJ, Spencer NJ (2021a) Motor patterns in the proximal and distal mouse colon which underlie formation and propulsion of feces. Neurogastroenterol Motil e14098.

      (4) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Smolilo DJ, Dinning PG, Brookes SJ, Spencer NJ (2021b) Characterization of alternating neurogenic motor patterns in mouse colon. Neurogastroenterol Motil 33:e14047.

      (5) Nestor-Kalinoski A, Smith-Edwards KM, Meerschaert K, Margiotta JF, Rajwa B, Davis BM, Howard MJ (2022) Unique Neural Circuit Connectivity of Mouse Proximal, Middle, and Distal Colon Defines Regional Colonic Motor Patterns. Cell Mol Gastroenterol Hepatol 13:309-337.e303.

    1. Author response:

      We are very pleased to see these positive reviews of our preprint.

      Reviewers 1 and 3 raise issues around PIP-PP1 interactions.

      (1) Role of the “RVxF-ΦΦ-R-W string”

      Most PIPs interact with the globular PP1 catalytic core through short linear interaction motifs (SLiMs) and Choy et al (PNAS 2014) previously showed that many PIPs interact with PP1 through a conserved trio of SLiMs, RVxF-ΦΦ-R, which is also present in the Phactrs.

      Previous structural analysis showed the trajectory of the PPP1R15A/B, Neurabin/Spinophilin (PPP1R9A/B), and PNUTS (PPP1R10) PIPs across the PP1 surface encompasses not only the RVxF-ΦΦ-R trio, but also additional sequences C-terminal to it (Chen et al, eLife, 2015). This extended trajectory is maintained in the Phactr1-PP1 complex (Fedoryshchak et al, eLife, 2020). Based on structural alignment we proposed the existence of an additional hydrophobic “W” SLiM that interacts with the PP1 residues I133 and Y134.

      The extended “RVxF-ΦΦ-R-W” interaction brings sequences C-terminal to the “W” SLiM into the vicinity of the hydrophobic groove that adjoins the PP1 catalytic centre. In the Phactr1/PP1 complex, these sequences remodel the groove, generating a novel pocket that facilitates sequence-specific substrate recognition.

      This raises the possibility that sequences C-terminal to the extended “RVxF-ΦΦ-R-W string” in the other complexes also confer sequence-specific substrate recognition, and our study aims to test this hypothesis. Indeed, the hydrophobic groove structures of the Neurabin/Spinophilin/PP1 and Phactr1/PP1 complexes differ significantly (Ragusa et al, 2010; see Fedoryshchak et al 2020, Fig2 FigSupp1).

      (2) Orientation of the W side chain

      Reviewer 1 points out that in the substrate-bound PP1/PPP1R15A/Actin/eIF2 pre-dephosphorylation complex the W sidechain is inverted with respect to its orientation in the PP1-PPP1R15B complex (Yan et al, NSMB 2021). The authors proposed that this may reflect the role of actin in assembly of the quaternary complex. This does not necessarily invalidate the notion that sequences C-terminal to the “W” motif might play a role in actin-independent substrate recognition, and we therefore consider our inclusion of the R15A/B fusions in our analysis to be reasonable.

      (3) Conservation of W

      The motif ‘W’ does not mandate tryptophan: Phactrs and PPP1R15A/B indeed have W at this position, but Neurabin/Spinophilin contain VDP, which makes similar interactions. Similarly, the “RVxF” motifs in Phactr1, Neurabin/Spinophilin, PPP1R15A/B and PNUTS are LIRF, KIKF, KV(R/T)F and TVTW respectively.

      In our revision, we will present comparisons of the differentially remodelled/modified PP1 hydrophobic groove in the various complexes, and discuss the different orientations of the tryptophan in the previously published PPP1R15A/PP1 and PPP1R15B/PP1 structures. We will also address the other issues raised by the referees.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Ma, Yang et al. report a new investigation aimed at elucidating one of the key nutrients S. Typhimurium (STM) utilizes within the nutrient-poor intracellular niche of the macrophage, focusing on the amino acid beta-alanine. From these data, the authors report that beta-alanine plays an important role in mediating STM infection and virulence. The authors employ a multidisciplinary approach that includes some mouse studies and ultimately propose a mechanism by which panD, involved in B-Ala synthesis, mediates the regulation of zinc homeostasis in Salmonella. The impact of this work is questionable. There are already many studies reporting Salmonella-effector interactions, and while this adds to that knowledge it is not a significant advance over previous studies. While the authors are investigating an interesting question, the work has two important weaknesses; if addressed, the conclusions of this work and broader relevance to bacterial pathogenesis would be enhanced.

      Strengths:

      This reviewer appreciates the multidisciplinary nature of the work. The overall presentation of the figure graphics is clear and organized.

      Weaknesses:

      First, this study is very light on mechanistic investigations, even though a mechanism is proposed. Zinc homeostasis in cells, and roles in bacterial infections, are complex processes with many players. The authors have not thoroughly investigated the mechanisms underlying the roles of B-Ala and panD in impacting STM infection such that other factors cannot be ruled out. Defining the cellular Zn2+ content of STM in vivo would be one such route. Without further mechanistic studies, the possibility cannot be ruled out that the authors have simply deleted two important genes and seen an infection defect - this may not relate directly to Zn2+ acquisition.

      Thank you for your patient and thoughtful reading as well as the constructive comments and advice about our manuscript. We will revise the manuscript based on your comments and suggestions.

      You are right that this work has not thoroughly investigated the mechanisms underlying the roles of β-Ala, panD and zinc in impacting Salmonella infection. We will perform additional experiments to measure zinc content during Salmonella infection in vivo and in vitro, according to your suggestions.

      We agree that other unknown mechanism(s) are also involved in the virulence regulation by β-Ala in Salmonella, as our results showed that the double mutant ΔpanDΔznuA (unable to synthesize β-Ala or take up zinc) is more attenuated than the single mutant ΔznuA (Figure 5D), suggesting that the contribution of β-Ala to the virulence of Salmonella is only partially dependent on zinc acquisition. We will reword the related description throughout the manuscript for clarity.

      Second, the authors hint at their newly described mechanism/pathway being important for disease and possibly a target for therapeutics. This claim is not justified given that they have employed a single STM strain, which was isolated from chickens and is not even a clinical isolate. The authors could enhance the impact of their findings and relevance to human disease by demonstrating it occurs in human clinical isolates and possibly other serovars. Further, the use of mouse macrophage as a model, and mice, have limited translatability to human STM infections.

      We thank you for your comments and advice regarding our manuscript and are delighted to accept them.

      You are right that our current findings are relatively limited and not sufficient to support therapeutic claims. We will reword the related description throughout the manuscript. Based on this comment, we will also use Salmonella Typhi and human macrophages to perform additional experiments to extend our findings. Salmonella Typhi is a human-restricted Salmonella serovar and the cause of typhoid fever, a severe and potentially lethal systemic disease. Salmonella Typhimurium (STM) causes systemic disease in mice that resembles typhoid fever in humans, and has been widely used to explore the pathogenesis of Salmonella.

      Reviewer #2 (Public review):

      Summary:

      Salmonella exploits host- and bacteria-derived β-alanine to efficiently replicate in host macrophages and cause systemic disease. β-alanine executes this by increasing the expression of zinc transporter genes and therefore the uptake of zinc by intracellular Salmonella.

      Strengths:

      The experiments designed are thorough and the claims made are directly related to the outcome of the experiments. No overreaching claims were made.

      Weaknesses:

      A little deeper insight was expected, particularly towards the mechanistic aspects. For example, zinc transport was found to be the cause of the b-alanine-mediated effect on Salmonella intracellular replication. It would have been very interesting to see which are the governing factors that may get activated or inhibited due to Zn accumulation that supports such intracellular replication.

      We appreciate your review and advice. We will design and perform additional experiments to further investigate the mechanisms by which β-Ala, panD and zinc influence Salmonella infection, according to your suggestions. For example, we will detect the content of zinc during Salmonella infection in vivo and in vitro.

      Reviewer #3 (Public review):

      Summary:

      Salmonella is interesting due to its life within a compact compartment, which we call SCV or Salmonella containing vacuole in the field of Salmonella. SCV is a tight-fitting vacuole where the acquisition of nutrients is a key factor for Salmonella. Among many nutrients, the authors focussed on beta-alanine. It is also known from many other studies that Salmonella requires beta-alanine. The authors have done in vitro RAW macrophage infection assays and in vivo mouse infection assays to see the life of Salmonella in the presence of beta-alanine. They concluded that beta-alanine modulates the expression of many genes including zinc transporters which are required for pathogenesis.

      Strengths:

      This study made a couple of knockouts in Salmonella and did a transcriptomic investigation to understand the global gene expression pattern.

      Weaknesses:

      The following questions are unanswered:

      (1) It is not clear how the exogenous beta-alanine is taken up by macrophages.

      We thank the reviewer for the question. It is reported that β-alanine is delivered to eukaryotic cells through TauT (SLC6A6) and PAT1 (SLC36A1) transporters (Am J Physiol Cell Physiol. 2020 Apr 1;318(4):C777-C786; Br J Pharmacol 161: 589 –600, 2010; Biochim Biophys Acta 1194: 44 –52, 1994). We will add this information in the revised manuscript.

      (2) It is not clear how the Beta-alanine from the cytosol of the macrophage enters the SCV.

      Thank you for pointing this out. You are right that this question remains unresolved. We will do our best to address this issue by reviewing the literature and by designing and performing additional experiments.

      (3) It is not clear how the beta-alanine from SCV enters the bacterial cytosol.

      Thank you for the question. We have attempted to find the transporter of β-alanine in Salmonella, but we found that the CycA transporter transports β-alanine in Escherichia coli but not in Salmonella, even though Salmonella is closely related to E. coli.

      According to your suggestion, we will perform additional experiments to verify whether BasC is involved in the transport of β-alanine into Salmonella cytosol.

      (4) There is no clarity on the utilization of exogenous beta-alanine of the host and the de novo synthesis of beta-alanine by panD of Salmonella.

      Thank you for the question. Our results showed that β-alanine concentrations were downregulated in the Salmonella-infected RAW264.7 cells, and the replication of Salmonella in RAW264.7 cells was significantly increased with the addition of β-alanine to the culture medium (RPMI) of RAW264.7 cells, implying that intracellular Salmonella use host-derived β-alanine for growth. Unfortunately, we have not found the transporter of exogenous β-alanine into Salmonella cytosol. We will perform additional experiments to verify whether BasC is involved in the transport of β-alanine into Salmonella cytosol, or search for other transporters that are responsible for the uptake of β-alanine into Salmonella.

      Upon confirming the β-alanine transporter in Salmonella, we will compare the intracellular replication and virulence of WT and the transporter mutant strain, via cell and mouse infection assays. A decrease in the replication ability and virulence of the mutant strain relative to WT would suggest that Salmonella takes up exogenous beta-alanine from the host to enhance its intracellular replication and its virulence in mice.

      We have found that the replication of the Salmonella panD mutant in macrophages and its virulence in mice were significantly decreased relative to WT, suggesting that the de novo synthesis of β-alanine is important for Salmonella intracellular replication and virulence. To further confirm that both uptake of host-derived β-alanine and de novo synthesis of β-alanine are critical for the full virulence of Salmonella, we will generate a double mutant lacking panD and the β-alanine transporter gene. A decrease in the replication ability and virulence of the double mutant compared with each single mutant would suggest that Salmonella relies on both uptake of exogenous host β-alanine and de novo synthesis of β-alanine for full virulence.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      The paper by Tolossa et al. presents classification studies that aim to predict the anatomical location of a neuron from the statistics of its in-vivo firing pattern. They study two types of statistics (ISI distribution, PSTH) and try to predict the location at different resolutions (region, subregion, cortical layer).

      Strengths:

      This paper provides a systematic quantification of the single-neuron firing vs location relationship.

      The quality of the classification setup seems high.

      The paper uncovers that, at the single neuron level, the firing pattern of a neuron carries some information on the neuron's anatomical location, although the predictive accuracy is not high enough to rely on this relationship in most cases.

      Thank you for your thoughtful feedback. The level of predictive accuracy offered by our current approach, while far above chance, is insufficient for electrode localization in most cases. However, we speculate that our results represent a lower limit on possible performance; future improvements are almost certain as larger datasets are generated, more diverse features of neural activity are employed, and more advanced ML tools are implemented. We note that the current performance indicates a far more reliable embedding of anatomy in spiking than suggested by the modest statistical significance previously described in the literature. It would have been impossible to achieve this without the tremendous resources provided by the Allen Institute. In our revision, we will clarify that major performance improvements are both possible and probable.

      Weaknesses:

      As the authors mention in the Discussion, it is not clear whether the observed differences in firing are epiphenomenal. If the anatomical location information is useful to the neuron, to what extent can this be inferred from the vicinity of the synaptic site, based on the neurotransmitter and neuromodulator identities? Why would the neuron need to dynamically update its prediction of the anatomical location of its pre-synaptic partner based on activity when that location is static, and if that information is genetically encoded in synaptic proteins, etc (e.g., the type of the synaptic site)? Note that the neuron does not need to classify all possible locations to guess the location of its pre-synaptic partner because it may only receive input from a subset of locations. If an argument on activity-based estimation being more advantageous to the neuron than synaptic site-based estimation cannot be made, I believe limiting the scope of the paper (e.g., in the Introduction) to an epiphenomenal observation and its quantification will improve the scientific quality.

      Summarily, in response to the two reviewers, we will minimize our discussion of this question in the revision. However, given that our results are either epiphenomenal or functional, we feel that it is important to indicate these possibilities, even if this indication is succinct and conservative.

      In pursuit of a more concise revision, we will not expand our discussion to accommodate this interesting conversation with the reviewer, but we are excited to briefly offer our perspective here.

      Regarding the epiphenomenal nature of our observations: this is a complex question that would be challenging but not impossible to validate experimentally. It has been previously established that neurons, especially those that integrate inputs from a variety of regions and are involved in diverse functions, could benefit from mechanisms for dynamically parsing inputs (Gutig, Sompolinsky 2006). Neurotransmitter and neuromodulator identities may indeed convey some information about presynaptic neuron location (e.g., NE may originate from the locus coeruleus). However, hypothetically, the binding of a neurotransmitter only bears on the postsynaptic neuron via ionic current, or second messenger activity. Postsynaptic neurons do not consume or otherwise endocytose the neurotransmitter, thus the ability of a neuron to “know” the presynaptic identity is a function of induced postsynaptic activity. Certainly, there are multiple streams of information that can provide insight into anatomical location all taking the ultimate form of neural activity and membrane dynamics. This would be broadly consistent with (for example) reward prediction error which is evident in dopamine release, firing rates, spiking patterns, and oscillatory rhythms.

      We could imagine a possible role for the embedding of location in spiking patterns. It is important to note that many neurons in neighboring areas share common neurotransmitters (e.g., glutamate, GABA). Neurons receiving input from multiple regions with similar neurotransmitter profiles could benefit from additional information in the spiking patterns for distinguishing input sources, especially for multimodal integration. For instance, an inferior parietal lobule neuron or microcircuit could be downstream from both auditory cortex (listening) and Broca’s area (speaking). Imagine an individual is in a crowded coffee shop waiting for their drink order to be called while speaking to their friend. In this scenario, it may be important to recognize region-specific activity and thus selectively attend to it. Thus, it is unlikely that neurons actively update a “location prediction,” but rather that location-related information is passively embedded in spike patterning and this might be dynamically leveraged in computation. We emphasize that this is a simplified conceptual example and not a hypothesis that we test in the paper. This conversation, however, is a wonderful example of the thought experiments that we hope will grow from this type of work.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Tolossa et al. analyze Inter-spike intervals from various freely available datasets from the Allen Institute and from a dataset from Steinmetz et al. They show that they can modestly decode between gross brain regions (Visual vs. Hippocampus vs. Thalamus), and modestly separate sub-areas within brain regions (DG vs. CA1 or various visual brain areas).

      Strengths:

      The paper is reasonably well written, and the definitions are quite well done. For example, the authors clearly explained transductive vs. inductive inference in their decoders. E.g., transductive learning allows the decoder to learn features from each animal, whereas inductive inference focuses on withheld animals and prioritizes the learning of generalizable features.

      Thank you!
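
      The distinction the reviewer highlights can be illustrated with a minimal sketch of the two data splits. The record layout, function names, and split sizes below are hypothetical and are not taken from the manuscript; the sketch only shows that a transductive split withholds individual neurons while an inductive split withholds entire animals.

```python
import random

# Hypothetical records: (animal_id, neuron_id). Real data would carry spike trains.
records = [(animal, neuron) for animal in range(5) for neuron in range(20)]

def transductive_split(records, test_fraction=0.2, seed=0):
    """Hold out individual neurons; every animal may appear in train and test."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def inductive_split(records, n_test_animals=1, seed=0):
    """Hold out whole animals; test animals are never seen during training."""
    rng = random.Random(seed)
    animals = sorted({animal for animal, _ in records})
    test_animals = set(rng.sample(animals, n_test_animals))
    train = [r for r in records if r[0] not in test_animals]
    test = [r for r in records if r[0] in test_animals]
    return train, test

trans_train, trans_test = transductive_split(records)
ind_train, ind_test = inductive_split(records)
```

      In the inductive case the decoder can only succeed by learning features that generalize across animals, which is why it is the stricter test.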

      Weaknesses:

      However, even with some of these positive aspects, I still found the manuscript to be a laundry list of results, where some results are overly explained and not particularly compelling or interesting, whereas interesting results are not strongly described or emphasized. The overall problem is that the study is not cohesive, and the authors need to either come up with a tool or demonstrate a scientific finding. The current version attempts to split the middle and thus is not as impactful as it could be.

      In our revision, we will endeavor to present our results in line with your suggestions. Thank you for the careful and thorough feedback that will improve the readability of our manuscript. We strove to be complete in establishing the logic leading to our ultimate finding: that a robust code for anatomical location can be extracted from single neuron spike trains, but not from more traditional descriptions of neural activity. Our detection of this code, albeit not perfect in performance, is, in most cases, both far above chance levels and robust to animal identity and laboratory of origin. Our presentation of these results is cohesive inasmuch as we sequentially establish a series of results that build towards a concluding set of experiments. We start by establishing a baseline via standard measurements and then explore more challenging problems through more complex models that build toward our final test. Based on your feedback, we will contract and expand elements of this sequence.

      While our findings raise the possibility of developing a computational tool for electrode localization, pending additional features and/or datasets, our current focus is on establishing the neurobiological principle of anatomical embedding in spike trains. The purpose of briefly mentioning a possible application is that we hope to encourage those engaged in machine-learning on multi-modal neural data that this problem is tractable, yet still open. Based on your feedback, we will clarify that the focus of our current work is not an introduction of a new tool.

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      Mitotic kinesins carry out crucial roles in intracellular motility and mitotic spindle organization. Although many mitotic kinesins have been extensively studied, a few conserved mitotic motors remain poorly explored, including chromosome-associated kinesins. Here, Furusaki et al. reconstitute recombinant chromosome-associated kinesin, or chromokinesin (Kid), and reveal processive plus-end motility along microtubules. The authors purify multiple versions of Kid, revealing dimeric organization and processive, plus-end-directed motility along microtubules that depends on the conserved motor domains, neck linkers, and coiled-coil regions. The study reveals for the first time that KID can recruit and transport duplex DNA along microtubules using its conserved C-terminal DNA binding domain. The work provides crucial revised thinking about the mechanisms of chromokinesins in mitosis as physical processive motors that mobilize chromosomes towards the microtubule plus ends in early metaphase.

      Strengths: 

      The authors reconstitute multiple chromosome-associated kinesin (KID) orthologs from Xenopus and humans with microtubules and determine their oligomerization. The study shows how the coiled-coil and neck linker regions of KID are essential for its function, as their deletion leads to non-processive motility. Chimeras placing the KID coiled-coil and neck linker on the KIF1A motor domain led to the production of a processive recombinant motor, supporting the compatibility of their motility mechanisms. The KID C-terminal tail binds and transports only double-stranded DNA, and its deletion or the use of single-stranded DNA leads to defects in this activity.

      Thank you very much.

      Weaknesses: 

      A minor weakness in the studies is that they do not resolve the mechanisms of KID in binding large duplex DNA molecules or condensed chromatin. The authors suggest a model in which KID forms multimers along large chromosomes that lead to their transport, but this model was not directly tested. 

      Thank you very much for your suggestion.

      We will attempt to observe the movement of longer dsDNA and/or DNA-bead complexes and compare their motility with that of a single KID motor to elucidate the cooperativity of the motor protein.

      Reviewer #2 (Public review): 

      Summary: 

      Previous work in the field highlighted the role of the kinesin-10 motor protein Kid (KIF22) in the polar ejection force during prometaphase. However, the biochemical and biophysical properties of Kid that enabled it to serve in this role were unclear. The authors demonstrate that human and Xenopus Kid proteins are processive kinesins that function as homodimeric molecules. The data are solid and support the findings, although the text could use some editing to improve clarity.

      Strengths: 

      A highlight of the work is the reconstitution of DNA transport in vitro. 

      A second highlight is the demonstration that the monomer vs dimer state is dependent on protein concentration. 

      Thank you very much.

      Weaknesses: 

      The authors make several assumptions of the monomer vs dimer state of various Kid constructs without verifying the protein state using e.g. size exclusion chromatography and/or nanophotometry. They also make statements about monomer-to-dimer transitions on the microtubule without showing or quantifying the data. 

      As the reviewer suggests, the monomer-to-dimer transition on the microtubule is speculative. What we can measure are (1) the monomer-to-dimer ratio in solution and (2) particle movement on microtubules. At pmol/L concentrations, Kid is monomeric in solution but moves processively on microtubules. Because dimerization is generally required for processivity, we suggest that Kid forms a dimer on microtubules.

      To show that Kid forms a dimer on microtubules, we will perform photobleaching assays and measure the fluorescence intensity of each particle on microtubules to determine its oligomeric state.

      The discussion needs to better put the work into context regarding the ability of non-processive motors to work in teams (formerly thought to be the case for Kid) and how their findings on Kid change this prevailing view in the case of polar ejection force. 

      We will look for examples of non-processive motors and include them, with citations, in the Discussion. As this reviewer describes, Kid was originally thought to be a non-processive motor. We hope that our current work will change that view.

      The authors also do not mention previous work on kinesins with non-conventional neck linker/neck coil regions that have been shown to move processively. Their work on Kid needs to be put into this context.

      We had thought that most kinesins belonging to the cargo-transport classes have conserved neck linker and neck coil domains, with Kid being an exception. We will search for more citations, including non-transport classes of kinesins, and rewrite the Discussion.

    1. Author response:

      The following is the authors’ response to the original reviews.

      First of all, we would like to thank the reviewers for their very constructive comments, which helped us to improve the manuscript! In response to the issues raised, we have performed new experiments and made the necessary changes to the manuscript.

      eLife Assessment

      The study describes a valuable new technology in the field of targeted protein degradation that allows identification of E3-ubiquitin ligases that target a protein of interest. The presented data are convincing, however, it is unclear whether the proposed system can be successfully used in high throughput applications. This technology will serve the community in the initial stages of developing targeted protein degraders.

      We thank the eLife editors for the positive assessment and have clarified the scalability of our system for high throughput applications in the revised manuscript (see our response to both reviewers’ comments on weakness point 1).

      Reviewer #1 (Public Review):

      Summary:

      PROTACs are heterobifunctional molecules that utilize the Ubiquitin Proteasome System to selectively degrade target proteins within cells. Upon introduction to the cells, PROTACs capture the activity of the E3 ubiquitin ligases for ubiquitination of the targeted protein, leading to its subsequent degradation by the proteasome. The main benefit of PROTAC technology is that it expands the "druggable proteome" and provides numerous possibilities for therapeutic use. However, there are also some difficulties, including the one addressed in this manuscript: identifying suitable target-E3 ligase pairs for successful degradation. Currently, only a few out of about 600 E3 ligases are used to develop PROTAC compounds, which creates the need to identify other E3 ligases that could be used in PROTAC synthesis. Testing the efficacy of PROTAC compounds has been limited to empirical tests, leading to lengthy and often failure-prone processes. This manuscript addressed the need for faster and more reliable assays to identify the compatible pairs of E3 ligases-target proteins. The authors propose using the RiPA assay, which depends on rapamycin-induced dimerization of FKBP12 protein with FRB domain. The PROTAC technology is advancing rapidly, making this manuscript both timely and essential. The RiPA assay might be useful in identifying novel E3 ligases that could be utilized in PROTAC technology. Additionally, it could be used at the initial stages of PROTAC development, looking for the best E3 ligase for the specific target.

      The authors described an elegant assay that is scalable, easy-to-use, and applicable to a wide range of cellular models. This method allows for the quantitative validation of the degradation efficacy of a given pair of E3 ligase-target proteins, using luciferase activity as a measure. Importantly, the assay also enables the measurement of kinetics in living cells, enhancing its practicality.

      Strengths:

      (1) The authors have addressed the crucial needs that arise during PROTAC development. In the introduction, they nicely describe the advantages and disadvantages of the PROTAC technology and explain why such an assay is needed.

      (2) The study includes essential controls in experiments (important for generating new assay), such as using the FRB vector without E3 ligase as a negative control, testing different linkers (which may influence the efficacy of the degradation), and creating and testing K-less vectors to exclude the possibility of luciferase or FKBP12 ubiquitination instead of WDR5 (the target protein). Additionally, the position of the luc in the FKBP12 vector and the position of VHL in the FRB vector are tested. Different E3 ligases are tested using previously identified target proteins, confirming the assay's utility and accuracy.

      (3) The study identified a "new" E3 ligase that is suitable for PROTAC technology (FBXL).

      We greatly appreciate the reviewer’s positive feedback on our work. To evaluate our system further, in our revised manuscript we have conducted additional analysis on KRASG12D degradation via VHL and CRBN within our K-less system. Consistent with previous findings of VHL-harnessing PROTACs, our assay demonstrated that VHL mediated efficient degradation of KRASG12D while CRBN induced only a minor effect. This new data is presented in Figure 2 - figure supplement 1C of the revised manuscript.

      Weaknesses:

      · It is not clear how feasible it would be to adapt the assay for high-throughput screens.

      Our study is designed as a well-based assay. It is therefore possible but not realistic to evaluate all 600+ human E3 ligases. Nonetheless, for those interested in all E3 ligases, our assay could be adapted for pooled experimental strategies, as demonstrated in Poirson, J., Cho, H., Dhillon, A. et al., Nature 628, 878–886 (2024).

      Our system offers several advantages over pooled screens, including the generation of more quantitative data and faster testing of selected candidates. Pooled screens, by contrast, require more time due to the necessity of next-generation sequencing and bioinformatics analysis. Moreover, in response to the reviewers’ comments, we have included a schematic in the revised manuscript (Figure 4 - figure supplement 1A) that outlines the assay duration and hands-on time for target and E3 ligase candidates.

      · In some experiments, the efficacy of WDR5 degradation tested by immunoblotting appears to be lower than luciferase activity (e.g., Figure 2G and H).

      We concur with the reviewer that in some instances, the degradation observed via immunoblotting appears lower than that indicated by luciferase activity. We have therefore quantified the western blots and added the quantification to the respective blots. The discrepancy may result from the non-linearity of western blots.

      Reviewer #2 (Public Review):

      Summary:

      Adhikari and colleagues developed a new technique, rapamycin-induced proximity assay (RiPA), to identify E3-ubiquitin (ub) ligases of a protein target, aiming at identifying additional E3 ligases that could be targeted for PROTAC generation or ligases that may degrade a protein target. The study is timely, as expanding the landscape of E3-ub ligases for developing targeted degraders is a primary direction in the field.

      Strengths:

      The study's strength lies in its practical application of the FRB:FKBP12 system. This system is used to identify E3-ub ligases that would degrade a target of interest, as evidenced by the reduction in luminescence upon the addition of rapamycin. This approach effectively mimics the potential action of a PROTAC.

      We are delighted with this assessment of our work by the reviewer. To evaluate our system further, in our revised manuscript we have conducted additional analysis on KRASG12D degradation via VHL and CRBN within our K-less system. Consistent with previous findings of VHL-harnessing PROTACs, our assay demonstrated that VHL mediated efficient degradation of KRASG12D while CRBN induced only a minor effect. This new data is presented in Figure 2 - figure supplement 1C of the revised manuscript.

      Weaknesses:

      (1) While the technique shows promise, its application in a discovery setting, particularly for high-throughput or unbiased E3-ub ligase identification, may pose challenges. The authors should provide more detailed insights into these potential difficulties to foster a more comprehensive understanding of RiPA's limitations.

      Our study is designed as a well-based assay. It is therefore possible but not realistic to evaluate all 600+ human E3 ligases. Nonetheless, for those interested in all E3 ligases, our assay could be adapted for pooled experimental strategies, as demonstrated in Poirson, J., Cho, H., Dhillon, A. et al., Nature 628, 878–886 (2024).

      Our system offers several advantages over pooled screens, including the generation of more quantitative data and faster testing of selected candidates. Pooled screens, by contrast, require more time due to the necessity of next-generation sequencing and bioinformatics analysis. Moreover, in response to the reviewers’ comments, we have included a schematic in the revised manuscript (Figure 4 - figure supplement 1A) that outlines the assay duration and hands-on time for target and E3 ligase candidates.

      We also added the following sentences to the Limitations of the study section of the revised manuscript (line 322-326): “While our system offers easy testing of different tagging approaches and due to its simple workflow facilitates the rapid characterization of novel E3 ligases across multiple targets, it is currently not optimized for high-throughput evaluation of all 600+ E3 ligases. Achieving such scale would necessitate further adaptations, including the incorporation of pooled experimental strategies.”

      (2) While RiPA will help identify E3 ligases, PROTAC design would still be empirical. The authors should discuss this limitation. Could the technology be applied to molecular glue generation?

      We agree with the reviewer that our assay rationalizes the choice of E3 ligases but that PROTAC design (“linkerology”) is still mostly empirical. To address this, we included the following line in the Limitations of the study section of our initial manuscript (line 327-330): “Conversely, it is also conceivable that an E3 ligase that can efficiently decrease the levels of a particular target in the RiPA setting may be less suitable for PROTACs, since PROTACs that mimic the steric interaction of the target/E3 pair may not be easily identified in the chemical space.”

      Regarding molecular glues, our assay could also be instrumental in identifying suitable E3 ligases for a target protein prior to screening for molecular glues, provided that the screening system specifically screens E3 ligase and target pairs. However, as most molecular glue screens are currently agnostic to specific E3 ligases or targets, our system may not be applicable in those cases. We have elaborated on this in the discussion section of the revised manuscript (line 271-274): “We envision that this setting will be valuable for identifying the most suitable E3 ligase candidates for PROTACs aimed at specific proteins, and for guiding E3 ligase selection when screening for molecular glues targeting specific E3 ligase and protein pairs.”

      (3) Controls to verify the intended mechanism of action are missing, such as using a proteasome inhibitor or VHL inhibitors/siRNA to verify on-target effects. Verification of the target E3 ligase complex after rapamycin addition via orthogonal approaches, such as IP, should be considered.

      We thank the reviewer for the comment. VHL siRNA in particular would not be informative in this setup, as we overexpress the E3 ligase rather than relying on the endogenous protein.

      To verify the mechanism of action, we performed additional experiments in the presence of the proteasome inhibitor MG132 and the neddylation inhibitor MLN4924, with KRASG12D as the target and VHL as the E3 ligase. The results are shown in Figure 2H of the revised manuscript.

      Minor concern:

      The graphs in Figure 1E are missing.

      We thank the reviewer for pointing this out. We corrected the figure in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      •  Optionally, the authors could add control experiments with Aurora B and Crb vectors (there shouldn't be any degradation) and experiments confirming that the degradation occurs via the proteasome. For example, the addition of proteasome inhibitors (such as bortezomib) should decrease the efficiency of the target degradation and confirm that targets are degraded via the proteasome system.

      Regarding Aurora-B degradation, as far as we know, there are no specific Aurora-B PROTACs reported. Thus, there is no definitive evidence that CRBN could not degrade Aurora-B. Nevertheless, we performed assays with Aurora-B and VHL, CRBN, or FRB, and observed more effective degradation of Aurora-B by VHL than CRBN. This data is now included in Figure 2 - figure supplement 1B of the revised manuscript.

      • It would also be helpful to provide a possible explanation for why the ratio 1:1 of vectors did not induce the degradation (regarding Figure 1D).

      We believe the lack of degradation with a 1:1 vector ratio is due to the differential expression levels of endogenous FKBP12 and mTOR in HEK293 cells. According to the Human Protein Atlas, the normalized protein-coding transcripts per million (nTPM) for FKBP12 and mTOR in HEK293 cells are 160 and 24, respectively, indicating that FKBP12 is expressed at levels approximately 6.7 times higher than mTOR. This disparity likely limits the heterodimerization of exclusively fusion proteins upon rapamycin addition. To increase the likelihood of FKBP12 and FRB fusion protein dimerization, we used a higher ratio of the FRB component during transfection, considering the higher endogenous expression of FKBP12.
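
      The fold difference quoted above follows directly from the two nTPM values. A minimal check, using only the numbers stated in the response (the variable names are illustrative):

```python
# nTPM values quoted from the Human Protein Atlas for HEK293 cells.
fkbp12_ntpm = 160
mtor_ntpm = 24

# Fold excess of endogenous FKBP12 transcript over mTOR transcript.
fold_excess = fkbp12_ntpm / mtor_ntpm
print(f"FKBP12 / mTOR = {fold_excess:.1f}")
```

      Note that this compares transcript abundance only; protein-level ratios may differ.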

      • It would be helpful to add more explanation for the data in Figure 1F, including whether there is a difference between vectors with different positions of VHL and FRB and why the FRB-VHL vector is less expressed without rapamycin.

      We thank the reviewer for the comment. Regarding the vector orientations of VHL/FRB and WDR5/Luc/FKBP12, we have consistently observed different migration behaviors for WDR5 and VHL constructs, despite their identical molecular weights. This observation aligns with literature reports in which differential running behavior is noted when FRB or FKBP12 (or their mutants) are tagged to the N- or C-terminus of a protein (Bondeson, D.P., Mullin-Bernstein, Z., Oliver, S. et al. Nat Commun 13, 5495 (2022); Mabe, S., Nagamune, T. & Kawahara, M. Sci Rep 4, 6127 (2014)). We have now included the following explanation in the figure legend of Figure 1F of the revised manuscript: “WDR5 and VHL fusion proteins tagged at the N- and C-terminal show different migration behaviors despite having same molecular weight.”

      Additionally, the stabilizing effect of rapamycin on FRB (or its mutants), FRB fusion proteins, and FRB-containing proteins has been documented (Stankunas, K., Bayle, J.H., Havranek, J.J. et al. ChemBioChem, 8(10), 1162-1169 (2007); Stankunas, K., Bayle, J.H., Gestwicki J.E. et al. Mol Cell, 12(6), 1615–1624 (2003); Zhang, C., Cui, M., Cui, Y. et al. J. Vis. Exp. (150), e59656 (2019)). We believe that the degree of stabilization by rapamycin could differ between N- and C-terminal FRB fusion proteins.

      • Finally, the mistake in Figure 2G (where the lanes are wrongly labelled, BRBN-FRB and FRB) should be corrected. Also please correct the graph in Figure 1E (there seems to be a problem with bars for 1:100). There are some typos, such as in lines 38, 277, and 288.

      Thank you for bringing this to our attention. We have corrected all the mentioned errors.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, the authors examine the activity and function of D1 and D2 MSNs in dorsomedial striatum (DMS) during an interval timing task. In this task, animals must first nose poke into a cued port on the left or right; if not rewarded after 6 seconds, they must switch to the other port. Thus, this task requires animals to estimate if at least 6 seconds have passed after the first nose poke. After verifying that animals estimate the passage of 6 seconds, the authors examine striatal activity during this interval. They report that D1-MSNs tend to decrease activity, while D2-MSNs increase activity, throughout this interval. They suggest that this activity follows a drift-diffusion model, in which activity increases (or decreases) to a threshold after which a decision is made. The authors next report that optogenetically inhibiting D1 or D2 MSNs, or pharmacologically blocking D1 and D2 receptors, increased the average wait time. This suggests that both D1 and D2 neurons contribute to the estimate of time, with a decrease in their activity corresponding to a decrease in the rate of 'drift' in their drift-diffusion model. Lastly, the authors examine MSN activity while pharmacologically inhibiting D1 or D2 receptors. The authors observe that most recorded MSNs decrease their activity over the interval, with the rate decreasing with D1/D2 receptor inhibition.

      We appreciate the careful read by this reviewer. 

      Major strengths: 

      The study employs a wide range of techniques - including animal behavioral training, electrophysiology, optogenetic manipulation, pharmacological manipulations, and computational modeling. The question posed by the authors - how striatal activity contributes to interval timing - is of importance to the field and has been the focus of many studies and labs. This paper contributes to that line of work by investigating whether D1 and D2 neurons have similar activity patterns during the timed interval, as might be expected based on prior work based on striatal manipulations. However, the authors find that D1 and D2 neurons have distinct activity patterns. They then provide a decision-making model that is consistent with all results. The data within the paper is presented very clearly, and the authors have done a nice job presenting the data in a transparent manner (e.g., showing individual cells and animals). Overall, the manuscript is relatively easy to read and clear, with sufficient detail given in most places regarding the experimental paradigm or analyses used. 

      We are glad that our main points come clearly through.

      Major weaknesses: 

      One weakness to me is the impact of identifying whether D1 and D2 had similar or different activity patterns. Does observing increasing/decreasing activity in D2 versus D1, or different activity patterns in D1 and D2, support one model of interval timing over another, or does it further support a more specific idea of how DMS contributes to interval timing? 

      This is a great point - we were not clear. We observed distinct patterns of D2-MSN and D1-MSN activity, but disrupting either D2-MSNs or D1-MSNs led to increased response time. The model that this supports is that D2-MSN and D1-MSN ensemble activity represents temporal evidence. This is a very specific model that can be rigorously tested in future work. We have now made this very clear in the abstract (Page 2).

      “We found that D2-MSNs and D1-MSNs exhibited distinct dynamics over temporal intervals as quantified by principal component analyses and trial-by-trial generalized linear models. MSN recordings helped construct and constrain a four-parameter drift-diffusion computational model in which MSN ensemble activity represented the accumulation of temporal evidence. This model predicted that disrupting either D2-MSNs or D1-MSNs would increase interval timing response times and alter MSN firing. In line with this prediction, we found that optogenetic inhibition or pharmacological disruption of either D2-MSNs or D1-MSNs increased interval timing response times.”

      And in the results on Page 18:  

      “Because both D2-MSNs and D1-MSNs accumulate temporal evidence, disrupting either MSN type in the model changed the slope. The results were obtained by simultaneously decreasing the drift rate D (equivalent to lengthening the neurons’ integration time constant) and lowering the level of network noise σ: D = 0.129, σ = 0.043 for D2-MSNs in Fig 4A (in red; changes in noise had to accompany changes in drift rate to preserve switch response time variance. See Methods); and D = 0.122, σ = 0.043 for D1-MSNs in Fig 4B (in blue). The model predicted that disrupting either D2-MSNs or D1-MSNs would increase switch response times (Fig 4C and Fig 4D) and would shift MSN dynamics.”
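
      The qualitative prediction in this passage, that a lower drift rate lengthens response times, can be illustrated with a generic accumulator. The sketch below is not the four-parameter model from the manuscript: it assumes a simple constant-drift form, dx = D dt + σ dW, with a hypothetical threshold of 1.0 and hypothetical control values (D = 0.150, σ = 0.050). Only the disrupted values (D = 0.129, σ = 0.043) are taken from the quoted text.

```python
import random
import statistics

def switch_time(drift, noise, threshold=1.0, dt=0.01, rng=random, max_t=60.0):
    """Return the first time (s) at which the accumulator x reaches `threshold`."""
    x, t = 0.0, 0.0
    sqrt_dt = dt ** 0.5
    while x < threshold and t < max_t:
        # Euler-Maruyama step: deterministic drift plus scaled Gaussian noise.
        x += drift * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
    return t

def mean_switch_time(drift, noise, n_trials=200, seed=0):
    """Average threshold-crossing time over simulated trials."""
    rng = random.Random(seed)
    return statistics.mean(
        switch_time(drift, noise, rng=rng) for _ in range(n_trials)
    )

control = mean_switch_time(drift=0.150, noise=0.050)    # hypothetical control values
disrupted = mean_switch_time(drift=0.129, noise=0.043)  # values quoted for D2-MSNs
print(f"control: {control:.2f} s, disrupted: {disrupted:.2f} s")
```

      For this constant-drift form the mean crossing time is approximately threshold / D, so any reduction in D delays the simulated switch, in the same direction as the increase in switch response times described above.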

      And in the discussion (Page 30): 

      “Striatal MSNs are critical for temporal control of action (Emmons et al., 2017; Gouvea et al., 2015; Mello et al., 2015). Three broad models have been proposed for how striatal MSN ensembles represent time: 1) the striatal beat frequency model, in which MSNs encode temporal information based on neuronal synchrony (Matell and Meck, 2004); 2) the distributed coding model, in which time is represented by the state of the network (Paton and Buonomano, 2018); and 3) the DDM, in which neuronal activity monotonically drifts toward a threshold after which responses are initiated (Emmons et al., 2017; Simen et al., 2011; Wang et al., 2018). While our data do not formally resolve these possibilities, our results show that D2-MSNs and D1-MSNs exhibit opposing changes in firing rate dynamics in PC1 over the interval. Past work by our group and others has demonstrated that PC1 dynamics can scale over multiple intervals to represent time (Emmons et al., 2020, 2017; Gouvea et al., 2015; Mello et al., 2015; Wang et al., 2018). We find that low-parameter DDMs account for interval timing behavior with both intact and disrupted striatal D2- and D1-MSNs. While other models can capture interval timing behavior and account for MSN neuronal activity, our model does so parsimoniously with relatively few parameters (Matell and Meck, 2004; Paton and Buonomano, 2018; Simen et al., 2011). We and others have shown previously that ramping activity scales to multiple intervals, and DDMs can be readily adapted by changing the drift rate (Emmons et al., 2017; Gouvea et al., 2015; Mello et al., 2015; Simen et al., 2011). Interestingly, decoding performance was high early in the interval; indeed, animals may have been focused on this initial interval (Balci and Gallistel, 2006) in making temporal comparisons and deciding whether to switch response nosepokes.”

      Regarding the reviewer’s specific question – it is not clear why D1-MSNs and D2-MSNs have opposing patterns of activity, as integration of temporal evidence can certainly be achieved by increasing or decreasing firing rates alone. These patterns have been seen in motor control. Prefrontal neurons, which control striatal ramping, also ramp up and down. We have now included a paragraph on Page 30 explicitly discussing these ideas; however, future experiments will be required to investigate the source of the divergent patterns of activity among D2-MSNs and D1-MSNs.

      “D2-MSNs and D1-MSNs play complementary roles in movement. For instance, stimulating D1-MSNs facilitates movement, whereas stimulating D2-MSNs impairs movement (Kravitz et al., 2010). Both populations have been shown to have complementary patterns of activity during movements with MSNs firing at different phases of action initiation and selection (Tecuapetla et al., 2016). Further dissection of action selection programs reveals that opposing patterns of activation among D2-MSNs and D1-MSNs suppress and guide actions, respectively, in the dorsolateral striatum (Cruz et al., 2022). A particular advantage of interval timing is that it captures a cognitive behavior within a single dimension — time. When projected along the temporal dimension, it was surprising that D2-MSNs and D1-MSNs had opposing patterns of activity. Ramping activity in the prefrontal cortex can increase or decrease; and prefrontal neurons project to and control striatal ramping activity (Emmons et al., 2020, 2017; Wang et al., 2018). It is possible that differences in D2-MSNs and D1-MSNs reflect differences in cortical ramping, which may themselves reflect more complex integrative or accumulatory processes. Further experiments are required to investigate these differences. Past pharmacological work from our group and others has shown that disrupting D2- or D1-MSNs slows timing (De Corte et al., 2019b; Drew et al., 2007, 2003; Stutt et al., 2024) and are in agreement with pharmacological and optogenetic results in this manuscript. Computational modeling predicted that disrupting either D2-MSNs or D1-MSNs increased self-reported estimates of time, which was supported by both optogenetic and pharmacological experiments.”

      I found the results presented in Figures 2 and 3 to be a little confusing or misleading. In Figure 2, the authors appear to claim that D1 neurons decrease their activity over the time interval while D2 neurons increase activity. The authors use this result to suggest that D1/D2 activity patterns are different. In Figure 3, a different analysis is done, and this time D2 neurons do not significantly increase their activity with time, conflicting with Figure 2. While in both figures, there is a significant difference between the mean slopes across the population, the secondary effect of positive/negative slope for D2/D1 neurons changes. I find this especially confusing as the authors refer back to the positive/negative slope for D2/D1 neurons result throughout the rest of the text.  

      We were not clear.  First, we attempted to quantify these differences based on PCA and slope.  We have rephrased our characterization of these differences by changing text on (Page 9) to: 

      “These PETHs revealed that for the 6-second interval immediately after trial start, many putative D2-MSN neurons appeared to ramp up while many putative D1-MSNs appeared to ramp down. For 32 putative D2-MSNs average PETH activity increased over the 6-second interval immediately after trial start, whereas for 41 putative D1-MSNs, average PETH activity decreased. Accordingly, D2-MSNs and D1-MSNs had differences in activity early in the interval (0-5 seconds; F = 4.5, p = 0.04 accounting for variance between mice) but not late in the interval (5-6 seconds; F = 1.9, p = 0.17 accounting for variance between mice). Examination of a longer interval of 10 seconds before to 18 seconds after trial start revealed the greatest separation in D2-MSN and D1-MSN dynamics during the 6-second interval after trial start (Fig S2). Strikingly, these data suggest that D2-MSNs and D1-MSNs might display distinct dynamics during interval timing.” 

      We have rephrased our discussion on PCA to quantify differences in Fig 2G-H using data-driven methods (Page 12): 

      “To quantify differences between D2-MSNs vs D1-MSNs in Fig 2G-H, we turned to principal component analysis (PCA), a data-driven tool to capture the diversity of neuronal activity (Kim et al., 2017a). Work by our group and others has uniformly identified PC1 as a linear component among corticostriatal neuronal ensembles during interval timing (Bruce et al., 2021; Emmons et al., 2020, 2019, 2017; Kim et al., 2017a; Narayanan et al., 2013; Narayanan and Laubach, 2009; Parker et al., 2014; Wang et al., 2018). We analyzed PCA calculated from all D2-MSN and D1-MSN PETHs over the 6-second interval immediately after trial start. PCA identified time-dependent ramping activity as PC1 (Fig 3A), a key temporal signal that explained 54% of variance among tagged MSNs (Fig 3B; variance for PC1 p = 0.009 vs 46 (44-49)% for any pattern of PC1 variance derived from random data; Narayanan, 2016). Consistent with population averages from Fig 2G&H, D2-MSNs and D1-MSNs had opposite patterns of activity with negative PC1 scores for D2-MSNs and positive PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-2.8 – 4.9); F = 8.8, p = 0.004 accounting for variance between mice (Fig S3A); Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F = 0.44, p = 0.51) or switching direction (F = 1.73, p = 0.19)).”

      And finally, we directly investigate the heart of the reviewer’s question by explicitly comparing PC1 scores, a data-driven summary of the neuronal pattern that explains the most variance, and show that they are less than 0 for D2-MSNs (i.e., negatively correlated with a down-ramping pattern, or ramping up), and greater than 0 for D1-MSNs (i.e., positively correlated with a down-ramping pattern, or ramping down):

      “Importantly, PC1 scores for D2-MSNs were significantly less than 0 (signrank D2-MSN PC1 scores vs 0: p = 0.02), implying that because PC1 ramps down, D2-MSNs tended to ramp up. Conversely, PC1 scores for D1-MSNs were significantly greater than 0 (signrank D1-MSN PC1 scores vs 0: p = 0.05), implying that D1-MSNs tended to ramp down. Thus, analysis of PC1 in Fig 3A-C suggested that D2-MSNs (Fig 2G) and D1-MSNs (Fig 2H) had opposing ramping dynamics.”
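
The sign-rank comparison against 0 quoted above can be illustrated with a minimal sketch of an exact two-sided Wilcoxon signed-rank test. This is not the authors' code: the manuscript reports MATLAB's signrank, which may use an approximation for larger samples and handles zeros and ties differently. The function name `signrank_exact` and the toy scores are assumptions made for illustration, and the sketch assumes no zeros and no tied absolute values.

```python
def signrank_exact(values):
    """Exact two-sided Wilcoxon signed-rank test against a median of 0.

    Assumes no zeros and no tied absolute values (illustrative sketch only).
    Returns (sum of positive ranks, two-sided p-value).
    """
    ordered = sorted(values, key=abs)          # rank by absolute value
    n = len(ordered)
    w_plus = sum(rank for rank, v in enumerate(ordered, start=1) if v > 0)
    max_sum = n * (n + 1) // 2

    # counts[s] = number of sign assignments whose positive ranks sum to s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]

    w = min(w_plus, max_sum - w_plus)          # smaller of the two rank sums
    p = 2 * sum(counts[: w + 1]) / 2 ** n
    return w_plus, min(1.0, p)


# Hypothetical scores, not data from the manuscript.
all_positive = [1.0, 2.0, 3.0, 4.0, 5.0]
mixed_signs = [-1.0, 2.0, -3.0, 4.0, -5.0, 6.0]

w_all, p_all = signrank_exact(all_positive)
w_mixed, p_mixed = signrank_exact(mixed_signs)
```

With five scores that are all positive, the smallest two-sided p-value this test can produce is 2/32, which shows why a test on a handful of neurons cannot reach p < 0.05 however consistent the signs are.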

      We interpret these data on Page 16: 

      “Our analysis of average activity (Fig 2G-H) and PC1 (Fig 3A-C) suggested that D2-MSNs and D1-MSNs might have opposing dynamics. However, past computational models of interval timing have relied on drift-diffusion dynamics that increases over the interval and accumulates evidence over time (Nguyen et al., 2020; Simen et al., 2011).”

      The reviewer mentions our analysis of ‘mean slopes across the population’, which we clarify is a trial-by-trial slope analysis that is distinct from the population averages in Fig 2G-H and Fig 3A-C.  We have now made this clear (Page 12).

      “To interrogate these dynamics at a trial-by-trial level, we calculated the linear slope of D2-MSN and D1-MSN activity over the first 6 seconds of each trial using generalized linear modeling (GLM) of effects of time in the interval vs trial-by-trial firing rate (Latimer et al., 2015).  Note that this analysis focuses on each trial rather than population averages in Fig 2G-H and Fig 3A-C.”
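
As a rough illustration of what a trial-by-trial slope means, the sketch below fits an ordinary least-squares line to binned firing rate versus time within each trial and then summarizes the slopes across trials. This is a simplification and not the authors' analysis, which uses generalized linear modeling in MATLAB; the function names, the 1-second bins, and the toy firing rates are all assumptions.

```python
from statistics import median


def ols_slope(times, rates):
    """Least-squares slope of firing rate against time (spikes/s per second)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(rates) / n
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, rates))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


def trial_slopes(trials, bin_centers):
    """One slope per trial, rather than one slope for the trial-averaged PETH."""
    return [ols_slope(bin_centers, rates) for rates in trials]


# Hypothetical neuron: firing rate in 1-second bins over 0-6 s, three trials.
bin_centers = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
trials = [
    [10.0, 9.0, 8.0, 7.0, 6.0, 5.0],   # ramps down by 1 spike/s each second
    [4.0, 4.0, 4.0, 4.0, 4.0, 4.0],    # flat
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],  # ramps up by 2 spikes/s each second
]

slopes = trial_slopes(trials, bin_centers)
median_slope = median(slopes)
```

Because each trial contributes its own slope, a neuron whose trial-averaged activity ramps clearly can still have a median trial slope near zero, which is one way the trial-level and population-average analyses can differ.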

      Finally, as the reviewer suggests, we have removed the term ‘slope’ from the rest of the paper, as the increasing/decreasing comes from averages and analyses of PC1.  We have removed all discussion of ‘opposing’ slope or ‘increasing/decreasing’ slope. 

      It is a bit unclear to me how the authors chose the parameters for the model, and how well the model explains behavior is quantified. It seems that the authors didn't perform cross-validation across trials (i.e., they chose parameters that explained behavior across all trials combined, rather than choosing parameters from a subset of trials and determining whether those parameters are robust enough to explain behavior on held-out trials). I think this would increase the robustness of the result. 

      In addition, it remains a bit unclear to me how the authors changed the specific parameters they did to model the optogenetic manipulation. It seems these parameters were chosen because they fit the manipulation data. This makes me wonder if this model is flexible enough that there is almost always a set of parameters that would explain any experimental result; in other words, I'm not sure this model has high explanatory power. 

      We are glad the reviewer raised these points.  First, we have now included a complete exploration of the parameter space, exactly as the reviewer recommends.  These are described in the methods (Page 41): 

      “Selection of DDMs parameters. Our goal was to build DDMs with dynamics that produce “response times” according to the observed distribution of mice switch times. The selection of parameter values in Fig 4 was done in three steps. First, we fit the distribution of the mice behavioral data with a Gamma distribution and found its fitting values for shape $\alpha_M$ and rate $\beta_M$ (Table S2 and Fig S8; $R^2$ Data vs Gamma $\geq 0.94$). We recognized that the mean $\mu_M$ and the coefficient of variation $CV_M$ are directly related to the shape and rate of the Gamma distribution by formulas $\mu_M = \alpha_M/\beta_M$ and $CV_M = 1/\sqrt{\alpha_M}$. Next, we fixed parameters $F$ and $b$ in DDM (e.g., for D2-MSNs: $F = 1$, $b = 0.52$) and simulated the DDM for a range of values for $D$ and $\sigma$. For each pair $(D, \sigma)$, one computational “experiment” generated 500 response times with mean $\mu$ and coefficient of variation $CV$. We repeated the “experiment” 10 times and took the group median of $\mu$ and $CV$ to obtain the simulation-based statistical measures $\mu_S$ and $CV_S$. Last, we plotted $E_\mu = |(\mu_S - \mu_M)/\mu_M|$ and $E_{cv} = |CV_S - CV_M|$, the respective relative error and the absolute error to data (Fig S7). We considered that parameter values $(D, \sigma)$ provided a good DDM fit of mice behavioral data whenever $E_\mu \leq 0.05$ and $E_{cv}$

      And included a new Fig S7 which shows the parameter space: 

      These new data clearly comment on the parameter space of our model. 
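
The Gamma relations and the two error measures in the parameter-selection procedure quoted above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the DDM equations are not given in this response, so the simulator is replaced by a placeholder that draws response times directly from a Gamma distribution, and the shape and rate values are hypothetical rather than taken from Table S2.

```python
import math
import random
from statistics import mean, median, stdev


def gamma_mean_cv(shape, rate):
    """Mean and coefficient of variation of a Gamma(shape, rate) distribution."""
    return shape / rate, 1.0 / math.sqrt(shape)


def simulated_mean_cv(sample_rt, n_times=500, n_experiments=10):
    """Median across experiments of the per-experiment mean and CV."""
    mus, cvs = [], []
    for _ in range(n_experiments):
        rts = [sample_rt() for _ in range(n_times)]
        mu = mean(rts)
        mus.append(mu)
        cvs.append(stdev(rts) / mu)
    return median(mus), median(cvs)


def fit_errors(mu_s, cv_s, mu_m, cv_m):
    """Relative error of the mean and absolute error of the CV."""
    return abs((mu_s - mu_m) / mu_m), abs(cv_s - cv_m)


# Hypothetical Gamma fit to behavior (NOT values from Table S2).
shape_m, rate_m = 16.0, 2.0
mu_m, cv_m = gamma_mean_cv(shape_m, rate_m)

rng = random.Random(0)


def placeholder_sampler():
    # Stand-in for one DDM response time; a real use would simulate the DDM
    # for a given (D, sigma). random.gammavariate takes (shape, scale).
    return rng.gammavariate(shape_m, 1.0 / rate_m)


mu_s, cv_s = simulated_mean_cv(placeholder_sampler)
e_mu, e_cv = fit_errors(mu_s, cv_s, mu_m, cv_m)
```

In an actual parameter sweep, the placeholder would be replaced by a DDM simulation and the two errors would be evaluated on a grid of candidate parameter pairs, which is what a parameter-space figure of this kind displays.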

      Finally, the reviewer mentions cross-validation.  We did this at length on our model and data fits.  We used 10-fold cross-validation as fitlm needs enough data for the individual fits.  We found that the fit was extremely stable, i.e., we ended up with standard deviations in R2 < 0.004 for all comparisons.  Thus, we added the following point to the methods on Page 41:

      “10-fold cross-validation revealed highly stable fits between gamma, models and data.”
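
One common way to run a stability check of this kind is sketched below: split the data into 10 folds, fit a line on nine of them, and score R2 on the held-out fold. This is a sketch under assumptions and not the authors' MATLAB code; whether they scored held-out or training data is not stated here, and the helper names and synthetic data are hypothetical.

```python
import random
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of a fixed line on the given points."""
    my = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def kfold_r2(xs, ys, k=10, seed=0):
    """Held-out R2 for each of k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        scores.append(r_squared([xs[i] for i in held_out],
                                [ys[i] for i in held_out],
                                slope, intercept))
    return scores


# Synthetic data: a noisy line, used only to exercise the functions.
rng = random.Random(1)
xs = [i / 10 for i in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

scores = kfold_r2(xs, ys, k=10)
r2_spread = stdev(scores)
```

A small standard deviation of R2 across folds indicates that no single subset of the data drives the fit.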

      Lastly, the results are based on a relatively small dataset (tens of cells). 

      This is an important point.  Although it is a small optogenetically-tagged dataset, we have adequate statistical power and large effect sizes, which we now detail in the text on Page 12:

      “Consistent with population averages from Fig 2G&H, D2-MSNs and D1-MSNs had opposite patterns of activity with negative PC1 scores for D2-MSNs and positive PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-2.8 – 4.9); F = 8.8, p = 0.004 accounting for variance between mice (Fig S3A); Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F = 0.44, p = 0.51) or switching direction (F = 1.73, p = 0.19)).”

      And:  

      “GLM analysis also demonstrated that D2-MSNs had significantly different slopes (0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.47 – 0.06); Fig 3D; F = 8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98; no reliable effect of sex (F = 0.02, p = 0.88) or switching direction (F = 1.72, p = 0.19)).”

      And we have included the reviewer’s point as a limitation on Page 33:

      “Second, although we had adequate statistical power and medium-to-large effect sizes, optogenetic tagging is low-yield, and it is possible that recording more of these neurons would afford greater opportunity to identify more robust results and alternative coding schemes, such as neuronal synchrony.”

      Impact: 

      The task and data presented by the authors are very intriguing, and there are many groups interested in how striatal activity contributes to the neural perception of time. The authors perform a wide variety of experiments and analysis to examine how DMS activity influences time perception during an interval-timing task, allowing for insight into this process. However, the significance of the key finding -- that D1 and D2 activity is distinct across time -- remains somewhat ambiguous to me. 

      Again, we are glad that the reviewer appreciated our main point, and we very much appreciate the additional points about interpretation, model parameters, and statistical power. If there is any way we can clarify the text further we are happy to do so.  

      Reviewer #2 (Public Review):  

      (1) Regarding the results in Figure 2 and Figure 5: in the heatmaps in Fig. 2F and Fig. 2E, the overall activity patterns of D1 and D2 MSNs look very similar; both D1 and D2 MSNs contain neurons showing decreasing or increasing activity during interval timing. In addition, optogenetic and pharmacological inhibition of either D1 or D2 MSNs resulted in similar behavioral outcomes. To me, the D1 and D2 MSN activities were more complementary than opposing.

      This is a great point. In our last revision, R3 suggested that complementary means opposing – and suggested we change the title to reflect this.  Our original title was ‘Complementary cognitive roles for D2-MSNs and D1-MSNs during interval timing’ – and we have changed the title back to this. We have clarified what we meant by complementary in the abstract (Page 2):

      “Together, our findings demonstrate that D2-MSNs and D1-MSNs had opposing dynamics yet played complementary cognitive roles, implying that striatal direct and indirect pathways work together to shape temporal control of action.”

      And on Page 30: 

      “These data, when combined with our model predictions, demonstrate that despite opposing dynamics, D2-MSNs and D1-MSNs contribute complementary temporal evidence to controlling actions in time.”

      If the authors want to emphasize the opposing side of D1 and D2 MSNs, then the manipulation experiments need to be re-designed, since the average activity of D2 MSNs increased, while D1 MSNs decreased during interval timing, instead of using inhibitory manipulations in both pathways, the authors should use inhibitory manipulation in D2-MSNs, while using optogenetic or pharmacology to activate D1-MSNs. In this way, the authors can demonstrate the opposing role of D1 and D2 MSNs and the functions of increased activity in D2-MSNs and decreased activity in D1-MSNs. 

      These are great ideas, which we agree with.  We would like to emphasize the complementary nature as noted in our original title, and not the opposing side of D1/D2 MSNs. The experiments proposed by the reviewer are certainly worth doing, but finding stimulation parameters that affect timing without affecting movement would likely be quite complex, and we have now included them as an important limitation / future direction (Page 33):

      “Fifth, we did not deliver stimulation to the striatum because our pilot experiments triggered movement artifacts or task-specific dyskinesias (Kravitz et al., 2010). Future stimulation approaches carefully titrated to striatal physiology may affect interval timing without affecting movement.”

      (2) Regarding the results in Figure 3 C and D, Figure 6 H and Figure 7 D, what is the sample size? From the single data points in the figures, it seems that the authors used the number of cells to do statistical tests and plot the figures. For example, in Figure 3 C, if the authors use n = 32 D2 MSNs and n = 41 D1 MSNs to do the statistical test, even a small difference could be statistically significant. The authors should use the number of mice to do the statistical tests.

      These are important points that were discussed at length in the prior review.  First, for the sample size, we now have detailed in our Table 1: 

      Second, we have detailed our statistical approach which explicitly deals with repeated observations of neurons across mice (Page 43):

      “Statistics. All data and statistical approaches were reviewed by the Biostatistics, Epidemiology, and Research Design Core (BERD) at the Institute for Clinical and Translational Sciences (ICTS) at the University of Iowa. All code and data are made available at http://narayanan.lab.uiowa.edu/article/datasets. We used the median to measure central tendency and the interquartile range to measure spread. We used Wilcoxon nonparametric tests to compare behavior between experimental conditions and Cohen’s d to calculate effect size. Analyses of putative single-unit activity and basic physiological properties were carried out using custom routines for MATLAB. For all neuronal analyses, variability between animals was accounted for using generalized linear mixed-effects models and incorporating a random effect for each mouse into the model, which allows us to account for inherent between-mouse variability. We used fitglme in MATLAB and verified main effects using lmer in R. We accounted for variability between MSNs in pharmacological datasets in which we could match MSNs between saline, D2 blockade, and D1 blockade. P values < 0.05 were interpreted as significant.”

      We have formally reviewed this approach with professional biostatisticians at the University of Iowa.
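
The descriptive statistics named in the methods quoted above (median, interquartile range, and Cohen's d) can be sketched with the Python standard library. This is an illustration only: the authors used MATLAB and R, quartile conventions differ between software packages, the pooled-standard-deviation form of Cohen's d is an assumption, and the two toy samples are hypothetical. The mixed-effects modeling itself is not reproduced here.

```python
import math
from statistics import mean, median, quantiles, variance


def median_iqr(values):
    """Median and (Q1, Q3) using the inclusive quartile convention."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Hypothetical samples, not data from the manuscript.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

med_a, (q1_a, q3_a) = median_iqr(group_a)
d_ab = cohens_d(group_a, group_b)
```

Note that an effect size computed across neurons, as here, does not by itself account for neurons being nested within mice, which is the role of the random effect for each mouse in the mixed-effects models.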

      Finally, we note that we have adequate statistical power and large effect sizes for the analyses in Fig 3C and D, which we now detail in the text on Page 12:

      “Consistent with population averages from Fig 2G&H, D2-MSNs and D1-MSNs had opposite patterns of activity with negative PC1 scores for D2-MSNs and positive PC1 scores for D1-MSNs (Fig 3C; PC1 for D2-MSNs: -3.4 (-4.6 – 2.5); PC1 for D1-MSNs: 2.8 (-2.8 – 4.9); F = 8.8, p = 0.004 accounting for variance between mice (Fig S3A); Cohen’s d = 0.7; power = 0.80; no reliable effect of sex (F = 0.44, p = 0.51) or switching direction (F = 1.73, p = 0.19)).”

      And, on Page 12:  

      “GLM analysis also demonstrated that D2-MSNs had significantly different slopes (0.01 spikes/second (-0.10 – 0.10)), which were distinct from D1-MSNs (-0.20 (-0.47 – 0.06); Fig 3D; F = 8.9, p = 0.004 accounting for variance between mice (Fig S3B); Cohen’s d = 0.8; power = 0.98; no reliable effect of sex (F = 0.02, p = 0.88) or switching direction (F = 1.72, p = 0.19)).”

      And we have included the reviewer’s point as a limitation on Page 33:

      “Second, although we had adequate statistical power and medium-to-large effect sizes, optogenetic tagging is low-yield, and it is possible that recording more of these neurons would afford greater opportunity to identify more robust results and alternative coding schemes, such as neuronal synchrony.”

      (3) Regarding the results in Figure 5, what is the reason for the increase in the response times? The authors should plot the position track during intervals (0-6 s) with or without optogenetic or pharmacologic inhibition. The authors can check Figures 3, 5, and 6 in the paper https://doi.org/10.1016/j.cell.2016.06.032 for reference to analyze the data.

      These are key points, and we are glad the reviewer raised them.  Our interpretation is that response time increases without reliable changes in other task-specific movements such as nosepoke reaction time or traversal time (Fig S9).  This was lacking in our prior manuscript.  We have now added this to Page 30:

      “Our interpretation is that because the activity of D2-MSN and D1-MSN ensembles represents the accumulation of evidence, pharmacological/optogenetic disruption of D2-MSN/D1-MSN activity slows this accumulation process, leading to slower interval timing-response times (Fig 5) without changing other task-specific movements (Fig S9).  These results provide new insight into how opposing patterns of striatal MSN activity control behavior in similar ways and show that they play a complementary role in elementary cognitive operations.”

      Regarding the tracking of velocity, we unfortunately do not have this information reliably across all conditions. This citation is a beautiful landmark paper, and we are working on collecting this information in our new datasets going forward.  We have included this as a major limitation (Page 34): 

      “Still, future work combining motion tracking/accelerometry with neuronal ensemble recording and optogenetics and including bisection tasks may further unravel timing vs. movement in MSN dynamics (Robbe, 2023; Tecuapetla et al., 2016).”

      Once again, we are appreciative of the thoughtful points raised by this reviewer.  

      Reviewer #3 (Public Review): 

      Summary: 

      The cognitive striatum, also known as the dorsomedial striatum, receives input from brain regions involved in high-level cognition and plays a crucial role in processing cognitive information. However, despite its importance, the extent to which different projection pathways of the striatum contribute to this information processing remains unclear. In this paper, Bruce et al. conducted a study using various causal and correlational techniques to investigate how these pathways collectively contribute to interval timing in mice. Their results were consistent with previous research, showing that the direct and indirect striatal pathways perform opposing roles in processing elapsed time. Based on their findings, the authors proposed a revised computational model in which two separate accumulators track evidence for elapsed time in opposing directions. These results have significant implications for understanding the neural mechanisms underlying cognitive impairment in neurological and psychiatric disorders, as disruptions in the balance between direct and indirect pathway activity are commonly observed in such conditions. 

      Strengths: 

      The authors employed a well-established approach to study interval timing and employed optogenetic tagging to observe the behavior of specific cell types in the striatum. Additionally, the authors utilized two complementary techniques to assess the impact of manipulating the activity of these pathways on behavior. Finally, the authors utilized their experimental findings to enhance the theoretical comprehension of interval timing using a computational model. 

      We very much appreciate the considered read and comments by the reviewer, and recognition of the breadth of techniques in this manuscript. 

      Weaknesses: 

      The behavioral task used in this study is best suited for investigating elapsed time perception, rather than interval timing. Timing bisection tasks are often employed to study interval timing in humans and animals. In the optogenetic experiment, the laser was kept on for too long (18 seconds) at high power (12 mW). This has been shown to cause adverse effects on population activity (for example, through heating the tissue) that are not necessarily related to their function during the task epochs. Given the systemic delivery of pharmacological interventions, it is difficult to conclude that the effects are specific to the dorsomedial striatum. Future studies should use the local infusion of drugs into the dorsomedial striatum. 

      These are important points.  We agree with them completely and have now included responses to them.  First, bisection tasks certainly have advantages – we have justified our approach in the discussion (Page 32):

      “Our task version has been used extensively to study interval timing in mice and humans (Balci et al., 2008; Bruce et al., 2021; Stutt et al., 2024; Tosun et al., 2016; Weber et al., 2023). However, temporal bisection tasks, in which animals hold during a temporal cue and respond at different locations depending on cue length, have advantages in studying how animals time an interval because animals are not moving while estimating cue duration (Paton and Buonomano, 2018; Robbe, 2023; Soares et al., 2016). Our interval timing task version – in which mice switch between two response nosepokes to indicate their interval estimate has elapsed – has been used extensively in rodent models of neurodegenerative disease (Larson et al., 2022; Weber et al., 2024, 2023; Zhang et al., 2021), as well as in humans (Stutt et al., 2024). This version of interval timing involves motor timing, which engages executive function and has more translational relevance for human diseases than perceptual timing or bisection tasks (Brown, 2006; Farajzadeh and Sanayei, 2024; Nombela et al., 2016; Singh et al., 2021).  Furthermore, because many therapeutics targeting dopamine receptors are used clinically, these findings help describe how dopaminergic drugs might affect cognitive function and dysfunction. Future studies of D2-MSNs and D1-MSNs in temporal bisection and other timing tasks may further clarify the relative roles of D2- and D1-MSNs in interval timing and time estimation.”

      Second, we have included an explicit control in which mice without opsins receive the same laser illumination over the same epoch as the experimental animals, and we find no effects.  This is now detailed in the methods (Page 37):

      “To control for heating and nonspecific effects of optogenetics, we performed control experiments in mice without opsins using identical laser parameters in D2-cre or D1-cre mice (Fig S6).”

      And in the results (Page 21): 

      “To control for heating and nonspecific effects of optogenetics, we performed control experiments in D2-cre mice without opsins using identical laser parameters; we found no reliable effects for opsin-negative controls (Fig S6).”

      And on Page 21:

      “As with D2-MSNs, we found no reliable effects with opsin-negative controls in D1-MSNs (Fig S6).”

      We have now detailed these results in Figure S6:

      Regarding focal pharmacology, we performed this experiment with focal infusion of D1/D2 antagonists in our prior work, which we have now cited (Page 4):

      “Similar behavioral effects were found with systemic (Stutt et al., 2024) or focal infusion of D2 or D1 antagonists locally within the dorsomedial striatum (De Corte et al., 2019a).”

      Comments on revised version: 

      Thank you for the comprehensive revisions. Most of my (addressable) concerns were addressed. The current version of your manuscript appears significantly improved. 

      Once again, we appreciate the reviewer’s constructive and insightful comments and careful review of our manuscript.  Their comments have been extremely helpful.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      A subset of fibroblast growth factor (FGF) proteins (FGF11-FGF14; often referred to as fibroblast growth factor homologous factors because they are not thought to be secreted and do not seem to act as growth factors) have been implicated in modulating neuronal excitability, however, the exact mechanisms are unclear. In part, this is because it is unclear how different FGF isoforms alter ion channel activity in different neuronal populations. In this study, the authors explore the role of FGF 13 in epilepsy using a variety of FGF13 knock-out mouse models, including several targeted cell-type specific conditional knockout mouse lines. The study is intriguing as it indicates that FGF13 plays an especially important role in inhibitory neurons. Furthermore, although FGF13 has been studied as a regulator of neuronal voltage-gated sodium channels, the authors present data indicating that FGF13 knockout in inhibitory neurons induces seizures not by altering sodium current properties but by reducing voltage-gated potassium currents in inhibitory neurons. While intriguing, the data are incomplete in several aspects and thus the mechanisms by which various FGF13 variants induce Developmental and Epileptic Encephalopathies are not resolved by the data presented. 

      Strengths: 

      A major strength is the array of techniques used to assess the mice and the electrical activity of the neurons. 

      The multiple mouse knock-out models utilized are a strength, clearly demonstrating that FGF13 expression in inhibitory neurons, and possibly specific sub-populations of inhibitory neurons, is critically important. 

      The data on the increased sensitivity to febrile seizures in KO mice are very nice and provide clear evidence for regulation of excitability in inhibitory neurons by FGF13.

      The Gad2Fgf13-KO mice indicated that several Fgf13 splice variants may be expressed in inhibitory neurons and suggest that the Fgf13-VY splice variants may have previously unrecognized specific roles in regulating neuronal excitability. 

      The data on males and females from the various KO mice lines indicates a clear gene dosage effect for this X-linked gene. 

      The unbiased metabolomic analysis supports the assertion that Fgf13 expression in inhibitory neurons is important in regulating seizure susceptibility. 

      Weaknesses: 

      The knockout approach can be powerful but also has distinct limitations. Multiple missense mutations in FGF13-S have been identified. The knockout models employed here are not appropriate for understanding how these missense variants lead to altered neuronal excitability. While the data show that complete loss of Fgf13 from excitatory forebrain neurons is not sufficient to induce seizure susceptibility, it does not rule out that specific variants (e.g., R11C) might alter the excitability of forebrain neurons. The missense variants may alter excitatory and/or inhibitory neuron excitability in distinct ways from a full FGF13 knockout. 

      We agree with this overall interpretation of our data and have updated our language in the Discussion to make the distinction between mechanisms attributable to a knockout compared to a missense variant. We note, however, that the proposed mechanism by which missense variants (e.g., R11C) drive seizures is through loss of long-term inactivation in excitatory neurons and our excitatory knockout model shows loss of long-term inactivation in excitatory neurons. Thus, our knockout model demonstrates that the mechanism(s) by which the missense variants alter neuronal excitability in excitatory neurons must exclude long-term inactivation, thereby providing some clarity regarding the proposed mechanism for those missense variants.

      The electrophysiological experiments are intriguing but not comprehensive enough to support all of the conclusions regarding how FGF13 modulates neuronal excitability. 

      We agree and have updated the language in our Discussion to distinguish speculation from conclusions that are directly supported by data.

      Another concern is the use of different ages of neurons for different experiments. For example, sodium currents in Figures 2 and 5 (and Supplemental Figures 2 and 7) are recorded from cultured neurons, which may have very different properties (including changes in sodium channel complexes) from neurons in vivo that drive the development of seizure activity. 

      We agree and acknowledge the important differences between neurons examined in culture and in vivo, yet the in vitro vs in vivo preparations were necessitated by the specific experiments. While these differences are important, previous gene profiling studies comparing primary hippocampal neurons with developing mouse hippocampus have found that although gene expression is accelerated in vitro, gene expression profiles in vitro and in vivo are similar (PMID: 11438693). Moreover, the relative immaturity of the cultured neurons is balanced at least in part because the in vivo experiments were performed on very young animals (~P12), which also have relatively immature neurons. Thus, we predict that sodium channel complexes studied in vitro are informative for the in vivo aspects of this investigation.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors address three primary questions: 

      (1) how FGF13 variants confer seizure susceptibility, 

      (2) the specific cell types involved, and 

      (3) the underlying mechanisms, particularly regarding Nav dysfunction. 

      They use different Cre drivers to generate cell type-specific knockouts (KOs). First, using Nestin-Cre to create a whole-brain Fgf13 KO, they observed spontaneous seizures and premature death. While KO of Fgf13 in excitatory neurons does not lead to spontaneous seizures, KO in inhibitory neurons recapitulates the seizures and premature death observed in the Nestin-Cre KO. They further narrow down the critical cell type to MGE-derived interneurons (INs), demonstrating that MGE-neuron-specific KO partially reproduces the observed phenotypes. "All interneuron" KOs exhibit deficits in synaptic transmission and interneuron excitability, not seen in excitatory neuron-specific KOs. Finally, they rescue the defects in the interneuron-specific KO by expressing specific Fgf13 isoforms. This is an elegant and important study adding to our knowledge of mechanisms that contribute to seizures. 

      Strengths 

      • The study provides much-needed cell type-specific KO models. 

      • The authors use appropriate Cre lines and characterize the phenotypes of the different KOs. 

      • The metabolomic analysis complements the rest of the data effectively. 

      • The study confirms and extends previous research using improved approaches (KO lines vs. in vitro KD or antibody infusion). 

      • The methods and analyses are robust and well-executed. 

      Weaknesses 

      • One weakness lies in the use of the Nkx2.1 line (instead of Nkx2.1CreER) in the paper. As a result, some answers to key questions are incomplete. For instance, it remains unclear whether the observed effects are due to Chandelier cells or NGFCs, potentially both MGE and CGE derived, explaining why Nkx2.1 alone does not fully replicate the overall inhibitory KO. Using Nkx2.1CreER could have helped address the cell specificity. With the Nkx2.1 line used in the paper, the answer is partial. 

      We agree that while our data are consistent with the possibility of a role for Fgf13 in chandelier function, the current Cre driver does not provide sufficient direct evidence. We performed preliminary experiments (unpublished) using a Nkx2.1CreER driver, with late embryonic induction with a tamoxifen dosage validated for sparse labeling of chandelier cells (PMID: 30846310). While we successfully replicated sparse labeling of neocortical chandelier cells (using a Cre-dependent Ai9 reporter), we were unable to determine if there was a significant loss of FGF13 as measured by immunohistochemistry since FGF13+ cells are only a small subset of the already sparse cells. Because multiple snRNA-seq studies identified Fgf13 as a marker for chandelier cells, we speculated about the role of chandelier cells vs NGFCs, and we are now more circumspect in doing so.

      • While the mechanism behind the reduced inhibitory drive in the IN-specific KO is suggested to be presynaptic, the chosen method does not allow them to exactly identify the mechanisms (spontaneous vs mEPSC/mIPSC), and whether it is a loss of inhibitory synapses (potentially axo-axonic) or release probability. 

      We agree that this is an important limitation of our work, and that we are unable to identify the exact mechanism behind the reduced inhibitory drive. We are continuing to explore this question in a follow-up study.

      • Some supporting data (e.g. Supplemental Figure 7 and 8) appear to come from only one (or two) WT and one (or two) KO mice. Supplementary data, like main data, should come from at least three mice in total to be considered complete/solid (even if the statistical analysis is done with cells). 

      All panels in the manuscript, including supplementary data, except supplementary 7D and 8A, have N(mouse)≥3. Time limitations (graduating student) prevented us from obtaining a larger N. Because those supplementary data are not critical for supporting our conclusions, we removed them.

      General Assessment 

      The general conclusions of this paper are supported by data. As it is, the claim that "these results enhance our understanding of the molecular mechanisms that drive the pathogenesis of Fgf13-related seizures" is partially supported. A more cautious term may be more appropriate, as the study shows the mechanism is not Nav-mediated and suggests alternative mechanisms without unambiguously identifying them. The conclusion that the findings "expand our understanding of FGF13 functions in different neuron subsets" is supported, although somewhat overstated, as the work is not conclusive about the exact neuron subtypes. However, it does indeed show differential functions for specific neuronal classes, which is a significant result. 

      Impact and Utility 

      This paper is undoubtedly valuable. Understanding that excitatory neurons are not the primary contributors to the observed phenotypes is crucial. The finding that the effects are not MGE-unique is also important. This work provides a solid foundation for further research and will be a useful resource for future studies. 

      Reviewer #3 (Public Review): 

      Summary: 

      The authors aimed to determine the mechanism by which seizures emerge in Developmental and Epileptic Encephalopathies caused by variants in the gene FGF13. Loss of FGF13 in excitatory neurons had no effect on seizure phenotype as compared to the loss of FGF13 in GABAergic interneurons, which in contrast caused a dramatic proseizure phenotype and early death in these animals. They were able to show that Fgf13 ablation and consequent loss of FGF13-S and FGF13-VY reduced overall inhibitory input from Fgf13-expressing interneurons onto hippocampal pyramidal neurons. This was shown to occur not via disruption to voltage-gated sodium channels but rather by reducing potassium currents and action potential repolarisation in these interneurons. 

      Strengths: 

      The authors employed multiple well-validated, novel mouse lines with FGF13 knocked out in specific cell types including all neurons, all excitatory cells, all GABAergic interneurons, or a subset of MGE-derived interneurons, including axo-axonic chandelier cells. The phenotypes of each of these four mouse lines were carefully characterised to reveal clear differences with the most fundamental being that Interneuron-targeted deletion of FGF13 led to perinatal mortality associated with extensive seizures and impaired the hippocampal inhibitory/excitatory balance while deletion of FGF13 in excitatory neurons caused no detectable seizures and no survival deficits. 

      The authors made excellent use of western blotting and in situ hybridisation of the different FGF13 isoforms to determine which isoforms are expressed in which cell types, with FGF13-S predominantly in excitatory neurons and FGF13-VY and FGF13-V predominantly in GABAergic neurons. 

      The authors performed a highly detailed electrophysiological analysis of excitatory neurons and GABAergic interneurons with FGF13 deficits using whole-cell patch clamp. This enabled them to show that FGF13 removal did not affect voltage-gated sodium channels in interneurons, but rather reduced the action of potassium channels, with the resultant effect of making it more likely that interneurons enter depolarisation block. These findings were strengthened by the demonstration that viral re-expression of different Fgf13 splice isoforms could partially rescue deficits in interneuron action potential output and restore K+ channel current size. 

      Additionally, the discussion was nuanced and demonstrated how the current findings resolved previous apparent contradictions in the field involving the function of FGF13. 

      These findings will have a significant impact on our understanding of how FGF13 causes seizures and death in DEEs, and the action of different FGF13 isoforms within different neuronal cell types, particularly GABAergic interneurons. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      The limitations of the KO model should be fully discussed in the discussion. It should be clear that knocking out FGF13 does not provide insight into how missense mutations such as R11C may alter excitatory and/or inhibitory neuron excitability. 

      We agree with this overall interpretation of our data and have updated our language in the Discussion to make the distinction between mechanisms attributable to a knockout compared to a missense variant. We note, however, that the proposed mechanism by which missense variants (e.g., R11C) drive seizures is through loss of long-term inactivation in excitatory neurons and our excitatory knockout model shows loss of long-term inactivation in excitatory neurons. Thus, our knockout model demonstrates that the mechanism(s) by which the missense variants alter neuronal excitability in excitatory neurons must exclude long-term inactivation, thereby providing some clarity regarding the proposed mechanism for those missense variants.

      It is important to know what sodium channel isoforms are expressed in the cultured neurons used in the experiments for Figures 2 and 5. Are Nav1.1, Nav1.2, Nav1.3, and Nav1.6 expressed at appropriate levels in the cultures? 

      We agree it is important to know that the sodium channel isoforms expressed in our hippocampal neurons are expressed at physiologically relevant levels, for further validation of our primary culture system. We have added RT-qPCR data from our hippocampal neuron cultures (Supplemental Figure 2B) showing the relative levels of SCN1A, SCN2A, SCN3A, and SCN8A, which are similar to the relative levels of voltage-gated sodium channel isoforms found in rodent and human forebrain in early development (Figure 1 in PMID: 35031483).

      The electrophysiological experiments are intriguing but limited. One, it would be helpful to report if there were any changes in resting membrane potential for the cells reported in Figure 5. It is also inappropriate to unequivocally state that "Nav currents were not significantly affected by Fgf13 knockout in Gad2Fghf13 KO neurons" as only a sampling of properties was investigated. Recovery from inactivation and persistent current amplitudes were not evaluated. Furthermore, while it looks like long-term inactivation is not altered, only one specific protocol was used and currents measured from cultured neurons may not be fully representative of neuronal properties in vivo. 

      We agree that we performed a selective analysis of Nav currents, chosen because those are the major parameters that have been associated with FGF13 modulation. Because we did not observe significant differences in Nav currents, we hypothesized that FGF13 affected other currents, as previously observed, and consequently assessed potassium currents, for which we did observe a difference. Further, we note that our sodium current and potassium current results are consistent with, and supportive of, our action potential data in which we find no deficit in AP initiation, but rather a deficit in AP repolarization. We revised the text to reflect the more limited analysis of Nav currents. Regarding long-term inactivation, we also agree that measurements in cultured neurons may not fully represent neuronal properties in vivo; however, we note that regulation of long-term inactivation by FGF13 has previously been assessed only in cultured cells (and not in neurons). Thus, our protocols were designed to query that previously reported modulation.

      The first sentence of the results section is misleading: "To determine how FGF13 variants contribute to seizure disorders, we developed genetic mouse models that eliminate Fgf13 in specific neuronal cell types." The knockouts do not target specific splice isoforms and do not help determine how missense variants contribute to DEE. This should be modified to reflect better what is actually being tested. 

      We agree and have revised our text to state that our goal was to assess how FGF13 contributes to neuronal excitability and thereby accurately reflect the cell type-specific, but not isoform-specific, targeting.

      Reviewer #2 (Recommendations For The Authors): 

      • The sentence in the introduction stating "an unusual example of differential expression of an alternatively spliced neuronal gene in excitatory vs. inhibitor neurons" is factually incorrect, especially for transcripts regulating intrinsic properties like FGF13. Refer to PMID: 31451803 for more details and consider rephrasing this statement. 

      We updated our text to reflect the similarity of Fgf13’s cell type-specific alternative splicing to other genes known to control synaptic interactions and neuronal architecture and added the suggested reference.

      • Consistency is needed in the manuscript regarding the term "BASEscope" or "basescope"; the correct version is "BaseScope." 

      We corrected the text accordingly.

      • In the discussion, the term "reduced overall inhibitory drive" might be more appropriate than "input." 

      We updated the text accordingly.

      • The authors should refer to the Fgf13 data in the database from Furlanis et al., which complements their findings: https://scheiffele-splice.scicore.unibas.ch/

      We agree and now incorporate this reference.

      • The phrase "Fgf13 silencing in Nkx2.1 expressing neurons" should be clarified to include the use of CreER, which was crucial and effectively resulted in the labeling of a different subtype of interneurons, see PMID: 23180771. 

      We agree and have updated our text accordingly.

      • Be more cautious when discussing the role of FGF13 in chandelier function; while it seems probable, the current Cre driver used provides no direct evidence. 

      We agree (as noted above) that while our data are consistent with the possibility of a role for Fgf13 in chandelier function, the current Cre driver used is insufficient to offer direct evidence and therefore updated our text in the discussion.

      • The gene dosage effect is interesting, it would be interesting to explore it further in the future. 

      We agree. Because our data suggest that seizures result from loss of inhibitory neuron input, we hypothesize that the gene dosage effect derives from further loss of inhibitory neuron input and thus more hyperexcitability.

      • Another critical aspect not addressed here and of interest for the future is the distinction between the role of FGF13 in interneuron development versus general maintenance. Using Nkx2.1CreER could have helped address both cell specificity and developmental roles. 

      We agree that there may be an interesting distinction between the role of Fgf13 in development versus general maintenance. We have piloted an Nkx2.1-CreER targeted deletion of Fgf13 from cortical interneurons but have been unable to achieve significant deletion of Fgf13, likely because the Nkx2.1-CreER strategy targets only a sparse subset of interneurons and FGF13 is expressed in only a subset of total interneurons. Thus, use of the Nkx2.1-CreER strategy is challenging. We are looking for ways to optimize it.

      Reviewer #3 (Recommendations For The Authors): 

      This was a truly fabulous paper, with an exceptional quantity of beautiful data. I would like to congratulate the authors on their superb work. 

      In the discussion, the authors correctly draw attention to the fact that the clear pro-seizure phenotype they see when FGF13 was knocked out more specifically in a subset of interneurons including chandelier cells, adds to our understanding of the role of FGF13 in chandelier cells. More than that though, given that FGF13 is reducing excitability in these cells AND this results in a strong pro-seizure phenotype, they may want to postulate that this lends further weight to the argument that chandelier cells are likely powerful regulators of network excitability despite suggestions in the field that they could potentially have a proexcitatory function (see Szabadics et al. Science 2006). 

      We agree this is interesting and have elaborated on our discussion of chandelier cells to include this point while also addressing the important caveats noted by reviewer 2.

      A minor point: 

      On page 26 the sentence: 

      "Here, we were able to assess FGF13-S and FGF13-VY, chosen because they are most abundantly expressed isoforms in the adult mouse brain, but the inability to rescue electrophysiological consequences completely with either isoform alone leaves open the possibility that other isoforms (e.g., FGF13-U, FGF13-V, and FGF13-VY) also make critical contributions." Should the last "FGF13-VY" be removed? 

      We thank the reviewer for noticing the error and have updated the text accordingly.

    1. Author response:

      Joint Public Review:

      In the microglia research community, it is accepted that microglia change their shape both gradually and acutely along a continuum that is influenced by external factors both in their microenvironments and in circulation. Ideally, a given morphological state reflects a functional state that provides insight into a microglia's role in physiological and pathological conditions. The current manuscript introduces MorphoCellSorter, an open-source tool designed for automated morphometric analysis of microglia. This method adds to the many programs and platforms available to assess the characteristics of microglial morphology; however, MorphoCellSorter is unique in that it uses Andrews plots to rank populations of cells together (in control and experimental groups) and presents "big picture" views of how entire populations of microglia alter under different conditions. Notably, MorphoCellSorter is versatile, as it can be used across a wide array of imaging techniques and equipment. For example, the authors use MorphoCellSorter on images of fixed and live tissues representing different biological contexts such as embryonic stages, Alzheimer's disease models, stroke, and primary cell cultures.

      This manuscript outlines a strategy for efficiently ranking microglia beyond the classical homeostatic vs. active morphological states. The outcome offers only a minor improvement over the already available strategies that have the same challenge: how to interpret the ranking functionally.

      We would like to thank the reviewers for their careful reading and constructive comments and questions. While MorphoCellSorter currently does not rank cells functionally based on their morphology, its broad range of application, ease of use and capacity to handle large datasets provide a solid foundation. Combined with advances in single-cell transcriptomics, MorphoCellSorter could potentially enable the future prediction of cell functions based on morphology.

      Strengths and Weaknesses:

      (1) The authors offer an alternative perspective on microglia morphology, exploring the option to rank microglia instead of categorizing them by means of clustering methods such as k-means, which should better reflect the concept of a microglia morphology continuum. They demonstrate that these ranked representations of morphology can be illustrated using histograms across the entire population, allowing the identification of potential shifts between experimental groups. Although the idea of using Andrews curves is innovative, the distance between ranked morphologies is challenging to measure, raising the question of whether the authors oversimplify the problem. 

      We have access to the distance between cells through the Andrews score of each cell. However, the challenge is that these distances are relative values and specific to each dataset. While we believe that these distances could provide valuable information, we have not yet determined the most effective way to represent and use them in a meaningful manner.
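
      To make the point about dataset-relative distances concrete, here is a minimal Python sketch of Andrews curves. It assumes that each cell is described by a few coordinates (for example its first principal components) and that a cell's score is the value of its curve at the angle where the curves are most spread out. That scoring rule, the function names and the toy coordinates are illustrative assumptions, not necessarily what MorphoCellSorter implements.

```python
import math

def andrews_curve(x, t):
    """Andrews curve: x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + ..."""
    value = x[0] / math.sqrt(2.0)
    for i, xi in enumerate(x[1:], start=1):
        harmonic = (i + 1) // 2              # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            value += xi * math.sin(harmonic * t)
        else:
            value += xi * math.cos(harmonic * t)
    return value

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def andrews_scores(cells, n_steps=360):
    """Score each cell at the angle where the curves differ the most (assumed rule)."""
    ts = [-math.pi + 2.0 * math.pi * k / n_steps for k in range(n_steps + 1)]
    best_t = max(ts, key=lambda t: variance([andrews_curve(c, t) for c in cells]))
    return [andrews_curve(c, best_t) for c in cells], best_t

def rank_and_gaps(scores):
    """Return the cell order by score and the score gap between neighbours."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    gaps = [scores[b] - scores[a] for a, b in zip(order, order[1:])]
    return order, gaps

# Hypothetical coordinates for three cells.
cells = [(0.1, 0.2, 0.0), (0.5, 0.4, 0.1), (0.9, 0.8, 0.3)]
scores, t_star = andrews_scores(cells)
ranking, gaps = rank_and_gaps(scores)

# Rescaling the whole dataset keeps the ranking but changes every gap.
scaled = [tuple(2.0 * v for v in c) for c in cells]
scaled_scores, _ = andrews_scores(scaled)
scaled_ranking, scaled_gaps = rank_and_gaps(scaled_scores)
```

      In this toy case the ranking is unchanged by the rescaling while every gap doubles, which is one way to see why the ranking transfers across datasets more readily than the raw distances do.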

      Also, the discussion about the pipeline's uniqueness does not go into the details of alternative models. The introduction remains weak in outlining the limitations of current methods (L90). Acknowledging this limitation will be necessary.

      Thank you for these insightful comments. Alternative methods were already covered in the discussion (L586-598), but in response to the reviewers' request we have revised the introduction and discussion sections to address the limitations of current methods more clearly and to discuss the uniqueness of the pipeline. Additionally, we have reorganized Figure 1 to more effectively highlight the main caveats associated with clustering, the primary method currently in use.

      (2) The manuscript suffers from several overstatements and simplifications, which need to be resolved. For example:

      a) L40: The authors talk about "accurately ranked cells". Based on their results, the term "accuracy" is still unclear in this context.

      Thank you for this comment. Our use of the term "accurately" was intended to convey that the ranking was correct based on comparison with human experts, though we agree that it may have been overstated. We have removed "accurately" and propose to replace it with "properly" to better reflect the intended meaning.

      b) L50: Microglial processes are not necessarily evenly distributed in the healthy brain. Depending on their embedded environment, they can have longer process extensions (e.g., frontal cortex versus cerebellum).

      Thank you for bringing this point to our attention. We removed “evenly” to be more inclusive of the various morphologies of microglial cells in this introductory sentence.

      c) L69: The term "metabolic challenge" is very broad, ranging from glycolysis/FAO switches to ATP-mediated morphological adaptations, and it needs further clarification about the author's intended meaning.

      Thank you for this comment. We clarified the text to specify that we were referring to the metabolic challenge triggered by ischemia, and we added a reference as well.

      d) L75: Is morphology truly "easy" to obtain? 

      Yes, it is in comparison to other parameters such as transcripts or metabolism, but we understand the point made by the reviewer and found another way of writing it. As an alternative we propose: “morphology is an indicator accessible through…”

      e) L80: The sentence structure implies that clustering or artificial intelligence (AI) are parameters, which is incorrect. Furthermore, the authors should clarify the term "AI" in their intended context of morphological analysis.

      We apologize for this confusing wording. We reformulated the sentence as follows: “Artificial intelligence (AI) approaches such as machine learning have also been used to categorize morphologies (Leyh et al., 2021)”.

      f) L390f: An assumption is made that the contralateral hemisphere is a non-pathological condition. How confident are the authors about this statement? The brain is still exposed to a pathological condition, which does not stop at one brain hemisphere.

      We did not say that the contralateral side is non-pathological, but that the microglial cells have a non-pathological morphology, which is slightly different. The contralateral side in ischemic experiments is classically used as a control (Rutkai et al 2022). Although differences in transcript levels have been reported between sham-operated animals and the contralateral hemisphere in tMCAO mice (Filippenkov et al 2022; https://doi.org/10.3390/ijms23137308), showing that the contralateral side is indeed in a different state than sham controls, no differences in terms of morphology have been reported.

      We have removed “non-pathological” to avoid misinterpretations.

      g) Methodological questions:

      a) L299: An inversion operation was applied to specific parameters. The description needs to clarify the necessity of this since the PCA does not require it.

      Indeed, we are sorry for this lack of explanation. Some morphological indexes rank cells from the least to the most ramified, while others rank them in the opposite order. By inverting certain parameters, we can standardize the ranking direction across all parameters, simplifying data interpretation. This clarification has been added to the revised manuscript as follows:

      “Lacunarity, roundness factor, convex hull radii ratio, processes cell areas ratio and skeleton processes ratio were subjected to an inversion operation in order to homogenize the parameters before conducting the PCA: indeed, some parameters rank cells from the least to the most ramified, while others rank them in the opposite order. By inverting certain parameters, we can standardize the ranking direction across all parameters, thus simplifying data interpretation.”
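
      The idea of giving all parameters the same direction before PCA can be sketched in a few lines of Python. The sketch assumes that "inversion" means reflecting a min-max-scaled value (1 - x); the manuscript may use a different operation (for example 1/x), and the parameter names and values below are hypothetical.

```python
def min_max(values):
    """Rescale a list of numbers to the range [0, 1] (values must not all be equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def harmonize(table, inverted):
    """Min-max scale every parameter and flip those listed in `inverted`."""
    result = {}
    for name, values in table.items():
        scaled = min_max(values)
        result[name] = [1.0 - s for s in scaled] if name in inverted else scaled
    return result

# Hypothetical measurements for three cells ordered from ameboid to ramified.
table = {
    "branch_count": [1, 5, 12],           # hypothetical parameter, grows with ramification
    "roundness_factor": [0.9, 0.5, 0.1],  # shrinks with ramification
}
harmonized = harmonize(table, inverted={"roundness_factor"})
```

      After this step both columns increase from the least to the most ramified cell, so the sign of each loading on a principal component can be read in the same direction.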

      b) Different biological samples have been collected across different species (rat, mouse) and disease conditions (stroke, Alzheimer's disease). Sex is a relevant component in microglia morphology. At first glance, information on sex is missing for several of the samples. The authors should always refer to Table 1 in their manuscript to avoid this confusion. Furthermore, how many biological animals have been analyzed? It would be beneficial for the study to compare different sexes and see how accurate Andrew's ranking would be in ranking differences between males and females. If they have a rationale for choosing one sex, this should be explained.

      As reported in the literature, we acknowledge the presence of sex differences in microglial cell morphology. Due to ethical considerations and our commitment to reducing animal use, we did not conduct dedicated experiments specifically for developing MorphoCellSorter. Instead, we relied on existing brain sections provided by collaborators, which were already prepared and included tissue from only one sex—either female or male—except in the case of newborn pups, whose sex is not easily determined. Consequently, we were unable to evaluate whether MorphoCellSorter is sensitive enough to detect morphological differences in microglia attributable to sex. Although assessing this aspect is feasible, we are uncertain if it would yield additional insights relevant to MorphoCellSorter’s design and intended applications.

      To address this, we have included additional references in Table 1 of the revised manuscript and clearly indicated the sex of the animals from which each dataset was obtained.

      c) In the methodology, the slice thickness has been given in a range. Is there a particular reason for this variability? 

      We could not spot any range in the text; we usually used 30 µm thick sections in order to have entire or close to entire microglia cells.

      Although the thickness of the sections was identical for all the sections of a given dataset, only the planes containing the cells of interest were selected during imaging for both ischemic stroke models. This explains why the range of planes acquired varies depending on how the cell is distributed in Z.

      Also, the slice thickness is inadequate to cover the entire microglia morphology. How do the authors include this limitation of their strategy? Did the authors define a cut-off for incomplete microglia? 

      We found that 30 µm sections provide an effective balance, capturing entire or nearly entire microglial cells (consistent with what we observe in vivo) while allowing sufficient antibody penetration to ensure strong signal quality, even at the section's center. In our segmentation process, we excluded microglia located near the section edges (i.e., cells with processes visible on the first or last plane of image acquisition, as well as those close to the field of view’s boundary). Although our analysis pipeline should also function with thicker sections (>30 µm), we confirmed that thinner sections (15 µm or less) are inadequate for detecting morphological differences, as tested initially on the AD model. Segmented, incomplete microglia lack the structural information needed to reflect morphological differences accurately, which impairs the detection of existing differences.

      c) The manuscript outlines that the authors have used different preprocessing pipelines, which is great for being transparent about this process. Yet, it would be relevant to provide a rationale for the different imaging processing and segmentation pipelines and platform usages (Supplementary Figure 7). For example, it is not clear why the Z maximum projection is performed at the end for the Alzheimer's Disease model, while it's done at the beginning of the others.

      The same holds true for cropping, filter values, etc. Would it be possible to analyze the images with the same pipelines and compare whether a specific pipeline should be preferable to others?

      The pre-processing steps depend on the quality of the images in each dataset. For example, in the AD dataset, images acquired with a wide-field microscope were considerably noisier compared to those obtained via confocal microscopy. In this case, reducing noise plane-by-plane was more effective than applying noise reduction on a Z-projection, as we would typically do for confocal images. Given that accurate segmentation is essential for reliable analysis in MorphoCellSorter, we chose to tailor the segmentation approach for each dataset individually. We recommend future users of MorphoCellSorter take a similar approach. This clarification has been added to the discussion.
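
      Why the order of denoising and projection can matter is easy to see on a toy example. The Python sketch below uses a 3-pixel median filter and a maximum projection over a tiny one-dimensional "stack"; the filter choice and the pixel values are assumptions made for illustration, not the filters actually used on the AD dataset.

```python
def median3(row):
    """3-pixel median filter with edge replication."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [sorted(padded[i:i + 3])[1] for i in range(len(row))]

def max_projection(stack):
    """Maximum-intensity projection along Z (one value per pixel position)."""
    return [max(column) for column in zip(*stack)]

# Three planes, each with one isolated noisy pixel at a different position.
stack = [
    [0, 9, 0, 0, 0, 0],
    [0, 0, 9, 0, 0, 0],
    [0, 0, 0, 9, 0, 0],
]

# Project first, then denoise: the noisy pixels merge into a blob that survives.
project_then_filter = median3(max_projection(stack))

# Denoise each plane, then project: every isolated pixel is removed.
filter_then_project = max_projection([median3(plane) for plane in stack])
```

      Here the projection turns three isolated noisy pixels into a three-pixel blob that the median filter can no longer remove, whereas filtering each plane first removes them all, which is consistent with the plane-by-plane choice described above for noisy wide-field images.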

      On a note, Matlab is not open-access, 

      This is correct. We are currently translating this Matlab script into Python; it will be available soon on GitHub.

      https://github.com/Pascuallab/MorphCellSorter.

      This also includes combining the different animals to see which insights could be gained using the proposed pipelines.

      As explained earlier, a common segmentation process for very diverse types of acquisitions (magnification, resolution and type of images) is not optimal in terms of segmentation and accuracy of the analysis. Although we could feed MorphoCellSorter with all this data from a single segmentation pipeline, the results might be very difficult to interpret.

      d) L227: Performing manual thresholding isn't ideal because it implies the preprocessing could be improved. Additionally, it is important to consider that morphology may vary depending on the thresholding parameters. Comparing different acquisitions that have been binarized using different criteria could introduce biases.

      As noted earlier, segmentation is not the main focus of this paper, and we leave it to users to select the segmentation method best suited to their datasets. Although we acknowledge that automated thresholding would in theory be ideal, we were confronted with image acquisitions that were not uniform, even within the same sample. For instance, in ischemic brain samples, lipofuscin from cell death introduces background noise that can artificially impact threshold levels. We tested global and local algorithms to automatically binarize the cells, but these approaches often resulted in imperfect segmentation that was not optimized for every cell. In our experience, manually adjusting the threshold provides a more accurate, reliable, and comparable selection of cellular elements, even though it introduces some subjectivity. To ensure consistency in segmentation, we recommend that the same person performs the analysis across all conditions. This clarification has been added to the discussion.

      e) Parameter choices: L375: When using k-means clustering, it is good practice to determine the number of clusters (k) using silhouette or elbow scores. Simply selecting a value of k based on its previous usage in the literature is not rigorous, as the optimal number of clusters depends on the specific data structure. If they are seeking a more objective clustering approach, they could also consider employing other unsupervised techniques, (e.g. HDBSCAN) (L403f).

      We agree with the referee’s comment, but the purpose of the k-means clustering we used was only to illustrate that the clusters generated are artificial and do not correspond to the reality of the continuum of microglia morphology. In the course of the study we used the elbow score to determine k, but this did not work well because no clear elbow was visible in some datasets (probably because of the continuum of microglia morphologies). In any case, whatever value of k is used, and whether it is chosen manually or mathematically, the clusters remain artificial and their boundaries arbitrary.
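
      The absence of a clear elbow on continuous data can be reproduced on a toy example. The Python sketch below runs a small, deterministic one-dimensional k-means (a simplified stand-in written for this illustration, not the implementation used in the study) on two made-up datasets: an evenly spread continuum and two well-separated clumps.

```python
def kmeans_inertia_1d(values, k, n_iter=50):
    """Lloyd's k-means on 1-D data; returns the within-cluster sum of squares."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: one center per evenly spaced quantile.
    centers = [data[int((j + 0.5) * n / k)] for j in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return sum(min((v - c) ** 2 for c in centers) for v in data)

def drop_ratios(inertias):
    """How many times the inertia shrinks each time k increases by one."""
    return [a / b for a, b in zip(inertias, inertias[1:])]

# Hypothetical 1-D "morphology index" values.
continuum = [i / 99 for i in range(100)]
clumped = [0.001 * i for i in range(50)] + [1.0 + 0.001 * i for i in range(50)]

ks = (1, 2, 3, 4)
inertia_continuum = [kmeans_inertia_1d(continuum, k) for k in ks]
inertia_clumped = [kmeans_inertia_1d(clumped, k) for k in ks]
ratios_continuum = drop_ratios(inertia_continuum)
ratios_clumped = drop_ratios(inertia_clumped)
```

      For the clumped data the inertia collapses between k = 1 and k = 2 and barely changes afterwards, giving an obvious elbow. For the continuum it shrinks smoothly at every step, so no value of k stands out.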

      L373: A rationale for the choice of the 20 non-dimensional parameters as well as a detailed explanation of their computation such as the skeleton process ratio is missing. Also, how strongly correlated are those parameters, and how might this correlation bias the data outcomes?

      Thank you for raising this point. There is no specific rationale beyond our goal of being as exhaustive as possible, incorporating most of the parameters found in the literature, as well as some additional ones that we believed could provide a more thorough description of microglial morphology.

      Indeed, some of these parameters are correlated. Initially, we considered this might be problematic, but we quickly found that these correlations essentially act as factors that help assign more weight to certain parameters, reflecting their likely greater importance in a given dataset. Rather than being a limitation, the correlated parameters actually enhance the ranking. We tested removing some of these parameters in earlier versions of MorphoCellSorter, and found that doing so reduced the accuracy of the tool.

      Differences between circularity and roundness factors are not coming across and require further clarification. 

      These are two distinct ways of characterizing morphological complexity, and we borrowed these parameters and kept the name from the existing literature, not necessarily in the context of microglia. In our case, these parameters are used to describe the overall shape of the cell. The advantage of using different metrics to calculate similar parameters is that, depending on the dataset, one method may be better suited to capture specific morphological features of a given dataset. MorphoCellSorter selects the parameter that best explains the greatest dispersion in the data, allowing for a more accurate characterization of the morphology.

      One is applied to the soma and the other to the cell, but why is neither circularity nor roundness factor applied to both?

      None of the parameters concern the cell body by itself. The cell body is always considered relative to one or more other metrics. Because these parameters and what they represent do not seem to be very clear, we will add a graphic representation of the measurements they provide in the revised version of the manuscript.

      f) PCA analysis:

      The authors spend a lot of text to describe the basic principles of PCA. PCA is mathematically well-described and does not require such depth in the description and would be sufficient with references.

      Thank you for this comment. Indeed, the description of PCA may be too exhaustive; we will simplify the text.

      Furthermore, there are the following points that require attention:

      L321: PC1 is the most important part of the data could be an incorrect statement because the highest dispersion could be noise, which would not be the most relevant part of the data. Therefore, the term "important" has to be clarified.

      We are not sure that, in the case of segmented images, noise would represent most of the data, since segmentation also removes most of the noise; but perhaps the reviewer is concerned about another type of noise? Nonetheless, we thank the reviewer for the comment and propose the following change, which should resolve this potential issue.

      “PC_1 is the direction in which data is most dispersed.”

      L323: As before, it's not given that the first two components hold all the information.

      Thank you for this comment. We modified this statement as follows: “The first two components represent most of the information (about 70%), hence we can consider the plane (PC_1, PC_2) as the principal plane, reducing the dataset to a two-dimensional space.”
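
      For readers who want the quantity behind "about 70%" spelled out, the usual definition of the explained variance ratio is given below. The symbols (eigenvalues \lambda_i of the covariance matrix of the p standardized parameters, and the abbreviation EVR) are notation introduced here for illustration and do not come from the manuscript.

```latex
\[
\mathrm{EVR}_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j},
\qquad
\mathrm{EVR}_1 + \mathrm{EVR}_2
  = \frac{\lambda_1 + \lambda_2}{\sum_{j=1}^{p} \lambda_j}
  \approx 0.70
\]
```

      Under this reading, roughly 30% of the total variance lies outside the (PC_1, PC_2) plane, which is why the revised wording says "most of" rather than "all" the information.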

      L327 and L331 contain mistakes in the nomenclature: Mix up of "wi" should be "wn" because "i" does not refer to anything. The same for "phi i = arctan(yn/wn)" should be "phi n".

      Thanks a lot for these comments. We have made the changes in the text as proposed by the reviewer.

      L348: Spearman's correlation measures monotonic correlation, not linear correlation. Either the authors used Pearson Correlation for linearity or Spearman correlation for monotonic. This needs to be clarified to avoid misunderstandings.

      We apologize for the confusion. We did use Spearman correlation, which measures monotonic association, and have therefore replaced “linear” with “monotonic” in the text. Thank you for the careful reading.
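
      The distinction the reviewer draws can be illustrated with a short, self-contained sketch. It is not taken from the manuscript's analysis code; the helper names and the cubic toy data are assumptions chosen only to show that a strictly increasing but nonlinear relationship yields a perfect Spearman correlation and an imperfect Pearson correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: monotonic but clearly nonlinear.
x = list(range(1, 11))
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
```

      Because Spearman works on ranks, any strictly increasing transformation of y leaves it unchanged, whereas Pearson drops as the relationship departs from a straight line.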

      g) If the authors find no morphological alteration, how can they ensure that the algorithm is sensitive enough to detect them? When morphologies are similar, it's harder to spot differences. In cases where morphological differences are more apparent, like stroke, classification is more straightforward.

      We are not entirely sure we fully understand the reviewer's comment. When data are similar or nearly identical, MorphoCellSorter performs comparably to human experts (see Table 1). However, the advantage of using MorphoCellSorter is that it ranks cells much faster while achieving accuracy similar to that of human experts, and it gives each cell a value on an axis (the Andrews score), which a human expert cannot do. For example, in the case of mouse embryos, MorphoCellSorter’s ranking was as accurate as that made by human experts. Based on this ranking, the distributions were similar, suggesting that the morphologies are generally consistent across samples.

      The algorithm itself does not detect anything—it simply ranks cells according to the provided parameters. Therefore, it is unlikely that sensitivity is an issue; the algorithm ranks the cells based on existing data. The most critical factor in the analysis is the segmentation step, which is not the focus of our paper. However, the more accurate the segmentation, the more distinct the parameters will be if actual differences exist. Thus, sensitivity concerns are more related to the quality of image acquisition or the segmentation process rather than the ranking itself. Once MorphoCellSorter receives the parameters, it ranks the cells accordingly. When cells are very similar, the ranking process becomes more complex, as reflected in the correlation values comparing expert rankings to those from MorphoCellSorter (Table 1). 

      Moreover, MorphoCellSorter does not only provide a ranking: the morphological indexes automatically computed offer useful information to compare the cells’ morphology between groups.

      h) Minor aspects:

      % notation requires to include (weight/volume) annotation.

      This has been done in the revised version of the manuscript.

      Citation/source of the different mouse lines should be included in the method sections (e.g. L117).

      The reference of the mouse line has been added (RRID:IMSR_JAX:005582) to the revised version of the manuscript.

      L125: The length of the single housing should be specified to ensure no variability in this context.

      The mice were housed individually for 24 h; this is now stated in the text.

      L673: Typo to the reference to the figure.

      This has been corrected, thank you for your thoughtful reading.

    1. Author response:

      We thank the editor and the reviewers for the positive evaluation of our manuscript and the thoughtful comments. Below we provide a provisional reply to the reviewers’ comments, which we will address in more detail in the revised manuscript.

      Reviewer 1 highlights three important alternative interpretations of our results: (1) sustained suppression, (2) enhancement followed by suppression, and (3) priming. We believe that these alternatives need to be addressed to improve the conclusions we can draw from the available data.

      (1) Sustained suppression: As outlined by R1, it is possible that participants suppressed the HPDL throughout the entire experiment, instead of proactively instantiating suppression on each trial. While possible, we believe that this account is unlikely to explain the present results, given the utilized analysis approach, a voxel-wise GLM fit to the BOLD data per run (see Materials and Methods for details). Specifically, we derived parameter estimates from this GLM per location to estimate the relative suppression. Sustained suppression would modulate BOLD responses throughout the run, i.e. also during the implicit baseline period used to estimate the contrast parameter estimates. Hence, a sustained suppression should not result in a differential modulation between locations, as the BOLD response at the HPDL during the baseline period would be equally suppressed as during the trial. We will discuss this important aspect in the revised manuscript.
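
      The argument above can be written out as a small worked equation. This is a simplified sketch under an explicit assumption that sustained suppression acts additively on the BOLD signal; the symbols (b for the implicit baseline, \beta for the trial-evoked parameter estimate, x(t) for the task regressor, s for the sustained suppression, \ell for the location) are introduced here and are not the notation of the manuscript.

```latex
\begin{align*}
y_\ell(t) &= b_\ell + \beta_\ell\, x(t) + \varepsilon(t)
  && \text{(no sustained suppression)} \\
y_\ell(t) &= (b_\ell - s_\ell) + \beta_\ell\, x(t) + \varepsilon(t)
  && \text{(suppression } s_\ell \text{ present throughout the run)}
\end{align*}
```

      In the second line the constant s_\ell is absorbed by the intercept, so the estimate of \beta_\ell, and therefore the contrast between the HPDL and the other locations, is unchanged. If suppression instead scaled the evoked response multiplicatively, \beta_\ell would change, so the argument depends on the additive assumption stated here.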

      (2) Enhancement followed by suppression: R1 correctly points out that BOLD data, given the poor temporal resolution, do not allow for the detection of potential transient enhancements at the HPDL followed by a later and more pronounced suppression (akin to “search and destroy”). We agree with this assessment. However, we would also argue that a transient enhancement followed by sustained suppression before search onset constitutes proactive suppression in line with our interpretation, because suppression would still arise proactively (i.e., before search and hence distractor onset). Whether brief enhancement precedes suppression cannot be elucidated by our data, but we believe that it constitutes an interesting avenue for future studies using time-resolved and spatially specific recording methods. We will address this important addition in the updated manuscript.

      (3) Priming: It is possible that participants particularly suppress locations which on previous trials contained a distractor. This account constitutes a different perspective than statistical learning integrating across many trials. We believe that it is likely that both accounts contribute to the observed effect to some degree, as both the distant (but often repeated) and the most recent past should inform our priors. Indeed, arguably recent trials should be particularly informative for our predictions as natural environments vary across time, and hence the statistical learning system should remain sensitive to potential changes in the environment. In short, we agree with R1 that the n-1 trial may impact suppression, and therefore charting the potential contributions of this type of priming compared to statistical learning is a relevant addition to the manuscript. We will perform the suggested analysis; however, we also note that dividing trials based on the n-1 trial will significantly reduce the reliability of the parameter estimates (e.g. only ~1/3 of trials follow omissions).

      Reviewer 2 had two valuable suggestions to advance the inferences we can draw from the available data. In particular, R2 proposed two additional analyses, which we will consider during revision.

      First, R2 suggests separating the utilized early visual cortex (EVC) ROI mask into the three retinotopic areas comprising EVC (V1, V2, V3) and performing the key analyses in surface space for each ROI separately. We agree that exploring distractor suppression across V1, V2 and V3 separately is an interesting extension to our results. Our reasoning to combine early visual areas into one mask was two-fold: First, we did not have an a priori reason to expect distinct neural suppression between these early ROIs. Second, we therefore did not acquire retinotopy data to reliably separate V1, V2 and V3, instead opting to increase the number of search task trials. The lack of retinotopy data naturally limits the reliability of the resulting cortical segmentation. However, we believe that separating EVC into its constituent areas using anatomical data is nonetheless a promising addition to our primary analyses. Therefore, during revision we will explore the main suppression analyses split into V1, V2, and V3.

      Second, R2 highlights that behavioral facilitation and neural suppression could be correlated across participants. The rationale is that should neural suppression in EVC relate to the facilitation of behavioral responses, we may expect a positive relationship between neural suppression at the HPDL and RTs across participants. We agree with R2’s suggestion and will perform the analysis accordingly. However, we note that any results should be interpreted with caution, as the present sample size of n=28 is small for an across participant correlation analysis involving neural and behavioral difference scores.
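
      The caution about sample size can be made tangible with the Fisher z confidence interval for a correlation coefficient. The sketch below is illustrative only: n = 28 comes from the text above, while the correlation of r = 0.3 is a hypothetical value chosen for the example and not a result of the study.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation
    using the Fisher z-transformation (requires n > 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# n = 28 participants (from the text); r = 0.3 is a hypothetical effect size.
lower, upper = fisher_ci(0.3, 28)
```

      With these inputs the interval spans zero and is wide, which is the practical meaning of interpreting an across-participant correlation at this sample size with caution.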

      In summary, we believe that addressing the reviewers' suggestions will substantially improve our manuscript, particularly regarding the interpretation and scope of our findings.

    1. Author response:

      Reviewer #1 (Public review): 

      Summary: 

      The manuscript presents a significant and rigorous investigation into the role of CHMP5 in regulating bone formation and cellular senescence. The study provides compelling evidence that CHMP5 is essential for maintaining endolysosomal function and controlling mitochondrial ROS levels, thereby preventing the senescence of skeletal progenitor cells. 

      Strengths: 

      The authors demonstrate that the deletion of Chmp5 results in endolysosomal dysfunction, elevated mitochondrial ROS, and ultimately enhanced bone formation through both autonomous and paracrine mechanisms. The innovative use of senolytic drugs to ameliorate musculoskeletal abnormalities in Chmp5-deficient mice is a novel and critical finding, suggesting potential therapeutic strategies for musculoskeletal disorders linked to endolysosomal dysfunction. 

      Weaknesses: 

      The manuscript requires a deeper discussion or exploration of CHMP5's roles and a more refined analysis of senolytic drug specificity and effects. This would greatly enhance the comprehensiveness and clarity of the manuscript. 

      We thank the reviewer for these insightful comments. The tissue-specific roles of CHMP5 and the specificity of quercetin and dasatinib treatments in Chmp5-deficient mice will be further discussed and clarified in the revised manuscript. 

      Reviewer #2 (Public review): 

      Summary: 

      The authors try to show the importance of CHMP5 for skeletal development. 

      Strengths: 

      The findings of this manuscript are interesting. The mouse phenotypes are well done and are of interest to a broader (bone) field. 

      Weaknesses: 

      The mechanistic insights are mediocre, and the cellular senescence aspect poor. 

      In total, it has not been shown that there are actual senescent cells that are reduced after D+Q-treatment. These statements need to be scaled back substantially. 

      We thank the reviewer for these helpful comments. Although multiple hallmarks of cell senescence were shown in CHMP5-deficient skeletal progenitors, we will examine and add additional markers of cell senescence in the revised manuscript.

      In addition, the effects and specificity of the Q+D treatment will be further discussed and clarified with the revision.

      Reviewer #3 (Public review): 

      Summary: 

      In this study, Zhang et al. reported that CHMP5 restricts bone formation by controlling endolysosome-mitochondrion-mediated cell senescence. The effects of CHMP5 on osteoclastic bone resorption and bone turnover have been reported previously (PMID: 26195726); in that study, an aberrant bone phenotype was observed in the CHMP5ctsk-CKO mouse model. Using the same mouse model, Zhang et al. report a novel role of CHMP5 in osteogenesis through its effect on cell senescence. Overall, it is an interesting study and provides new insights in the field of cell senescence and bone.

      Strengths: 

      Analyzed the bone phenotype of the CHMP5-periskeletal progenitor-CKO mouse model and found a novel role of senescent cells in osteogenesis and migration.

      Weaknesses: 

      (1) There are a lot of papers that have reported that senescence impairs osteogenesis of skeletal stem cells. In this study, the authors claim that Chmp5 deficiency induces skeletal progenitor cell senescence and enhanced osteogenesis. Can the authors explain the controversial results?

      Different skeletal stem cell populations in time and space have been identified and reported. This study shows that Chmp5 deficiency in periskeletal and endosteal skeletal progenitors causes cell senescence and aberrant bone formation. Although cell senescence during aging can impair osteogenesis of certain skeletal stem cells, which contributes to diseases with low bone mass such as osteoporosis, aging can also increase heterotopic mineralization/calcification in musculoskeletal soft tissues such as ligaments and tendons, which is consistent with our results in this study. These reflect out-of-order musculoskeletal mineralization during aging. We will expand the discussion and clarify the results of CHMP5-regulated cell senescence in osteogenesis in the revised manuscript.

      (2) Co-culture of Chmp5-KO periskeletal progenitors with WT ones should be conducted to detect the migration and osteogenesis of WT cells in response to Chmp5-KO-induced senescent cells. In addition, the co-culture of WT periskeletal progenitors with senescent cells induced by H2O2, radiation, or from aged mice would provide more information.

      Increased osteogenesis of WT skeletal progenitors in the periskeletal lesion was shown to be a paracrine mechanism of abnormal bone formation in Chmp5Ctsk mice. The coculture experiment will help confirm the effect of Chmp5-deficient skeletal progenitors on the osteogenesis of neighboring WT skeletal progenitors.

      Notably, the cause and outcome of cell senescence are highly heterogeneous, and different causes of cell senescence can cause significantly different outcomes. Although the coculture of WT periskeletal progenitors with senescent cells induced by H2O2, radiation, or from aged mice would be very interesting, these are beyond the scope of the current study.

      (3) Many EVs were secreted from Chmp5-deleted periskeletal progenitors, compared to the rarely detected EVs around WT cells. Since EVs of BMSCs or osteoprogenitors show strong effects of promoting osteogenesis, did the EVs contribute to the enhanced osteogenesis induced by Chmp5 deficiency?

      The WT skeletal progenitor cells from Chmp5Ctsk mice have an increased capacity of osteogenesis compared to the corresponding cells from control animals, suggesting that the EVs of the Chmp5-deleted periskeletal progenitors could promote osteogenesis of the WT skeletal progenitors, which represents a paracrine mechanism of abnormal bone formation in Chmp5 deficient animals. We will discuss and clarify these results in the revised manuscript.

      (4) EVs secreted from senescent cells propagate senescence and impair osteogenesis. Why do EVs secreted from senescent cells induced by Chmp5 deficiency have opposite effects on osteogenesis?

      The question is similar to comment #1. The functional heterogeneity of cellular senescence will be discussed in further detail and clarified in the revised manuscript.

      (5) The Chmp5-ctsk mice show accelerated aging-related phenotypes, such as hair loss and joint stiffness. Did Ctsk also label cells in hair follicles or joint tissue? 

      This is an interesting question. Although we did not check the expression of CHMP5 in hair follicles, which is outside the scope of the present study, the result in Fig. 1E showed the expression of CHMP5 in joint ligaments. Notably, abnormal periskeletal bone formation occurs predominantly at the joint ligament insertion site in Chmp5Ctsk mice, which will be elucidated and discussed in the revised manuscript.

      (6) Fifteen proteins were found to increase and five proteins to decrease in the cell supernatant of Chmp5Ctsk periskeletal progenitors. How about SASP factors in the secretory profile? 

      As mentioned above, the SASP phenotype and related factors of senescent cells could be highly heterogeneous depending on inducers, cell types, and timing of senescence. Most of the proteins we identified in the secretome analysis have previously been reported in the secretory profile of osteoblasts. Although we were also interested in the change of some common SASP factors, such as inflammatory cytokines, the experiment did not detect these factors because of their small molecular weights and the technical limitations of mass spec analysis. 

      (7) D+Q treatment mitigates musculoskeletal pathologies in Chmp5 conditional knockout mice. In the previously published paper (CHMP5 controls bone turnover rates by dampening NF-κB activity in osteoclasts), inhibition of osteoclastic bone resorption rescues the aberrant bone phenotype of the Chmp5 conditional knockout mice. Are the effects of D+Q on bone overgrowth due to the inhibition of bone resorption?

      Although in Chmp5Ctsk mice we cannot exclude the effect of D+Q on osteoclasts, the effect of D+Q on osteoblast lineage cells, which is the focus of the current study, was verified in Chmp5Dmp1 mice. We will expand the discussion and make these results clearer with the revision.

      (8) The role of VPS4A in cell senescence should be measured to support the conclusion that CHMP5 regulates osteogenesis by affecting cell senescence. 

      We agree that additional experiments examining the role of VPS4A in cell senescence will provide more mechanistic insights. The focus of the current study is to report that CHMP5 restricts abnormal bone formation by preventing endolysosome-mitochondrion-mediated cell senescence. The roles of VPS4A in cell senescence and skeletal biology will be explored in separate studies.

      (9) Cell senescence with markers, such as p21 and H2AX, co-stained with GFP should be performed in the mouse models to indicate the effects of Chmp5 on cell senescence in vivo. 

      We will examine additional markers of cell senescence, as the reviewers suggest, in the revised manuscript.

      (10) The ATDC5 cell line, as an osteochondroma cell line, is not a good cell model of periskeletal progenitors. Primary periskeletal progenitor cells may be a better choice.

      We were aware that ATDC5 cells are typically used as a chondrocyte progenitor cell line. However, our previous study showed that ATDC5 cells could also be used as a reasonable cell model for periskeletal progenitors. Furthermore, the corresponding results from primary periskeletal progenitors were shown. We will further clarify this in the revision.

      In general, the comments of these reviewers will help clarify our results and further strengthen our conclusion. We will address these comments and questions point by point in more detail in the revised manuscript.

    1. Author response:

      We sincerely thank the reviewers for their constructive feedback and the editor for facilitating this thorough review. We found the suggestions insightful and valuable for refining our manuscript. We would like to clarify a few points in an initial response before presenting the fully updated manuscript. First of all, we would like to emphasize the multi-scale nature of our approach, where we derived insights from both atomistic and coarse-grained simulations. Reviewers focused mostly on the coarse-grained simulations, whose drawbacks we are aware of and which were a strong motivation for starting with the atomistic approach. Reviewer 1 mentioned a lack of a proposed mechanism for the increased condensate-forming propensity at 300K vs. 290K. We feel we had clearly pointed to the aromatic contacts as a mechanism for this, but we will make sure to clarify this further in the revision. Furthermore, reviewer 1 was critical of our use of the 10% adjustment to Martini protein-water interactions, which has previously been thoroughly presented and assessed in the literature (see for example Tesei et al JCTC 2022). For our specific system, we were also encouraged by the favorable comparison of our Martini simulations to the atomistic simulations, e.g. for radius of gyration, contact propensity, and solvent accessibility. We will make sure to emphasize this more clearly in the revision. Finally, we are grateful for the feedback from both reviewers and will use their comments as a guide to incorporate additional analyses and extended simulations to strengthen our conclusions in an upcoming revision.

    1. Author response:

      We thank the reviewers for their thoughtful comments. 

      Based on their suggestions we will: 

      (1) Use more accurate language to describe the hypothalamus regions under investigation in this study. While we aimed to primarily investigate the medial preoptic area (MPOA), our dissections and sequencing data in fact capture several regions of the anterior hypothalamus including the anteroventral periventricular (AVPV), paraventricular (PVN), supraoptic (SON), suprachiasmatic nuclei (SCN), and more. We will revise the language in our manuscript to reflect that our study in fact investigates the cellular evolution of the anterior hypothalamus across behaviorally divergent deer mice.

      (2) Revise our language to clarify that while our study provides a rich dataset for generating hypotheses about which cell types may contribute to behavioral differences, it does not provide any evidence of causal relationships. We hope to investigate this further in future work.

      (3) Clarify specific methodological choices for which reviewers had questions, especially about the hypothalamic regions for which we did histology to validate cell abundance differences and methodological choices related to mapping our cell clusters to Mus cell types.

      Our responses to each reviewer’s specific comments are below.

      Reviewer #1:

      The major limitation of the study is the absence of causal experiments linking the observed changes in MPOA cell types to species-specific social behaviors. While the study provides valuable correlational data, it lacks functional experiments that would demonstrate a direct relationship between the neuronal differences and behavior. For instance, manipulating these cell types or gene expressions in vivo and observing their effects on behavior would have strengthened the conclusions, although I certainly appreciate the difficulty in this, especially in non-musculus mice. Without such experiments, the study remains speculative about how these neuronal differences contribute to the evolution of social behaviors.

      Yes, we agree the study lacks functional experiments. We hope that the dataset is of value for generating hypotheses about how hypothalamic neuronal cell types may govern species-specific social behaviors, and for these hypotheses to be functionally tested by us and others in future work.

      Reviewer #2:

      Some methodology could be further explained, like the decision of a 15% cutoff value for cell type assignment per cluster, or the necessity of a multi-step analysis pipeline for gene enrichment studies.

      A 15% cutoff value for cell type assignment was chosen to include all known homology correspondences between our dataset and the Mus atlas. For example, i14:Avp/Cck cells from the Mus atlas represent Avp cells from the suprachiasmatic nuclei (SCN). Though only 17.3% of cluster 15 maps to i14:Avp/Cck, we know these two clusters correspond based on the expression of Avp and additional SCN marker genes in cluster 15 (Supp Fig 6). We will further explain this cutoff in the revised manuscript.
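
      The cutoff rule described above can be sketched as follows. This is an illustration and not the authors' pipeline: the function name `assign_types` is hypothetical, the 17.3% value for cluster 15 is taken from the text, and every other fraction is invented for the example. Whether the manuscript keeps every type above the cutoff or only the best match is not stated here, so the sketch simply returns all qualifying types in descending order.

```python
def assign_types(mapping_fractions, cutoff=0.15):
    """Return reference cell types whose share of a cluster's cells
    meets the cutoff, ordered from largest to smallest share."""
    hits = [(frac, name) for name, frac in mapping_fractions.items() if frac >= cutoff]
    return [name for frac, name in sorted(hits, reverse=True)]

# Cluster 15: 17.3% maps to i14:Avp/Cck (from the text);
# the remaining fractions are hypothetical placeholders.
cluster_15 = {
    "i14:Avp/Cck": 0.173,
    "hypothetical_type_a": 0.10,
    "hypothetical_type_b": 0.05,
}

assigned_at_15 = assign_types(cluster_15, cutoff=0.15)
assigned_at_20 = assign_types(cluster_15, cutoff=0.20)
```

      With a stricter cutoff of 20%, the known correspondence between cluster 15 and i14:Avp/Cck would be lost, which is the motivation the authors give for choosing 15%.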

      Our gene enrichment study includes a multi-step analysis pipeline because we wanted to control for confounders that may be introduced because of gene expression level. Genes that are more highly expressed are more accurately quantified and thus more likely to be identified as differentially expressed. Therefore, we wanted to test for gene enrichments in our set of DE genes against a background of genes with similar expression levels. We will clarify this motivation in the revised manuscript.
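
      One common way to build such an expression-matched background is to bin genes by expression level and draw control genes from the same bins as the DE genes. The sketch below illustrates that idea under stated assumptions; it is not the authors' pipeline, and the function names, the number of bins, and the toy expression values are all hypothetical.

```python
import random

def expression_bins(expression, n_bins):
    """Assign each gene to an equal-sized bin by rank of mean expression."""
    ordered = sorted(expression, key=expression.get)
    return {gene: i * n_bins // len(ordered) for i, gene in enumerate(ordered)}

def matched_background(expression, de_genes, n_bins=10, seed=0):
    """For each DE gene, draw one non-DE gene from the same expression bin.
    Draws are independent, so a background gene may be picked more than once."""
    rng = random.Random(seed)
    bins = expression_bins(expression, n_bins)
    pools = {}
    for gene, b in bins.items():
        if gene not in de_genes:
            pools.setdefault(b, []).append(gene)
    background = []
    for gene in sorted(de_genes):
        pool = pools.get(bins[gene], [])
        if pool:  # skip DE genes whose bin has no non-DE genes
            background.append(rng.choice(pool))
    return background

# Hypothetical toy data: 100 genes with increasing mean expression.
expression = {f"g{i}": float(i) for i in range(100)}
de_genes = {"g3", "g95", "g97"}
background = matched_background(expression, de_genes)
```

      Enrichment is then tested against this background rather than against all genes, so that a category is not flagged merely because its members are highly expressed and therefore easier to call as differentially expressed.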

      The authors should exercise strong caution in making inferences about these differences being the basis of parental behavior. It is possible, given connections to relevant research, but without direct intervention, direct claims should be avoided. There should be clear distinctions of what to conclude and what to propose as possibilities for future research.

      Yes, we agree that we are unable to make direct claims about neuronal differences being the basis of parental behavior. We will revise our language to be clearer about which relationships we are hypothesizing and what we propose as possibilities for future research.

      Histology is not performed on all regions included in the sequencing analysis.

      We apologize that our language describing the hypothalamic regions included in the sequencing analysis and those included in the histology is unclear. We aimed to dissect the medial preoptic region for the sequencing analysis, but additionally captured parts of the anterior hypothalamus including the paraventricular (PVN), supraoptic (SON), and suprachiasmatic nuclei (SCN), and more. Our histology was performed across the entire hypothalamus and includes all regions included in the sequencing data. We will revise the manuscript to more accurately describe the hypothalamic regions we investigated.

      Reviewer #3:

      My primary concern is that the dataset is limited: 52,121 neuronal nuclei across 24 samples, which does not provide many cells per cluster to analyze comparatively across sex and species, particularly given the heterogeneity of the region dissected. The Supplementary table reports lower UMIs/genes per cell than is typically seen as well. Perhaps additional information could be obtained from the data by not restricting the analyses to cells that can be assigned to Mus types. A direct comparison of the two Peromyscus species could be valuable as would a more complete Peromyscus POA atlas.

      Our dataset reports ~1,500 genes and ~1,000 UMIs per nucleus, which is indeed lower than is typically reported in other single-nucleus datasets. Some of this discrepancy is due to a lower quality genome and annotated transcriptome available for Peromyscus compared to Mus musculus, which results in a lower mapping rate than is typically reported in Mus studies. However, our dataset was sufficient to identify known peptidergic cell types (Supp Fig 6) and to map homology to Mus cell types for 34 (64%) of our 53 clusters. Additionally, although some of our clusters contain small numbers of cells, our differential abundance analysis accounts for the variance in cell numbers observed across samples and should be robust against any increase in variance due to small numbers. In fact, even differential abundance of very small cell clusters such as oxytocin neurons (cell type 40) was validated by histology.

      We would like to clarify that all analyses were performed on all cell clusters, regardless of whether or not they could be assigned homology to a Mus cell type. All the cell types that we identified as differentially abundant or as showing significant sex differences happened to be cell types for which homology to a Mus cell type could be defined. This may arise for a relatively uninteresting reason: cell types that have more distinct transcriptional signatures will be more accurately clustered, leading to more accurate identification of homology as well as more accurate measurements of differential abundance / expression. We will revise our language to make this clearer in the manuscript.

      In Supplement 7, it appears that most neurons can be assigned as excitatory or inhibitory, but then so many of these cells remain in the unassigned "gray blob" seen in panel 1E. Clustering of excitatory and inhibitory neurons separately, as in prior cited work in Mus POA (refs 31 and 57) may boost statistical power to detect sex and species differences in cell types. Perhaps the cells that cannot be assigned to Mus contain too few reads to be useful, in which case they should be filtered out in the QC. The technical challenges of a comparative single-cell approach are considerable, so it benefits the scientific community to provide transparency about them.

      We are not certain why we are unable to cluster and assign homology to many of our cells (i.e. cells in the unassigned “gray blob”). However, we note that even in the Mus atlas, many cells did not belong to obvious clusters by UMAP visualization and that several clusters lacked notable marker genes and were designated simply as “Gaba” and “Glut” clusters. Therefore, it is unsurprising that our own dataset also contains cells that lack the transcriptional signatures needed to be clustered and/or mapped to Mus cell types. We do know, however, that the median number of reads per nucleus is uniform across cell clusters and does not explain why some clusters could not be assigned to Mus. We will add this information to our revised manuscript.

      We do not think that a two-stage clustering (i.e. clustering first by excitatory vs. inhibitory neurons) is expected to gain power to resolve cell types in this case. Excitatory vs. inhibitory neurons are clearly separable on our UMAP (Supp Fig 7) so that information is already being used by our clustering procedure. However, we will explore this further in our revised manuscript to see if doing so will boost statistical power.

      The Calb1 dimorphism, as observed by immunostaining, appears much more extensive in P. maniculatus compared to P. polionotus (Figures 3 E and F). This finding is not reflected in the counts of the i20:Gal/Moxd1 cluster. The use of Calb1 staining as a proxy for the Gal/Moxd1 cluster would be strengthened if the number of POA Calb1+ neurons that are found in each cluster was apparent. There may be additional Calb1+ neurons in the cells that are not annotated to a Mus cluster. This clarification would add support to the overall conclusion that there is reduced sexual dimorphism in P. polionotus.

      From the Mus MPOA atlas (which includes both single-cell sequencing data and imaging-based spatial information), it is known that the i20:Gal/Moxd1 cluster comprises sexually dimorphic cells that make up both the BNST and the SDN-POA. These sexually dimorphic cells are well-studied and known to be marked by Calb1, which we used in immunostaining as a proxy for i20:Gal/Moxd1. 

      However, we would like to clarify that in our study, the immunostaining of Calb1+ neurons and the sequencing counts of the i20:Gal/Moxd1 cluster are not completely reflective of each other because our sequencing dataset only captured the ventral portion of the BNST. Therefore, our i20:Gal/Moxd1 counts contain a combination of some Calb1+ BNST cells and likely all Calb1+ SDN-POA cells and are difficult to interpret on their own. Our histology, however, covers the entire hypothalamus and is more reliable for identifying sex and species differences in each region. We will clarify this in the revised manuscript.

      The relationship between the sex steroid receptor expression and the sex bias in gene expression would be improved if the sex bias in sex steroid receptor expression was included in Supplementary Figure 10.

      We will include this in the revised manuscript. 

      There is no explanation for the finding that there is a female bias in gene expression across all cell types in P. polionotus.

      We also find this observation interesting but do not yet have a good explanation for it. We plan to follow this up in future work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      We thank Reviewer #1 for the relevant and insightful comments on our paper. Please find our detailed answers below in the Recommendations to the Authors section.

      Summary: 

      The researchers examined how individuals who were born blind or lost their vision early in life process information, specifically focusing on the decoding of Braille characters. They explored the transition of Braille character information from tactile sensory inputs, based on which hand was used for reading, to perceptual representations that are not dependent on the reading hand. 

      They identified tactile sensory representations in areas responsible for touch processing and perceptual representations in brain regions typically involved in visual reading, with the lateral occipital complex serving as a pivotal "hinge" region between them.

      In terms of temporal information processing, they discovered that tactile sensory representations occur prior to cognitive-perceptual representations. The researchers suggest that this pattern indicates that even in situations of significant brain adaptability, there is a consistent chronological progression from sensory to cognitive processing. 

      Strengths: 

      By combining fMRI and EEG, and focusing on the diagnostic case of Braille reading, the paper provides an integrated view of the transformation processing from sensation to perception in the visually deprived brain. Such a multimodal approach is still rare in the study of human brain plasticity and allows us to discern the nature of information processing in blind people's early visual cortex, as well as the time course of information processing in a situation of significant brain adaptability. 

      Weaknesses: 

      The lack of a sighted control group limits the interpretations of the results in terms of profound cortical reorganization, or simple unmasking of the architectural potentials already present in the normally developing brain. 

      We thank the reviewer for raising this important point! We acknowledge that our claims regarding the unmasking of architectural potentials in both the normally developing and visually deprived brain are limited by the study design we employed. However, we note that defining an appropriate control group and assessing non-visual reading in sighted participants is far from straightforward. We discuss these issues in our response to the Public Review of Reviewer 2.

      Moreover, the conclusions regarding the behavioral relevance of the sensory and perceptual representations in the putatively reorganized brain are limited due to the behavioral measurements adopted.

      We agree with the reviewer that the relation between behavior and neural representations, as established via perceived similarity judgments, is task-dependent, and that a richer assessment of behavior would be valuable. Please note, however, that this limitation pertains to any experimental task used to assess behavior in the laboratory. Our major goal was to assess whether the identified neural representations are suitably formatted to be used by the brain for at least one behavior rather than being epiphenomenal. We found that the representations are suitably formatted for similarity judgments, thus establishing that they are relevant for at least this behavior. We also argue that judging similarity is a complex task that may underlie many other relevant behaviors. We discuss this point further in response to the Recommendations to the Authors.

      Reviewer #2 (Public Review): 

      We thank the reviewer for the considerate and thoughtful suggestions. Please find a detailed description of the implemented changes below.

      Summary: 

      Haupt and colleagues performed a well-designed study to test the spatial and temporal gradient of perceiving braille letters in blind individuals. Using cross-hand decoding of the read letters, and comparing it to the decoding of the read letter for each hand, they defined perceptual and sensory responses. Then they compared where (using fMRI) and when (using EEG) these were decodable. Using fMRI, they showed that low-level tactile responses specific to each hand are decodable from the primary and secondary somatosensory cortex as well as from IPS subregions, the insula, and LOC. In contrast, more abstract representations of the braille letter independent from the reading hand were decodable from several visual ROIs, LOC, VWFA, and surprisingly also EVC. Using a parallel EEG design, they showed that sensory hand-specific responses emerge in time before perceptual braille letter representations. Last, they used RSA to show that the behavioral similarity of the letter pairs correlates to the neural signal of both fMRI (for the perceptual decoding, in visual and ventral ROIs) and EEG (for both sensory and perceptual decoding). 

      Strengths: 

      This is a very well-designed study and it is analyzed well. The writing clearly describes the analyses and results. Overall, the study provides convincing evidence from EEG and fMRI that the decoding of letter identity across the reading hand occurs in the visual cortex in blindness. Further, it addresses important questions about the visual cortex hierarchy in blindness (whether it parallels that of the sighted brain or is inverted) and its link to braille reading. 

      Weaknesses: 

      Although I have some comments and requests for clarification about the details of the methods, my main comment is that the manuscript could benefit from expanding its discussion. Specifically, I'd appreciate the authors drawing clearer theoretical conclusions about what this data suggests about the direction of information flow in the reorganized visual system in blindness, the role VWFA plays in blindness (revised from the original sighted role or similar to it?), how information arrives to the visual cortex, and what the authors' predictions would be if a parallel experiment would be carried out in sighted people (is this a multisensory recruitment or reorganization?). The data has the potential to speak to a lot of questions about the scope of brain plasticity, and that would interest broad audiences. 

      We thank the reviewer for the opportunity to provide clearer theoretical conclusions from our data. We elaborate on each of the points raised by the reviewer in the discussion section.

      Concerning the direction of information flow in the reorganized visual system in blindness, we focus on information arrival to EVC and information flow beyond EVC.

      p. 11, ll. 376-386, Discussion 4.1:

      “Overall, identifying braille letter representations in widespread brain areas raises the question of how information flow is organized in the visually deprived brain. Functional connectivity studies report deprivation-driven changes of thalamo-cortical connections which could explain both arrival of information to and further flow of information beyond EVC. First, the coexistence of early thalamic connections to both S1 and V1 (Müller et al., 2019) would enable EVC to receive from different sources and at different timepoints. Second, potentially overlapping connections from both sensory cortices to other visual or parietal areas (Ioannides et al., 2013) could enable the visually deprived brain to process information in a widespread and interconnected array of brain areas. In such a network architecture, several brain areas receive and forward information at the same time. In contrast to information discretely traveling from one processing unit to the next in the sighted brain’s processing cascade, we can rather picture information flowing in a spatially and functionally more distributed and overlapping fashion.”

      Regarding the role of VWFA, we propose that the functional organization of VWFA is modality-independent.

      p. 10, ll. 346-348, Discussion 4.1:

      “Second, we found that VWFA contains perceptual but not sensory braille letter representations. By clarifying the representational format of language representations in VWFA, our results support previous findings of the VWFA being functionally selective for letter and word stimuli in the visually deprived brain (Reich et al., 2011; Striem-Amit et al., 2012; Liu et al., 2023). Together, these findings suggest that the functional organization of the VWFA is modality-independent (Reich et al., 2011), depicting an important contribution to the ongoing debate on how visual experience shapes representations along the ventral stream (Bedny et al., 2021).”

      Lastly, we would like to share our thoughts about carrying out a parallel experiment in sighted people.

      In general, we agree that it seems insightful to conduct a parallel, analogous experiment in sighted participants with the aim to disentangle whether the effects seen in blind participants are due to multisensory recruitment or reorganization. However, before making predictions regarding the outcome, we would have to define an analogous experiment in sighted participants that taps into the same mechanisms. This, however, is difficult to do as it is unclear what counts as analogous. For example, if we compare braille reading to reading visually presented braille dot arrays or Roman letters, we will assess visual object processing, a different mechanism from that involved in braille reading. Alternatively, if we compare braille reading to sighted participants reading embossed Roman letters haptically or ideally even reading Braille after extensive training, we still face the inherent problem that sighted participants have visual experiences and could use visual imagery strategies in these nonvisual tasks. As we cannot experimentally ensure that sighted participants do not use visual strategies to solve a task, this would always complicate drawing conclusions about the underlying processes. More specifically, we could never pinpoint whether differences between sighted and blind participants are due to measuring different mechanisms or measuring the same mechanism and unravelling underlying changes (i.e., multisensory recruitment or reorganization). Finally, apart from potential confounds due to visual imagery, considering populations of sighted readers and Braille readers as only differing with regard to their input modality and otherwise being comparable is problematic: In general, blind populations are more heterogenous than most typical samples due to various factors such as aetiologies, onset and severity (Merabet & Pascual-Leone, 2010). 
      Even when carrying out studies in highly specific population subsamples, such as in congenitally blind braille readers, vast within-group differences remain, e.g., the quality and quantity of their braille education, as well as across braille and print readers, e.g., different passive exposure to braille versus written letters during childhood (Englebretson et al., 2023). Hence, to fully match the groups in terms of learning experience we would, for example, have to teach sighted infants braille reading in childhood and follow them up until a comparable age. This approach does not seem feasible.

      p. 10, ll. 328-341, Discussion 4.1:

      “We note that our findings contribute additional evidence but cannot conclusively distinguish between the competing hypotheses that visually deprived brains dynamically adjust to the environmental constraints versus that they undergo a profound cortical reorganization. Resolving this debate would require an analogous experiment in sighted people which taps into the same mechanisms as the present study. Defining a suitable control experiment is, however, difficult. Any other type of reading would likely tap into different mechanism than braille reading. Further, whenever sighted participants are asked to perform a haptic reading task, outcomes can be confounded by visual imagery driving visual cortex (Dijkstra et al., 2019). Thus, the results would remain ambiguous as to whether observed differences between the groups index different mechanisms or plastic changes in the same mechanisms. Last, matching groups of sighted readers and braille readers such that they only differ with regard to their input modality seems practically unfeasible: There are vast differences within the blind population in general, e.g., aetiologies, onset and severity, and the subsample of congenitally blind braille readers more specifically, e.g., the quality and quantity of their braille education, as well as across braille and print readers, e.g., different passive exposure to braille versus written letters during childhood (Englebretson et al., 2023; Merabet & Pascual-Leone, 2010).”

      While we appreciate that the conclusions we can draw from our results are limited by our sample and defining an appropriate parallel experiment in sighted participants is difficult for the reasons discussed above, we would still like to share our speculations regarding the process underlying our result pattern. We think that our results, taken together with results of previous studies, suggest that EVC does not undergo fundamental reorganization in the case of visual deprivation. Rather, it can flexibly adjust to given processing requirements. This flexibility is not infinite; adjustments are limited by the area’s architectural and computational capacity. Importantly, we think that this claim refers to an unmasking of preexisting potential rather than multisensory recruitment.

      To aid in drawing even more concrete conclusions about the flow of information, I suggest that the authors also add at least another early visual ROI to plot more clearly whether EVC's response to braille letters arrives there through an inverted cortical hierarchy, intermediate stages from VWFA, or directly, as found in the sighted brain for spoken language. 

      We thank the reviewer for this comment. However, EVC here consists of V1 to V3, and we already also assess V4, LOC, VWFA and LFA. Thus, we assess regions at all levels of processing, from low- over mid- to high-level, and cannot add a further interim ROI. Our results using this ROI set do not allow us to arbitrate between the hypotheses raised by the reviewer.

      Similarly, it may be informative to look specifically at the occipital electrodes' time differences between decoding for the different parameters and their correlation to behavior.

      We thank the reviewer for this suggestion. However, the spatial resolution of EEG measurements is limited, and we cannot convincingly determine the neural source of signals recorded from specific electrodes, e.g., occipital ones. When we reduce the number of electrodes before analysis, we primarily see comparable qualitative trends in the data, albeit with a reduction in signal-to-noise ratio.

      To illustrate, we repeated the EEG time decoding and the EEG-behavior RSA with only occipital and parieto-occipital electrodes (n=8) instead of all electrodes (n=63) and added the results to the Supplementary Material (see Supplementary Figure 3 and 4). Overall, we observe a reduction in signal-to-noise-ratio. This is not surprising given that the EEG searchlight decoding results (Figure 3b) reveal sources of the decoding signals extend beyond occipital and parieto-occipital electrodes. 

      In the EEG time decoding analysis, we see a comparable trend to the whole brain EEG analysis but do not find a significant difference in onsets of sensory and perceptual representation. 

      In the behavior-EEG RSA, we do find that the correlations between behavior and sensory representations emerge significantly earlier than correlations between behavior and perceptual representations (N = 11, 1,000 bootstraps, one-tailed bootstrap test against zero, P < 0.001). This result is in line with the whole brain EEG analysis.

      Regarding the methods, further detail on the ability to read with both hands equally and any residual vision of the participants would be helpful.

      We thank the reviewer for raising this point. We assessed participants’ letter reading capabilities in a short screening task prior to the experiment. Participants read letters with both hands separately and we used the same presentation time as in the experiment. The results showed that average performance for recognizing letters with the left hand (89%) and the right hand (88%) was comparable. We did not measure continuous reading in the present study, and we did not assess further information about participants’ ability to read equally well with both hands.

      While the information about the screening task was previously included in Methods section 5.3.2 EEG experiment, we have now moved it into a separate section 5.3.3 Braille screening task to make the information more accessible.

      p. 14, ll. 529-533, Methods 5.3.3:

      “Prior to the experiment, participants completed a short screening task during which each letter of the alphabet was presented for 500ms to each hand in random order. Participants were asked to verbally report the letter they had perceived to assess their reading capabilities with both hands using the same presentation time as in the experiment. The average performance for the left hand was 89% correct (SD = 10) and for the right hand it was 88% correct (SD = 13).”

      We thank the reviewer for the suggestion to include information regarding participants’ residual vision. We have now added information about participants’ residual light perception to Supplementary Table 1.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) ROI vs Searchlight Results: Figures 2 b and c do not seem to match. The ROI results (b) should be somehow consistent with the whole brain results (c), but "perceptual" decoding in the searchlight (in green) seems localized in sensorimotor areas while for the same classification, no sensorimotor ROI is significant. Can the authors clarify this difference?

      Similarly, perceptual decoding does not emerge in EVC with the searchlight analysis, whereas it is quite strong in the ROI analysis.

      We agree that the results of the ROI and searchlight decoding do not show a direct match. We think that this difference is due to methodological reasons. For example, ROI decoding can be more sensitive when ROIs follow functionally relevant boundaries in the brain, in comparison to spheres used in searchlight decoding that do not. In turn, searchlight decoding may be more sensitive when information is distributed across functional boundaries that would be captured in different ROIs rather than combined, or when ROI definition is difficult (such as here in the visual system of blind participants).

      However, we point out that the primary goal of our searchlight decoding was to show that no other areas beyond our hypothesized ROIs contained braille letter representations, rather than reproducing the ROI results.

      Decoding accuracies are tested against chance (50% for pairwise classifications) according to methods. In the case of "sensory and perceptual" and "perceptual" classification, this is straightforward. In the case of the analysis that isolates "sensory" representations, though, the difference is computed between "sensory and perceptual" and "perceptual" decoding accuracies; the accuracies resulting from this difference should thus be centered around 0.

      Are the accuracies tested against 0 in this case? This is not specified in the methods. Furthermore, the data reported in Figure 2 and Figure 3 seem to have 0% as a baseline and the label states "decoding accuracy". Can the authors clarify whether the reported data are the difference in accuracy with an estimated empirical baseline or an expected baseline of 50%?

      The reviewer is correct in stating that we tested “sensory and perceptual” and “perceptual” against chance level and the difference score “sensory” against 0 and that this information was missing in the methods section.

      We now specify in the methods that we are testing the accuracies for the “sensory” analysis against 0.

      p. 16, ll. 625-627, Methods 5.6:

      “We conducted subject-specific braille letter classification in two ways. First, we classified between letter pairs presented to one reading hand, i.e., we trained and tested a classifier on brain data recorded during the presentation of braille stimuli to the same hand (either the right or the left hand). This yields a measure of hand-dependent braille letter information in neural measurements. We refer to this analysis as within-hand classification. Second, we classified between letter pairs presented to different hands in that we trained a classifier on brain data recorded during the presentation of stimuli to one hand (e.g., right), and tested it on data related to the other hand (e.g., left). This yields a measure of hand-independent braille letter information in neural measurements. We refer to this analysis as across-hand classification. We tested both within-hand and across-hand pairwise classification accuracies against a chance level of 50%. We also calculated a within-across hand classification score which we compared against 0.”
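
      As a purely illustrative sketch (not the analysis code used in the study), the logic of the within-hand, across-hand and difference scores can be shown on a toy dataset. The nearest-centroid classifier, the three-feature patterns and all names below are assumptions made for the example; the manuscript excerpt above does not specify the classifier, and real analyses would use cross-validated neural data.

```python
# Toy illustration of within-hand vs. across-hand pairwise classification.
# All data, names, and the nearest-centroid classifier are hypothetical.

def centroid(patterns):
    """Feature-wise mean of a list of equally long pattern vectors."""
    return [sum(column) / len(patterns) for column in zip(*patterns)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise_accuracy(train, test):
    """Train a nearest-centroid classifier on `train` and score it on `test`.

    Both arguments map a letter label to a list of pattern vectors.
    """
    centroids = {letter: centroid(patterns) for letter, patterns in train.items()}
    n_correct = n_total = 0
    for true_letter, patterns in test.items():
        for pattern in patterns:
            predicted = min(
                centroids, key=lambda name: squared_distance(pattern, centroids[name])
            )
            n_correct += predicted == true_letter
            n_total += 1
    return n_correct / n_total

def make_trials(letter_sign, hand):
    """Feature 0 codes the letter independent of hand; features 1 and 2 code it per hand."""
    def pattern(shared_code):
        hand_code = 2 * letter_sign
        return (shared_code, hand_code, 0) if hand == "right" else (shared_code, 0, hand_code)
    train = [pattern(letter_sign)] * 2
    # One of four test trials carries a flipped hand-independent code ("noisy" trial).
    test = [pattern(letter_sign)] * 3 + [pattern(-letter_sign)]
    return train, test

LETTER_SIGNS = {"a": +1, "b": -1}
data = {}
for hand in ("left", "right"):
    data[hand] = {"train": {}, "test": {}}
    for letter, sign in LETTER_SIGNS.items():
        data[hand]["train"][letter], data[hand]["test"][letter] = make_trials(sign, hand)

CHANCE = 0.5  # expected chance level for pairwise classification
within_hand = (pairwise_accuracy(data["left"]["train"], data["left"]["test"])
               + pairwise_accuracy(data["right"]["train"], data["right"]["test"])) / 2
across_hand = (pairwise_accuracy(data["left"]["train"], data["right"]["test"])
               + pairwise_accuracy(data["right"]["train"], data["left"]["test"])) / 2
sensory_score = within_hand - across_hand  # difference score, compared against 0

print(within_hand - CHANCE, across_hand - CHANCE, sensory_score)
```

      In this toy case the within-hand accuracy is 1.0, the across-hand accuracy is 0.75 and the difference score is 0.25. The first two are naturally referenced to the 50% chance level, whereas the difference score is referenced to 0, which is the distinction discussed in this exchange.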

      Regarding Figures 2 and 3, we plot the results as decoding accuracies minus chance level to standardize the y-axes for all three analyses, i.e., compare them to 0. We have corrected the y-axis labels accordingly. 

      In our analyses, we assumed an expected baseline of 50%. But in the response below we provide evidence that our results remain stable whether using an expected or empirical baseline.

      If my understanding is correct, a potential problem persists. The different analyses may not be comparable, because in the "sensory" analysis the baseline is empirically defined, being the classification accuracies of the "perceptual" decoding, while in the other two analyses, the baseline is set at 50%. There are suggestions in the literature to derive empirically defined baselines by randomly shuffling the trial labels and repeating the classification [Grootswagers 2017]. In the context of the present work, its use will make the different statistical analyses more comparable. I would thus suggest the authors define the baseline empirically for all their analyses or, given the high computational demand of this analysis, provide evidence that the results are not affected by this difference in the baseline.

      We thank the reviewer for raising this point. As the reviewer correctly stated, the “sensory” analysis has an empirically defined baseline because it is a difference score while in the other two analyses the baseline is set at 50%.

      To provide evidence that our results are not affected by this difference in baseline, we re-ran the EEG time decoding. We derived null distributions from the empirical data for all three analyses, following the guidelines from Grootswagers 2017 (page 688, section “Evaluation of Classifier Performance and Group Level Statistical Testing”):

      “Another popular alternative is the permutation test, which entails repeatedly shuffling the data and recomputing classifier performance on the shuffled data to obtain a null distribution, which is then compared against observed classifier performance on the original set to assess statistical significance (see, e.g., Kaiser et al., 2016; Cichy et al., 2014; Isik et al., 2014). Permutation tests are especially useful when no assumptions about the null distribution can be made (e.g., in the case of biased classifiers or unbalanced data), but they take much longer to run (e.g., repeating the analysis 10,000 times).”

      Running a sign permutation test with 10,000 repetitions, we show that the results are comparable to the previously reported results based on one-sided Wilcoxon signed rank tests. We are, therefore, confident that our reported results are not affected by this difference in baseline. We have now added this control analysis to the results section and supplementary material (see Supplementary Figure 5).
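
      For readers unfamiliar with the procedure, a one-sided sign permutation test at the group level can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Supplementary Figure 5: the per-participant values are hypothetical, the function name is invented, and the `+ 1` correction in the p-value is one common convention that may differ from the implementation in the study.

```python
import random

def sign_permutation_p(values, n_permutations=10_000, seed=0):
    """One-sided sign permutation test of the group mean against zero.

    `values` holds one number per participant, e.g. decoding accuracy minus
    chance, or a within-minus-across difference score (hypothetical inputs).
    """
    rng = random.Random(seed)
    n = len(values)
    observed = sum(values) / n
    n_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis each participant's sign is exchangeable.
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if sum(flipped) / n >= observed:
            n_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (n_as_extreme + 1) / (n_permutations + 1)

# Hypothetical accuracies minus chance (percentage points) for 11 participants.
effect = [4.0, 2.5, 6.1, 3.3, 1.2, 5.0, 2.8, 0.9, 3.7, 4.4, 2.0]
p_effect = sign_permutation_p(effect)

# Values that are symmetric around zero should not reach significance.
p_null = sign_permutation_p([1.0, -1.0, 2.0, -2.0, 0.5, -0.5])
print(p_effect, p_null)
```

      Because the null distribution is built from the data themselves, the same test applies whether the values are accuracies relative to 50% or difference scores relative to 0. A full time-resolved analysis would repeat this at every time point and would additionally need to handle multiple comparisons, which this sketch leaves out.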

      p. 7-8, ll. 213-215, Results 3.2: 

      “Importantly, the temporal dynamics of sensory and perceptual representations differed significantly. Compared to sensory representations, the significance onset of perceptual representations was delayed by 107ms (21-167ms) (N = 11, 1,000 bootstraps, one-tailed bootstrap test against zero, P= 0.012). This results pattern was consistent when defining the analysis baseline empirically (see Supplementary Figure 5).”
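
      The kind of bootstrap statistic reported in this passage can be illustrated with a simplified sketch. It assumes that one onset-latency difference per participant is available, which is a simplification: onset latencies are often estimated on group-level time courses within each bootstrap sample, and the excerpt does not specify the exact procedure. The values, names and the add-one correction are hypothetical.

```python
import random

def bootstrap_mean(values, n_bootstraps=1_000, seed=0):
    """Percentile bootstrap of the mean, with a one-tailed test against zero."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_bootstraps):
        # Resample participants with replacement.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    ci_low = means[int(0.025 * n_bootstraps)]
    ci_high = means[int(0.975 * n_bootstraps) - 1]
    # Share of bootstrap means at or below zero (add-one correction is an assumption).
    p_one_tailed = (sum(m <= 0 for m in means) + 1) / (n_bootstraps + 1)
    return ci_low, ci_high, p_one_tailed

# Hypothetical onset differences in ms (perceptual minus sensory), 11 participants.
onset_differences_ms = [120, 95, 60, 140, 30, 110, 150, 80, 100, 45, 130]
ci_low, ci_high, p_value = bootstrap_mean(onset_differences_ms)
print(ci_low, ci_high, p_value)
```

      A confidence interval that excludes zero, together with a small one-tailed p-value, is what would support the statement that one onset is reliably later than the other.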

      (2) According to the authors, perceptual rather than sensory braille letter representations identified in space are suitably formatted to guide behavior. However, they acknowledge that this finding is likely to be task-dependent because it is based on subject similarity ratings.

      Maybe they could use a more objective measure of Braille letter similarity?

      For instance, they can compare letters using Jaccard similarity (See for instance: Bottini et al. 2022). 

      We thank the reviewer for the opportunity to clarify. We acknowledge that our findings regarding the behavioral relevance of the identified neural representations are task-dependent. But, importantly, this is not because we use perceived similarity ratings as a measurement, but because we only use one measurement while there are infinitely many other potential tasks to assess behavior. This means that the same limitation holds when using another similarity measure like Jaccard similarity. We now clarify this in the Discussion section: 

      p. 12, ll. 419-420, Discussion 4.3:

      “Our results clarified that perceptual rather than sensory braille letter representations identified in space are suitably formatted to guide behavior. However, we only use one specific task to assess behavior and, therefore, acknowledge that this finding is task-dependent.”

      Nevertheless, we calculated Jaccard similarity based on the definition used in Bottini et al. There are no significant correlations for the EEG-behavior or fMRI-behavior RSA when we use the Jaccard matrix and subject-specific EEG or fMRI RDMs (see Supplementary Figure 6).

      This demonstrates that braille letter similarity ratings are significantly correlated with neural representations in space and time but Jaccard similarity of braille dot overlaps is not. 
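
      To make the stimulus-based measure concrete, the following sketch computes Jaccard similarity from braille dot overlaps using the standard set definition, |A ∩ B| / |A ∪ B|. Whether this matches the exact definition in Bottini et al. is an assumption, and the five letters are an arbitrary illustrative subset rather than the stimulus set of the study; only the dot patterns (standard six-dot braille numbering) are factual.

```python
from itertools import combinations

# Raised dots per letter in standard six-dot braille numbering (1-3 left, 4-6 right).
BRAILLE_DOTS = {
    "a": {1},
    "b": {1, 2},
    "c": {1, 4},
    "d": {1, 4, 5},
    "e": {1, 5},
}

def jaccard_similarity(dots_a, dots_b):
    """Shared raised dots divided by all raised dots in either letter."""
    return len(dots_a & dots_b) / len(dots_a | dots_b)

# Dissimilarity (1 - similarity) for every letter pair, i.e. the lower
# triangle of a model RDM that can be compared with neural RDMs.
jaccard_rdm = {
    (x, y): 1 - jaccard_similarity(BRAILLE_DOTS[x], BRAILLE_DOTS[y])
    for x, y in combinations(sorted(BRAILLE_DOTS), 2)
}

for pair, dissimilarity in jaccard_rdm.items():
    print(pair, round(dissimilarity, 3))
```

      For example, "a" and "b" share one of two raised dots, giving a similarity of 0.5. Such a matrix depends only on the physical dot layout, unlike the similarity ratings, which reflect participants' judgments.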

      (3) If the primacy of perceptual similarity holds also with more objective measures of letter similarity, I think the authors should spend a few more words characterizing the results in fMRI and EEG that are rather divergent (concerning this analysis). Indeed, EEG analysis shows a significant correlation between similarity ratings and within-hand classification accuracy, although this correlation does not emerge in the "sensory" ROIs. I think these findings can be put together, hypothesizing that sensory-based similarity correlates with behavior but only in perceptual ROIs. However, why so? Can the authors provide a more mechanistic explanation? Am I missing something? 

      We thank the reviewer for this intriguing idea. We now speculate about how we could harmonize the results from the behavior-EEG and behavior-fMRI RSAs in the discussion section. 

      p. 12, ll. 438-442, Discussion 4.3:

      “Similarity ratings and sensory representations as captured by EEG are correlated, and so are similarity ratings and representations in perceptual ROIs, but not sensory ROIs. This might be interpreted as suggesting a link between the sensory representations captured in EEG and the representations in perceptual ROIs. However, we do not have any evidence towards this idea. Differing signal-to-noise ratios for the different ROIs and sensory versus perceptual analysis could be an alternative explanation.”

      (4) In the methods they state that EEG decoding is tested against chance at each time point but these results are not reported, only latency analysis is reported. Can the authors report the significant time points of the EEG time series decoding?  

      We thank the reviewer for catching this inconsistency! We have now added this information to Figure 3a.

      (5) In fMRI ROI definition procedure, the top 321 voxels of each anatomical ROI that had the highest functional activation were selected. The number of voxels is based on the smaller ROI, which to my understanding means that for this ROI all the voxels were selected potentially introducing noise and impacting the comparison between ROIs. Can the authors clarify which ROI was the smallest? 

      Thank you for the question! The smallest ROI was V4. This indeed means that for this ROI all voxels were selected. This could have led to our results being noisy in V4 but should not influence the results in other ROIs. We have now added this information to the methods section.

      p. 15, ll. 592, Methods 5.4.4:

      “The smallest mask was V4 which included 321 voxels.”
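
      The selection rule described here (keep, in every ROI, as many of the most activated voxels as the smallest mask contains) can be sketched as follows. The ROI names are taken from the text, but the activation values and the function name are hypothetical, and the toy masks are far smaller than the 321 voxels reported.

```python
def select_top_voxels(activation_by_roi):
    """Keep the n most activated voxels per ROI, with n set by the smallest ROI.

    `activation_by_roi` maps an ROI name to a list of per-voxel activation
    values (hypothetical localizer statistics). Returns n and, per ROI, the
    sorted indices of the retained voxels.
    """
    n_keep = min(len(values) for values in activation_by_roi.values())
    selected = {}
    for roi, values in activation_by_roi.items():
        ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        selected[roi] = sorted(ranked[:n_keep])
    return n_keep, selected

toy_activations = {
    "V4": [0.2, 1.5, 0.7],              # smallest mask: every voxel is kept
    "EVC": [0.1, 2.0, 0.4, 1.1, 0.9],   # larger mask: only the top voxels are kept
}
n_keep, selected = select_top_voxels(toy_activations)
print(n_keep, selected)
```

      The sketch makes the reviewer's point visible: in the smallest ROI no voxel is excluded, so weakly activated voxels remain, whereas larger ROIs retain only their most responsive voxels.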

      (6) Finally, the authors suggest that: "Importantly, higher-level computations are not limited to the EVC in visually deprived brains. Natural sound representations 41 and language activations 53 are also located in EVC of sighted participants. This suggests that EVC, in general, has the capacity to process higher-level information 54. Thus, EVC in the visually deprived brain might not be undergoing fundamental changes in brain organization 53. This promotes a view of brain plasticity in which the cortex is capable of dynamic adjustments within pre-existing computational capacity limits 4,53-55." - The presence of a sighted control group would have strengthened this claim.

      We agree with the reviewer and now discuss the limitations of our approach in the discussion section (see response to weaknesses raised by Reviewer 2 in the Public Review above).

      Reviewer #2 (Recommendations For The Authors): 

      (1) Can the authors comment on the reaction time of the two reading hands? Completely ambidextrous reading is not necessarily common, so any differences in ability or response time across the hands may affect the EEG results. Alternatively, do the authors have any additional behavioral data about the participants' ability to read well with both hands? 

      We thank the reviewer for these questions! We did not assess reaction times and acknowledge this as a limitation. We did, however, measure accuracies and would have expected to see a speed-accuracy trade-off if reaction times differed between hands, i.e., we would have expected lower accuracy for the hand with higher RTs. But this was not the case: our participants had comparable accuracy values when reading letters with both hands (see methods section 5.3.3 and answer to Public Review above). This measure indicated that participants recognized Braille letters presented for 500ms equally well with both index fingers.

      (2) Please add information about any residual sight in the blind participants (or are they all without light perception?)

      We have now added information about residual light perception in Supplementary Table 1 (see above in response to Public Review).

      (3) Is active tactile exploration involved, or are the participants not moving their fingers at all over the piezo-actuators? Can the authors elaborate more on how the participants used this passive input?

      We thank the reviewer for the opportunity to clarify. Our experimental setup does not involve tactile exploration or sliding motions. Instead, participants rest their index fingers on the piezo-actuators and feel the static sensation of dots pushing up against their fingertips. We assume that participants used the passive input of specific dot stimulation location on fingers to perceive a dot array which, in turn, led to the percept of a braille letter.

      We now specify this information in the methods section.

      p. 13, ll. 474-475, Methods 5.2:

      “The modules were taped to the clothes of a participant for the fMRI experiment and on the table for the EEG and behavioral experiment. This way, participants could read in a comfortable position with their index fingers resting on the braille cells to avoid motion confounds. Importantly, our experimental setup did not involve tactile exploration or sliding motions. We instructed participants to read letters regardless of whether the pins passively stimulated their immobile right or left index finger.”

      (4) I appreciated the RSA analysis, but remain curious about what the ratings were based on.

      Do the authors know what parameters participants used to rate for? Were these consistent across participants? That would aid in interpreting the results.

      We thank the reviewer for the interest in our representational similarity analyses linking the neural representations to behavior. 

      We do not know which parameters participants explicitly used to rate the similarity between letters. We instructed participants to freely compare the similarity of pairs of braille letters without specifying which parameters they should use for the similarity assessment. We speculate that participants used a mixture of low-level features such as stimulation location on fingers and higher-level features such as linguistic similarity between letters. We now clarify the free comparison of braille letter pairs in the methods section:

      p. 14, ll. 538-539, Methods 5.3.4:

      “Each pair of letters was presented once, and participants compared them with the same finger. We instructed participants to freely compare the similarity of pairs of Braille letters without specifying which parameters they should use for the similarity assessment. The rating was without time constraints, meaning participants decided when they rated the stimuli. Participants were asked to verbally rate the similarity of each pair of braille letters on a scale from 1 = very similar to 7 = very different and the experimenter noted down their responses.”
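
      One common way to relate such pairwise ratings to neural dissimilarities is a rank correlation across letter pairs. The sketch below assumes a Spearman correlation and uses made-up numbers; the response does not state which correlation measure was used, and the function names are illustrative only.

```python
def rank(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical data: one value per letter pair.
behavioral_ratings = [1, 4, 7, 3, 6]            # 1 = very similar, 7 = very different
neural_dissimilarity = [0.52, 0.61, 0.80, 0.55, 0.70]
print(spearman(behavioral_ratings, neural_dissimilarity))
```

      A rank-based measure is a natural fit here because the 1 to 7 rating scale is ordinal.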

      (5) Can the authors provide confusion matrices for the decoding analyses in the supplementary materials? This could be informative in understanding what pairs of letters are most discernable and where. 

      We have added confusion matrices for within- and between-hand decoding for all ROIs and for the time points 100ms, 200ms, 300ms and 400ms to the Supplementary Material (see Supplementary Figures 7-10).

      (6) Was slice time correction done for the fMRI data? This is not reported. 

      We have now added this information to the methods section: our fMRI preprocessing pipeline did not include slice timing correction.

      p. 14, ll. 554, Methods 5.4.2:

      “We did not apply high or low-pass temporal filters and did not perform slice time correction.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #3 (Public review):

      Summary:

      Juan Liu et al. investigated the interplay between habitat fragmentation and climate-driven thermophilization in birds in an island system in China. They used extensive bird monitoring data (9 surveys per year per island) across 36 islands of varying size and isolation from the mainland covering 10 years. The authors use extensive modeling frameworks to test a general increase of the occurrence and abundance of warm-dwelling species and vice versa for cold-dwelling species using the widely used Community Temperature Index (CTI), as well as the relationship between island fragmentation in terms of island area and isolation from the mainland on extinction and colonization rates of cold- and warm-adapted species. They found that indeed there was thermophilization happening during the last 10 years, which was more pronounced for the CTI based on abundances and less clear for the occurrence-based metric. Generally, the authors show that this is driven by an increased colonization rate of warm-dwelling and an increased extinction rate of cold-dwelling species. Interestingly, they unravel some of the mechanisms behind this dynamic by showing that warm-adapted species increased while cold-dwelling decreased more strongly on smaller islands, which is - according to the authors - due to lowered thermal buffering on smaller islands (which was supported by air temperature monitoring done during the study period on small and large islands). They argue that the increased extinction rate of cold-adapted species could also be due to lowered habitat heterogeneity on smaller islands. With regard to island isolation, they also show that both thermophilization processes (increase of warm and decrease of cold-adapted species) were stronger on islands closer to the mainland, due to closer sources to species populations of either group on the mainland as compared to limited dispersal (i.e. range shift potential) in more isolated islands.

      The conclusions drawn in this study are sound, and mostly well supported by the results. Only few aspects leave open questions and could quite likely be further supported by the authors themselves thanks to their apparent extensive understanding of the study system.
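
      The two CTI variants mentioned in the summary differ only in how species are weighted. The sketch below uses the usual definition of CTI as a community-level average of species temperature indices (STI); the species names, STI values, and counts are hypothetical, and the authors' exact computation may differ.

```python
def cti_abundance(abundance, sti):
    """Abundance-weighted CTI: mean STI weighted by individual counts."""
    total = sum(abundance.values())
    return sum(count * sti[species] for species, count in abundance.items()) / total


def cti_occurrence(abundance, sti):
    """Occurrence-based CTI: unweighted mean STI of the species present."""
    present = [species for species, count in abundance.items() if count > 0]
    return sum(sti[species] for species in present) / len(present)


# Hypothetical island community (counts) and species temperature indices (deg C).
sti = {"warm_species": 20.0, "cold_species": 10.0}
year_1 = {"warm_species": 2, "cold_species": 2}
year_10 = {"warm_species": 3, "cold_species": 1}

for label, community in (("year 1", year_1), ("year 10", year_10)):
    print(label, cti_abundance(community, sti), cti_occurrence(community, sti))
```

      In this toy case the abundance-weighted index rises while the occurrence-based index stays flat, because no species is gained or lost. This illustrates how a trend can be clearer in the abundance-based metric.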

      Strengths:

      The study questions and hypotheses are very well aligned with the methods used, ranging from field surveys to extensive modeling frameworks, as well as with the conclusions drawn from the results. The study addresses a complex question on the interplay between habitat fragmentation and climate-driven thermophilization which can naturally be affected by a multitude of additional factors than the ones included here. Nevertheless, the authors use a well balanced method of simplifying this to the most important factors in question (CTI change, extinction, colonization, together with habitat fragmentation metrics of isolation and island area). The interpretation of the results presents interesting mechanisms without being too bold on their findings and by providing important links to the existing literature as well as to additional data and analyses presented in the appendix.

      Weaknesses:

      The metric of island isolation based on distance to the mainland seems somewhat oversimplified, as in real life the study system rather represents an island network where the islands of different sizes are at varying distances to each other, such that smaller islands can potentially draw from the species pools of nearby larger islands too, rather than just from the mainland. Although the authors do explain the reason for this metric, backed up by earlier research, a network approach could be worthwhile exploring in future research done in this system. The fact that the authors did find a signal of island isolation does support their method, but the variation in responses to this metric could hint at a more complex pattern going on in real life than was assumed for this study.

      Thank you again for this suggestion. Building on the previous revision, we expanded our discussion of the importance of incorporating the island network in future research. The paragraph is now on Lines 294-304:

      “As a caveat, we only consider the distance to the nearest mainland as a measure of fragmentation, consistent with previous work in this system (Si et al., 2014), but we acknowledge that other distance-based metrics of isolation that incorporate inter-island connections and island size could hint on a more complex pattern going on in real-life than was assumed for this study, thus reveal additional insights on fragmentation effects. For instance, smaller islands may also potentially utilize species pools from nearby larger islands, rather than being limited solely to those from the mainland. The spatial arrangement of islands, like the arrangement of habitat, can influence niche tracking of species (Fourcade et al., 2021). Future studies should use a network approach to take these metrics into account to thoroughly understand the influence of isolation and spatial arrangement of patches in mediating the effect of climate warming on species.”

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      Great job on the revision! The new version reads well and in my opinion all comments were addressed appropriately. A few additional comments are as follows:

      Thank you very much for your further review and recognition. We have carefully modified the manuscript according to all recommendations.

      (1) L 62: replace shifts with process

      Done. We also added the word “transforming” to match this revision. The new sentence is now on Lines 61-63:

      “Habitat fragmentation, usually defined as the process of transforming continuous habitat into spatially isolated and small patches”

      (2) L 363: Your metric for habitat fragmentation is isolation and habitat area and I think this could be introduced already in the introduction, where you somewhat define fragmentation (although it could be clearer still). You could also discuss this in the discussion more, that other measures of fragmentation may be interesting to look at.

      Thank you for this suggestion. We now introduce the metrics of habitat fragmentation in the Introduction, directly after habitat fragmentation is defined. The sentence is now on Lines 64-66:

      “Among the various ways in which habitat fragmentation is conceptualized and measured, patch area and isolation are two of the most used measures (Fahrig, 2003).”

      (3) L 384: replace for with because of

      Done.

      (4) L 388: "Following this filtering, 60 ...."

      Done.

      (5) Figure 1: In panels b-d you use different terms (fragmented, small, isolated) but aiming to describe the same thing. I would highly recommend to either use fragmented islands or isolated islands for all panels. Although I see that in your study fragmentation includes both, habitat loss and isolation. So make this clear in the figure caption too...

      Thank you very much for this suggestion. It’s important to maintain consistency in using “fragmentation”. We changed “fragmented, small, isolated” to “Fragmented patches” in the caption of panels b-d. The modified caption is now on Line 771:

      (6) L 783: replace background with habitat (or landscape) and exhibit with exemplify

      Done. The new sentence is now on Lines 782-784:

      “The three distinct patches signify a fragmented landscape and the community in the middle of the three patches was selected to exemplify colonization-extinction dynamics in fragmented habitats.”

      (7) One bigger thing is the definition of fragmentation in your study for which you used habitat area (from habitat loss process) and isolation. This could still be clarified a bit more, especially in the figures. In Fig. 1 the smaller panels b-d could all be titled fragmented islands as this is what the different terms describe in your study (small, isolated) and thus the figure would become even clearer. Otherwise I'm happy with the changes made.

      Thank you for raising this important question. Yes, “habitat fragmentation” in our research includes both habitat loss and fragmentation per se. We have clarified the caption of b-d in Figure 1 as suggested by Recommendation (5). We believe this can make it clearer to the readers.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Otero-Coronel et al. address an important question for neuroscience - how does a premotor neuron capable of directly controlling behavior integrate multiple sources of sensory inputs to inform action selection? For this, they focused on the teleost Mauthner cell, long known to be at the core of a fast escape circuit. What is particularly interesting in this work is the naturalistic approach they took. Classically, the M-cell was characterized, both behaviorally and physiologically, using an unimodal sensory space. Here the authors make the effort (substantial!) to study the physiology of the M-cell taking into account both the visual and auditory inputs. They performed well-informed electrophysiological approaches to decipher how the M-cell integrates the information of two sensory modalities depending on the strength and temporal relation between them.

      Strengths:

      The empirical results are convincing and well-supported. The manuscript is well-written and organized. The experimental approaches and the selection of stimulus parameters are clear and informed by the bibliography. The major finding is that multisensory integration increases the certainty of environmental information in an inherently noisy environment.

      Weaknesses:

      Even though the manuscript and figures are well organized, I found myself struggling to understand key points of the figures.

      For example, in Figure 1 it is not clear what are actually the Tonic and Phasic components. The figure will benefit from more details on this matter. Then, in Figure 4 the label for the traces in panel A is needed since I was not able to pick up that they were coming from different sensory pathways.

      We added an inset to Figure 1 showing how the tonic and phasic components are measured. We now use solid colors instead of transparencies, and the color scheme was modified for consistency. We added labels to the traces used as examples in Figure 4 panel A.

      In line 338 it should be optic tectum and not "optical tectum".

      We replaced two instances of the term “optical tectum” with “optic tectum”.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Otero-Coronel and colleagues use a combination of acoustic stimuli and electrical stimulation of the tectum to study MSI in the M-cells of adult goldfish. They first perform a necessary piece of groundwork in calibrating tectal stimulation for maximal M-cell MSI, and then characterize this MSI with slightly varying tectal and acoustic inputs. Next, they quantify the magnitude and timing of FFI that each type of input has on the M-cell, finding that both the tectum and the auditory system drive FFI, but that FFI decays more slowly for auditory signals. These are novel results that would be of interest to a broader sensory neuroscience community. By then providing pairs of stimuli separated by 50ms, they assess the ability of the first stimulus to suppress responses to the second, finding that acoustic stimuli strongly suppress subsequent acoustic responses in the M-cell, that they weakly suppress subsequent tectal stimulation, and that tectal stimulation does not appreciably inhibit subsequent stimuli of either type. Finally, they show that M-cell physiology mirrors previously reported behavioural data in which stronger stimuli underwent less integration.

      The manuscript is generally well-written and clear. The discussion of results is appropriately broad and open-ended. It's a good document. Our major concerns regarding the study's validity are captured in the individual comments below. In terms of impact, the most compelling new observation is the quantification of the FFI from the two sources and the logical extension of these FFI dynamics to M-cell physiology during MSI. It is also nice, but unsurprising, to see that the relationship between stimulus strength and MSI is similar for M-cell physiology to what has previously been shown for behavior. While we find the results interesting, we think that they will be of greatest interest to those specifically interested in M-cell physiology and function.

      Strengths:

      The methods applied are challenging and appropriate and appear to be well executed. Open questions about the physiological underpinnings of M-cell function are addressed using sound experimental design and methodology, and convincing results are provided that advance our understanding of how two streams of sensory information can interact to control behavior.

      Weaknesses:

      Our concerns about the manuscript are captured in the following specific comments, which we hope will provide a useful perspective for readers and actionable suggestions for the authors.

      Comment 1 (Minor):

      Line 124. Direct stimulation of the tectum to drive M-cell-projecting tectal neurons not only bypasses the retina, it also bypasses intra-tectal processing and inputs to the tectum from other sources (notably the thalamus). This is not an issue with the interpretation of the results, but this description gives the (false) impression that bypassing the retina is sufficient to prevent adaptation. Adding a sentence or two to accurately reflect the complexity of the upstream circuitry (beyond the retina) would be welcome.

      The reviewer is right that direct tectal stimulation bypasses all upstream neural processing, not only that produced in the retina, and that the tectum does not exclusively process visual information. The revised version now acknowledges the complexity of the system (lines 245-252, revised manuscript).

      Comment 2 (Major): The premise is that stimulation of the tectum is a proxy for a visual stimulus, but the tectum also carries the auditory, lateral line, and vestibular information. This seems like a confound in the interpretation of this preparation as a simple audio-visual paradigm. Minimally, this confound should be noted and addressed. The first heading of the Results should not refer to "visual tectal stimuli".

      We changed the heading of the corresponding section of the Results section as requested and also omitted the term “optic” when we did not specifically refer to tectal circuits that process optic information.  

      Comment 3 (Major): Figure 1 and associated text.

      It is unclear and not mentioned in the Methods section how phasic and tonic responses were calculated. It is clear from the example traces that there is a change in tonic responses and the accumulation of subthreshold responses. Depending on how tonic responses were calculated, perhaps the authors could overlay a low-passed filtered trace and/or show calculations based on the filtered trace at each tectal train duration.

      The revised version of the manuscript now includes a description of how the phasic and tonic components were calculated (lines 163-172). We also modified the color scheme and the inset of Figure 1A to clarify how these two components were defined. Since we quantified the response in a 12 ms window, we did not include an overlaid low-pass filtered trace, as it could cause confusion about the metric used.

      Comment 4 (Minor): Figure 3 and associated text.

      This is a lovely experiment. Although it is not written in text, it provides logic for the next experiment in choosing a 50ms time interval. It would be great if the authors calculated the first timepoint at which the percentage of shunting inhibition is not significantly different from zero. This would provide a convincing basis for picking 50ms for the next experiment. That said, I suspect that this time point would be earlier than 50 ms. This may explain and add further complexity to why the authors found mostly linear or sublinear integration, and perhaps the basis for future experiments to test different stimulus time intervals. Please move calculations to Methods.

      We moved calculations to the Methods section (lines 201-208). We mention the rationale for selecting the 50 ms interval in the next experiment (Figure 4, lines 369-371) and discuss in detail the potential contribution of FFI to the complexity of the integration taking place in the M-cell circuit (Discussion, lines 512-535).

      Comment 5 (Major): Figure 4C and lines 398-410.

      These are beautiful examples of M-cell firing, but the text suggests that they occurred rarely and nowhere close to significantly above events observed from single modalities. We do not see this as a valid result to report because there is insufficient evidence that the phenomenon shown is consistent or representative of your data.

      Our experimental conditions required anesthesia and paralysis, conditions designed to reduce neuronal firing and suppress motor output. We think it is valuable to report that we still see that simultaneous presentation of subthreshold unisensory stimuli can add up to become suprathreshold, paralleling behavioral observations. We do not claim that those examples are representative of our recording conditions, but they are likely to be more representative of the multisensory integration process taking place in freely moving fish. The revised manuscript adds context to these example traces to justify their inclusion (lines 420-426).

      Reviewer #2 (Recommendations For The Authors):

      Methods

      The Methods section on "Auditory stimuli" contains a long background on the biophysics of the M-cell and its inputs. This does not belong in Methods. The same is true, to a lesser degree, in the next heading. The argument that direct stimulation of the tectum is necessary to bypass adaptation should be in Results, not Methods.

      Following the reviewer's recommendation, we have moved both paragraphs to the Results section.

      Figure 1 and associated text.

      Visually, the use of transparency to differentiate phasic and tonic calculations is difficult to read. Example traces are also cut off at the top and bottom at random sizes.

      We changed the color scheme to avoid the use of transparency and modified the inset of Figure 1A to clarify how the phasic and tonic components were calculated. We also modified the dimensions of the clipping mask used to trim the stimulation artifacts of sample traces to make them more similar while still enabling clear observation of the phasic and tonic components of the response.

      Line 338 "optical tectum" is not correct. "optic tectum" is more common, or better still, just "tectum".

      We apologize for the error. The two instances of “optical tectum” were replaced by the correct term (“optic tectum”).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Comments:

      (1) We find it interesting that the reshaped model showed decreased firing rates of the projection neurons. We note that maximizing the entropy $\langle -\ln p(x) \rangle$ with a regularizing term $-\lambda \langle \sum_i f(x_i) \rangle$, which reflects the mean firing rate, results in $\lambda_i = \lambda$ for all $i$ in the Boltzmann distribution. In other words, in addition to the homeostatic effect of synaptic normalization which is shown in Figures 3B-D, setting all $\lambda_i = 1$ itself might have a homeostatic effect on the firing rates. It would be better if the contribution of these two homeostatic effects be separated. One suggestion is to verify the homeostatic effect of synaptic normalization by changing the value of $\lambda$.

      This is an interesting question, and we therefore explored the effects of different values of $\lambda$ on the performance of unconstrained reshaped RP models and their firing rates. The new supp. Figure 2B presents the results of this exploration: We found that for models with a small set of projections, a high value of $\lambda$ results in better performance than a low one, while for models with a large set of projections we find the opposite relation. The mean firing rates of the projection neurons for models with different values of $\lambda$ show a clear trend, where higher $\lambda$ values result in lower mean firing rates.

      Thus, these results suggest an interplay between the optimal size of the projection set and the value of $\lambda$ one should pick. For the population sizes and projection sets we have used here, $\lambda=1$ is a good choice, but, for different population sizes or data sets a different value of $\lambda$ might be better.

      In addition to supp. Figure 2B, we added the following to the main text:

      “An additional set of parameters that might affect the Reshaped RP models are the coefficients $\lambda$, that weigh each of the projections. Above, we used $\lambda=1$ for all projections, here we investigated the effect of the value of $\lambda$ on the performance of the Reshaped RP models (supp. Figure 2B). We find that for models with a small projection set, high $\lambda$ values result in better performance than models with low values. We find an opposite relation for models with large number projection sets. (We submit that the performance decrease of Reshaped RP models with high value of $\lambda$, as the number of projections grows, is a reflection of the non-convex nature of the Reshaped RP optimization problem).

      The mean firing rates of the projection neurons for models with different values of $\lambda$ show a clear trend, higher $\lambda$ values results in lower mean firing rates. Thus, we conclude that there is an interplay between the number of projections and the value of $\lambda$ we should pick. For the sizes of projection sets we have used here, $\lambda=1$ is a good choice, but, we note that in general, one should probably seek the appropriate value of $\lambda$ for different population sizes or data sets.”

      In addition, we explored the effect of synaptic normalization on models with different values of $\lambda$ (supp. Figure 3). We found that homeostatic Reshaped RP models are superior to the non-homeostatic Reshaped RP models: For low values of $\lambda$, the homeostatic and Reshaped RP models show similar performance in terms of log-likelihood, whereas the homeostatic models are more efficient. For high values of $\lambda_i$ homeostatic models are not only more efficient but also show better performance. These results indicate that the benefit of the homeostatic model is insensitive to the specific choice of $\lambda$.

      In addition to supp. Figure 3, we added the following to the main text:

      “Exploring the effect of synaptic normalization on models with different values of $\lambda$ (supp. Figure 3), we find that homeostatic Reshaped RP models are superior to the non-homeostatic Reshaped RP models: For low values of $\lambda$, the homeostatic and Reshaped RP models show similar performance in terms of log-likelihood, whereas the homeostatic models are more efficient. Importantly, for high values of $\lambda_i$ homeostatic models are not only more efficient but also show better performance. We conclude that the benefit of the homeostatic model is insensitive to the specific choice of $\lambda$.”
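
      The reviewer's observation in comment (1) can be written out as a short derivation. This is a sketch under assumed notation, not taken from the manuscript: $f_i(x)$ denotes the output of projection $i$ for population pattern $x$, $\mu$ is a Lagrange multiplier enforcing normalization, and $Z$ is the partition function. Maximizing the entropy with a single penalty on the summed projection activity,

      $$\mathcal{L}[p] = -\sum_x p(x) \ln p(x) - \lambda \sum_x p(x) \sum_i f_i(x) - \mu \Big( \sum_x p(x) - 1 \Big),$$

      and setting $\partial \mathcal{L} / \partial p(x) = 0$ gives

      $$-\ln p(x) - 1 - \lambda \sum_i f_i(x) - \mu = 0 \quad\Longrightarrow\quad p(x) = \frac{1}{Z} \exp\Big( -\lambda \sum_i f_i(x) \Big), \qquad Z = \sum_x \exp\Big( -\lambda \sum_i f_i(x) \Big).$$

      Every projection therefore enters with the same coefficient, $\lambda_i = \lambda$, and a larger $\lambda$ lowers the probability of patterns that activate many projections, in line with the lower mean firing rates reported above for higher $\lambda$.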

      (2) As far as we understand, $\theta_i$ (thresholds of the neurons) are fixed to 1 in the article. Optimizing the neural threshold as well as synaptic weights is a natural procedure (both biologically and engineeringly), and can easily be computed by a similar expression to that of $a_{ij}$ (equation 3). Do the results still hold when changing $\theta_i$ is allowed as well? For example,

      a. If $\theta_i$ becomes larger, the mean firing rates will decrease. Does the backprop model still have higher firing rates than the reshaped model when $\theta_i$ are also optimized?

      b. Changing $\theta_i$ affects the dynamic range of the projection neurons, thus could modify the effect of synaptic constraints. In particular, does it affect the performance of the bounded model (relative to the homeostatic input models)?

      Following the referee’s suggestion, we extended our analysis and added threshold optimization to the reshaped RP and backpropagation models, which is now shown in supp. Figure 2A. Comparing the performance and properties of these models to ones with fixed thresholds, we found that this addition had a small effect on the performance of the models in terms of their likelihood (supp. Figure 2A). We further find that backpropagation models with tuned thresholds show lower firing rates compared to backpropagation models with fixed thresholds, while reshaped RP models with optimized thresholds show higher firing rates compared to models with fixed thresholds. These differences are, again, rather small, and both versions of the reshaped RP models show lower firing rates compared to both versions of the backpropagation models.

      In addition to supp. Figure 2A, we added the following to the main text:

      “The projections' threshold $\theta_i$, which is analogous to the spiking threshold of the projection neurons, strongly affects the projections' firing rates. We asked how, in addition to reshaping the coefficients of each projection, we can also change $\theta_i$ to optimize the reshaped RP and backpropagation models.

      We find that this addition has a small effect on the performance of the models in terms of their likelihood (supp. Figure 2A).

      We also find that this has a small effect on the firing rates of the projection neurons: backpropagation models with tuned thresholds show lower firing rates compared to backpropagation models with fixed threshold, whereas reshaped RP models with optimized thresholds show higher firing rates compared to models with fixed threshold. Yet, both versions of the reshaped RP models show lower firing rates compared to both versions of the backpropagation models. Given the small effect of tuning threshold on models' performance and their internal properties, we will, henceforth, focus on Reshaped RP models with fixed thresholds.”

      (3) In Figure 1, the authors claim that the reshaped RP model outperforms the RP model. This improved performance might be partly because the reshaped RP model has more parameters to be optimized than the RP model. Indeed, let the number of projections be N and the in-degree of the projections be K; then the RP model and the reshaped RP model have N and KN parameters, respectively. Does the reshaped model still outperform the original one when only (randomly chosen) N weights (out of $a_{ij}$) are allowed to be optimized and the rest is fixed? (or, does it still outperform the original model with the same number of optimized parameters (i.e. N/K neurons)?)

      Indeed, the number of tuned parameters in the reshaped RP model is much larger than the number of tuned parameters in an RP model with the same projection set size. Yet, we submit that the larger number of tuned parameters is not the reason for the improved performance of the reshaped RP model: Maoz et al [30] have already shown that by optimizing an RP model with a small projection set using the pruning and replacement of projections (P&R), one can reach high accuracy with almost an order of magnitude fewer projections. Thus, we argue that the improved performance stems from the properties of the projections in the model.

      Accordingly, we added supp. Figure 2B, which shows the performance of the P&R sigmoid RP model compared to the RP and reshaped RP models. We added the following to the main text:

      “Because reshaping may change all the existing synapses of each projection, the number of parameters is the number of projections times the projections in-degree. While this is much larger than the number of parameters that we learn for the RP model (one for each projection), we suggest that the performance of the reshaped models is not a naive result of having more parameters. In particular, we have seen that RP models that use a small set of projections can be very accurate when the projections are optimized using the pruning and replacement process [30] (see also supp. Figure 1B). Thus, it is really the nature of the projections that shapes the performance. Indeed, our results here show that a small fixed connectivity projection set with weight tuning is enough for accurate performance which is on par or better than an RP model with more projections.”

      (4) In Figure 2, the authors have demonstrated that the homeostatic synaptic normalization outperforms the bounded model when the allowed synaptic cost is small. One possible hypothesis for explaining this fact is that the optimal solution lies in the region where only a small number of |a_ij| is large and the rest is near 0. If it is possible to verify this idea by, for example, exhibiting the distribution of a_ij after optimization, it would help the readers to better understand the mechanism behind the superiority of the homeostatic input model.

      We modified supp. Figure 4 and made the following change in the relevant part in the main text to address the reviewer comment about the distribution of the $a_{ij}$ values:

      “Figure 5E shows the mean rotation angle over 100 homeostatic models as a function of synaptic cost -- reflecting that the different forms of homeostatic regulation result in different reshaped projections. We show in Supp. Figure 4C the histogram of the rotation angles of several different homeostatic models, as well as the unconstrained Reshape model.

      Analyzing the distribution of the synaptic weights $a_{ij}$ after learning leads to a similar conclusion (supp. Figure 4D): The peak of the histograms is at $a_{ij} = 0$, implying that during reshaping most synapses are effectively pruned. While the distribution is broader for models with higher synaptic budget, it is asymmetric, showing local maxima at different values of $a_{ij}$.

      The diversity of solutions that the different model classes and parameters show implies a form of redundancy in model choice or learning procedure. This reflects a multiplicity of ways to learn or optimize such networks that biology could use to shape or tune neural population codes.”

      (5) In Figures 5D and 5E, the authors present how different reshaping constraints result in different learning processes ("rotation"). We find these results quite intriguing, but it would help the readers understand them if there is more explanation or interpretation. For example,

      a. In the "Reshape - Hom. circuit 4.0" plot (Fig 5D, upper-left), the rotation angle between the two models is almost always the same. This is reasonable since the Homeostatic Circuit model is the least constrained model and could be almost irrelevant to the optimization process. Is there any similar interpretation to the other 3 plots of Figure 5D?

      We added a short discussion of this difference to the main text, but do not have a geometric or other intuitive explanation for the nature of these differences.

      b. In Figure 5E, is there any intuitive explanation for why the three models take minimum rotation angle at similar global synaptic cost (~0.3)?

      We added a discussion of this issue to the main text, and the histogram of the rotation angles in supp. Figure 4C shows that they are not identical. However, we don’t have an intuitive explanation for why the mean values are so similar.

      Recommendations for the authors:

      (1) Some claims on the effect of synaptic normalization on the reshaped model sound a little overstated since the presented evidence does not clearly show the improvement of the computational performance (in comparison to the vanilla reshaped model) in terms of maximizing the likelihood of the inputs. Here are some examples of such claims: "Incorporating more biological features and utilizing synaptic normalization in the learning process, results in even more efficient and accurate models." (in Abstract), "Thus, our new scalable, efficient, and highly accurate population code models are not only biologically-plausible but are actually optimized due to their biological features." (in Abstract), or "in our Reshaped RP models, homeostatic plasticity optimizes the performance of network models" (in Discussion).

      We changed the wording according to the reviewers’ suggestions.

      (2) In equation (1) and the following sentence, \theta _j (threshold) should be \theta _i.

      Fixed

      (3) While the authors mention that "reshaping with normalization or without it drives the projection neurons to converge to similar average firing rate values (Figure 3B)", they also claim that "reshaping with normalization implies lower firing rates as well as... (Figure 3E)". These two claims look a little inconsistent to us. Besides, it is not very clear from Figure 3E that the normalization decreases the firing rate (it is clear from Figure 3B, though). How about just deleting "lower firing rates as well as"?

      We changed the wording according to the reviewers’ suggestion.

      (4) The captions of Figures 4D and 4E should be exchanged.

      Fixed

      (5) Typo in Figure 4F: "normalized in-dgreree".

      Fixed

      (6) In Figure 5D (upper left plot) the choice of "Reshape" and "Bounded3.0" looks a bit weird. Is this a typo for "Hom. circuit 4.0"?

      There is no typo in the figure labels. We discussed the results of figure 5D in our response to point (5) in the public comments list and addressed the upper left panel of figure 5D in the main text.

      (7) In the paper, the letter \theta represents (1) the threshold of the projection neurons (eq. 1), (2) the "ceiling" value of the bounded model, and (3) the rotation angle of projections (Figure 5). We find this notation a bit confusing and recommend using different notations for different entities.

      Thanks for the suggestion, we changed the confusing notations: (1) The threshold of each projection neuron is still $\theta$, following the notation of the original RP model formulation [30]. (2) The notation of the “ceiling” value of the bounded model is now $\omega$. (3) The rotation angle of the projections during reshape is now marked by $\alpha$.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      (1) We agreed that there was insufficient evidence for the authors' conclusion that Myc-overexpressing clones lacking Fmi become losers. We request that the authors change the text to discuss that suppression of Myc clone growth through Fmi depletion is reminiscent of a cell acquiring loser status, although at this point in the manuscript there is no clear demonstration whether this is mostly driven by growth suppression and/or an increase in apoptosis.

      We agree that at the point in the manuscript where we have only described the clone sizes, one cannot make firm conclusions about competition, so we have changed the language to reflect this. We argue that after showing our apoptosis data, those conclusions become firm. Please see the more lengthy responses to reviewers below.

      (2) We agreed that the apoptosis assay, data and interpretation need to be improved. The graphs in Fig. 4O and P should be better discussed in the text and in the legend. Additionally, the graphs are lacking the red lines that are written in the text.

      We regret that we did not adequately explain the data displayed in these two graphs. Supercompetition tends to cause apoptosis in both winners and losers, with the ratio between WT and super-competitor cells being critical in deciding the outcome of competition. We wanted to represent this visually but failed to properly explain our analysis. We have rewritten the figure legend and our discussion in the main text, hopefully making it clearer. 

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper is focused on the role of Cadherin Flamingo (Fmi) in cell competition in developing Drosophila tissues. A primary genetic tool is monitoring tissue overgrowths caused by making clones in the eye disc that express activated Ras (RasV12) and that are depleted for the polarity gene scribble (scrib). The main system that they use is ey-flp, which makes continuous clones in the developing eye-antennal disc beginning at the earliest stages of disc development. It should be noted that RasV12, scrib-i (or lgl-i) clones only lead to tumors/overgrowths when generated by continuous clones, which presumably creates a privileged environment that insulates them from competition. Discrete (hs-flp) RasV12, lgl-i clones are in fact out-competed (PMID: 20679206), which is something to bear in mind. They assess the role of fmi in several kinds of winners, and their data support the conclusion that fmi is required for winner status. However, they make the claim that loss of fmi from Myc winners converts them to losers, and the data supporting this conclusion is not compelling.

      Strengths:

      Fmi has been studied for its role in planar cell polarity, and its potential role in competition is interesting.

      Weaknesses:

      I have read the revised manuscript and have found issues that need to be resolved. The biggest concern is the overstatement of the results that loss of fmi from Myc-overexpressing clones turns them into losers. This is not shown in a compelling manner in the revised manuscript and the authors need to tone down their language or perform more experiments to support their claims. Additionally, the data about apoptosis is not sufficiently explained.

      We take issue with this reviewer’s framing of their criticism. First, the reviewer is selectively reporting the results published in PMID: 20679206. They correctly state that those authors show that small discrete clones of RasV12 lgl are eliminated (Fig. 3B), but they omit the fact that the authors also show that larger RasV12 lgl clones induce apoptosis in the surrounding wild type cells, and therefore behave as winners (Fig. 3C). Hence, the size of the clone appears to determine its winner/loser status. Of course, lgl is not scrib, and it is not a certainty that they would behave similarly, but they also show that large RasV12 scrib clones induce considerable apoptosis of the neighboring wild type cells.

      The reviewer then discusses “continuous” clones induced by ey-flp, as we use in our manuscript. Here, the term “continuous” is probably misleading; because ey is expressed ubiquitously in the disc from early in development, it is most likely the case that the majority of cells have flipped relatively early, resulting in ~half the cells becoming clone and the other ~half twin spot. The clone cells then likely fuse to make larger clones. We show that ey-flp induced RasV12 scrib clones also behave as winners. It is logical to conclude that this is because they are large. The reviewer talks about “a privileged environment that insulates them from competition,” but if they were insulated from competition, how could they become winners? Because they occupy more territory than the wild type cells, and because they induce apoptosis in the wild type neighbors, they are winners. 

      Having shown that ey-flp induced RasV12 scrib clones behave as winners, we then remove Fmi from these clones, and show that they behave as losers by the same criteria: they occupy less area than the wild type cells (our Fig. 1 and Fig. 1 Supp 2), and they induce apoptosis in the wild type cells (our Fig 4A-H). 

      With respect to the comment that additional experiments are needed to support the claim that loss of Fmi from Myc winners converts them to losers, we’re not sure what additional data the reviewer would want. As for the tumor clones, we show that >>Myc clones get bigger than the twin control clones (Fig. 2), and we measure similar low levels of apoptosis in each (Fig. 4I-K, O). In contrast, >>Myc fmi clones are out-grown by wild type clones, and apoptosis is higher in the >>Myc fmi clones than in the wild type clones (Fig. 4L-N, P-S). We therefore believe it is correct to say that >>Myc clones become losers when Fmi is removed.

      In additional comments, the reviewer takes issue with using winner and loser language at the point in the manuscript where we have only shown the clone sizes but not yet the apoptosis data, and about this we agree. We have changed the language accordingly. 

      Re explanation of the apoptosis data, see the response to reviewer #3.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Bosch et al. reveal Flamingo (Fmi), a planar cell polarity (PCP) protein, is essential for maintaining 'winner' cells in cell competition, using Drosophila imaginal epithelia as a model. They argue that tumor growth induced by scrib-RNAi and RasV12 competition is slowed by Fmi depletion. This effect is unique to Fmi, not seen with other PCP proteins. Additional cell competition models are applied to further confirm Fmi's role in 'winner' cells. The authors also show that Fmi's role in cell competition is separate from its function in PCP formation.

      Strengths:

      (1) The identification of Fmi as a potential regulator of cell competition under various conditions is interesting.

      (2) The authors demonstrate that the involvement of Fmi in cell competition is distinct from its role in planar cell polarity (PCP) development.

      Weaknesses:

      (1) The authors provide a superficial description of the related phenotypes, lacking a mechanistic understanding of how Fmi regulates cell competition. While induction of apoptosis and JNK activation are commonly observed outcomes in various cell competition conditions, it is crucial to determine the specific mechanisms through which they are induced in fmi-depleted clones. Furthermore, it is recommended that the authors utilize the power of fly genetics to conduct a series of genetic epistasis analyses.

      We agree that it is desirable to have a mechanistic understanding of Fmi’s role in competition, but that is beyond the scope of this manuscript. Here, our goal is to report the phenomenon. We understand and share with the reviewer the interest in better understanding the relationship between Fmi and JNK signaling in competition. The role of JNK in competition, tumorigenesis and cell death is infamously complex. In some preliminary experiments, we explored some epistasis experiments, but these were inconclusive so we elected to not report them here. In the future, we will continue with additional analyses to gain a better understanding of the mechanism by which Fmi affects competition.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, Bosch and colleagues describe an unexpected function of Flamingo, a core component of the planar cell polarity pathway, in cell competition in Drosophila wing and eye disc. While Flamingo depletion has no impact on tumour growth (upon induction of Ras and depletion of Scribble throughout the eye disc), and no impact when depleted in WT cells, it specifically tunes down winner clone expansion in various genetic contexts, including the overexpression of Myc, the combination of Scribble depletion with activation of Ras in clones or the early clonal depletion of Scribble in eye disc. Flamingo depletion reduces proliferation rate and increases the rate of apoptosis in the winner clones, hence reducing their competitiveness up to forcing their full elimination (hence now becoming "losers"). This function of Flamingo in cell competition is specific to Flamingo as it cannot be recapitulated with other components of the PCP pathway, does not rely on interaction of Flamingo in trans, nor on the presence of its cadherin domain. Thus, this function is likely to rely on a non-canonical function of Flamingo which may rely on downstream GPCR signaling.

      This unexpected function of Flamingo is by itself very interesting. In the framework of cell competition, these results are also important as they describe, to my knowledge, one of the only genetic conditions that specifically affect the winner cells without any impact when depleted in the loser cells. Moreover, Flamingo depletion does not just suppress the competitive advantage of winner clones, but even turns them into putative losers. This specificity, while not clearly understood at this stage, opens a lot of exciting mechanistic questions, but also a very interesting long term avenue for therapeutic purpose as targeting Flamingo should then affect very specifically the putative winner/oncogenic clones without any impact in WT cells.

      The data and the demonstration are very clean and compelling, with all the appropriate controls, proper quantifications and backed-up by observations in various tissues and genetic backgrounds. I don't see any weakness in the demonstration and all the points raised and claimed by the authors are all very well substantiated by the data. As such, I don't have any suggestions to reinforce the demonstration.

      While not necessary for the demonstration, documenting the subcellular localisation and levels of Flamingo in these different competition scenarios may have been relevant and provide some hints on a putative mechanism (specifically by comparing its localisation in winner and loser cells).

      While we did not perform a thorough analysis, our current revision of the manuscript shows Fmi staining results that do not support a change in subcellular localization of Fmi. In our images, Fmi seemed to localize similarly along the winner-loser clone boundaries, and inside and outside the clones. We cannot rule out that a subtle change in localization is taking place that could perhaps be detected with higher resolution imaging.

      Also, on a more interpretative note, the absence of impact of Flamingo depletion on JNK activation does not exclude some interesting genetic interactions. JNK output can be very contextual (for instance depending on Hippo pathway status), and it would be interesting in the future to check if Flamingo depletion could somehow alter the effect of JNK in the winner cells and promote downstream activation of apoptosis (which might normally be suppressed). It would be interesting to check if Flamingo depletion could have an impact in other contexts involving JNK activation or upon mild activation of JNK in clones.

      See our comment to Reviewer 2 regarding JNK.

      Strengths:

      A clean and compelling demonstration of the function of Flamingo in winner cells during cell competition

      One of the rare genetic conditions that affects very specifically winner cells without any impact in losers, and then can completely switch the outcome of competition (which opens an interesting therapeutic perspective on the long term)

      Weaknesses:

      The mechanistic understanding obviously remains quite limited at this stage especially since the signaling does not go through the PCP pathway.

      We agree that in the future, it will be desirable to gain a mechanistic understanding of Fmi’s role in competition.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I have read the revised manuscript and have found issues that need to be resolved. The biggest concern is the overstatement of the results that loss of fmi from Myc-overexpressing clones turns them into losers. This is not shown in a compelling manner in the revised manuscript and the authors need to tone down their language or perform more experiments to support their claims.

      (1) I do not agree with the language used by the authors in the last paragraph of p. 4 stating that loss of fmi from Myc supercompetitors (Fig. 2) makes them losers. At this point in the paper, they only use clone size as a readout. By definition, losers in imaginal discs die by apoptosis, which is not measured in this figure. As such, the authors do not prove that fmi-mutant Myc over-expressing clones are now losers at this point in the manuscript. The authors should discuss this in the results section regarding Fig. 2.

      We have modified the language in text and figure legend to acknowledge that the clone size data alone do not demonstrate competition.

      (2) Related to point #1, I do not agree with the language in the legend of Fig. 2H that the graph is measuring "supercompetition". They are only measuring clone ratios, not apoptosis. Growing to a smaller size does not make a clone have loser status without also assessing cell death.

      (a) I suggest that the authors remove the sentence "A ratio over 0 indicates supercompetition of nGFP+ clones, and below 0 indicates nGFP+ cells are losers." in the legend to Fig. 2H. Instead, they should describe the assay in terms of clone ratios.

      The reviewer raises a valid point, as at this point in the manuscript we had not yet quantified cell death and proliferation. However, based on decades of knowledge of supercompetition, Myc clones are classified as super-competitors in every instance they’ve been studied. (Myc clones show apoptosis when competing with WT cells, while at the same time they eliminate WT neighbors by apoptosis to become winners. Their faster proliferation rate may be what ultimately makes them winners.) We changed the language to address this distinction.

      (3) In Fig. 4, they do attempt to monitor apoptosis, which is the fate of bona fide losers in imaginal tissue. However, I have several concerns about these data (panels 4I-K, O and P have been added to the revised manuscript.)

      (a) In Fig. 4I-K, why is there no death of WT cells which would be expected based on de la Cova Cell 2004? The authors need to comment on this.

      (b) Cell death should also be observed in the Myc over-expressing clones but none is seen in this disc (see de la Cova 2004 and PMID: 18257071 Fig. 4). The authors need to comment on this.

      We do not understand why the reviewer raises these two points. We see some cell death in >Myc eye discs both in winners and losers, as displayed in the graph. In our hands, the levels were on average very low. The example shown is representative of the analysis and shows apoptosis both in WT and >Myc cells, highlighted by the arrows in 4J. We added a mention of the arrows to the figure legend to make it clearer. In the main text, we already compared our observations to the same publication the reviewer mentions (De la Cova 2004).

      (c) The data in panel 4O is not explained sufficiently in the legend or results section. What do the lines between the data points in the left side of the panel mean? Why is there a bunch of clustered data points in the right part of the Fig. 4O, when two different genotypes are listed below? I would have expected two clusters of points. The authors need to comment on this.

      We intended to convey as much information as possible in an informative manner in these graphs, and we regret not explaining better the analysis shown. We modified the legends for the apoptosis analysis to better explain the displayed data.

      (d) What is the sample size (n) for the genotypes listed in this figure? The authors need to comment on this and explicitly list the sample size in the legend.

      We added the n for both conditions to the figure. 

      (e) In panels 4L-N, why is the death occurring in the apparent center of the fmiE59>>Myc clone? If these clones are truly losers as the authors claim, then apoptosis should be seen at the boundaries between the fmiE59>>Myc clone and the WT clones. The results in this figure are not compelling, yet this is the critical piece of data to support their claim that fmiE59>>Myc clones are losers. The authors need to comment on this.

      The majority of cell death in this example is observed 1-3 cells away from the clone boundary. In some cases, we observe cell death farther from the boundary, but those cells were not counted in our analyses. As described in our methods, we only considered for the analysis cells at the clone boundary or in the vicinity, as those are the ones that most probably have apoptosis triggered by the neighboring clone.

      (f) There is no red line in Fig. 4O and 4P, in contrast to what is written in the legend in the revised manuscript. This should be corrected.

      We thank the reviewer for catching the error about the line. We have now simplified the graph by removing the line at Y=0 and just leave one dashed line, representing the mean difference between WT and >>Myc cells.

      (4) On p. 10, the reference Harvey and Tapon 2007 to support hpo-/- supercompetitor status is incorrect. The references are Ziosi 2010 and Neto-Silva 2010. This should be changed.

      We thank the reviewer for the correction. While the review we provided discusses the role of the Hpo pathway in proliferation and cancer, it does not discuss competition. The reference we intended to include here was Ziosi 2010. We now cite both in the revised manuscript.

      (5) The legend for Fig. 3A-H is missing from the revised manuscript. This needs to be added.

      This was likely a copy-edit glitch. The missing parts of the legend have been restored.

      (6) Material and methods is missing details on the hs-induced clones. The authors need to specifically state when the clones were generated and when they were analyzed in hours after egg laying.

      The timing of the heat-shock and analysis was described in the methods: “Heat-shock was performed on late first instar and early second instar larvae, 48 hrs after egg laying (AEL). Vials were kept at 25ºC after heat-shock until larvae were dissected”. And additionally, in the dissection methods: “Third instar wandering larvae (120 hrs AEL) were dissected…” We have included in this revision the length of the heat-shock (15 min). 

      I have read the rebuttal and some of my concerns are not sufficiently addressed.

      (8) I raised the point of continuously-generated clones becoming large enough to evade competition, and I disagree with the authors' reply. I think that competition of RasV12, scrib (or lgl) clones largely depends on the size of the clone, which is de facto larger when generated by continuous expression of flp (such as eyeless or tubulin promoters used in this study). I think that at that point, we are at an impasse with respect to this issue, but I wanted to register my disagreement for the record. Related to this, one possible reason for the fragmentation of the fmi-mutant Myc overexpressing clones in the wing disc is because they were not continuously generated and hence did not merge with other clones.

      Please see the discussion above in the public comments. We remain unclear about what, exactly, the reviewer disagrees with. As stated above, we think they are correct that the size of the clone is critical in determining winner vs loser status.

      Reviewer #2 (Recommendations for the authors):

      Although the authors have addressed some of my concerns, I still feel that a detailed mechanistic understanding is essential. I hope the authors will conduct additional experiments to solve this issue.

      We also consider the mechanism of interest and will pursue this in the future. To test our hypotheses we require a set of genetic mutants that are still in the making that will help us dissect the function and potential partners of Fmi, and we hope to have these results in a future publication.

      Reviewer #3 (Recommendations for the authors):

      - There is no clear demonstration that the relative decrease of clone size in UASMyc/Fmi mutant is mostly driven by either a context-dependent suppression of growth and/or an increase of apoptosis (the latter being the more classic feature of loser phenotype).

      We believe that it is driven by both, and refrain from making assumptions about the magnitude of contribution from each. This question is something that we will be interested to explore in the future.

      The distribution of cell death in Fmi/UAS-Myc mutant is somehow surprising and may not fit with most of the competition scenarios where death is mostly restricted to clone periphery (although this may be quite variable and would require much more quantification to be clear).

      While we observe some cell death far from clone boundaries, most of the dying cells are a few cells away from a clone boundary. In other publications quantifying cell death, examples of cell death farther from the boundary are not rare (See for example Moreno and Basler 2004 Fig 6, De la Cova et al. Fig 2, Meyer et al 2014 Fig 2). We did not count cells dying far from clone boundaries in our analysis.

      I just noticed a few mistakes in the legend:

      Figure 3M legend is missing (it would be useful to know at which stage the quantification is performed)

      Another reviewer brought to our attention the problems with Fig 3 legend. We restored the missing parts.

      It would be good to give an estimate of the number of larvae observed when showing the representative cases in Figure 1.

      This is a good point. We now include these numbers in the figure legend.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Recommendations for the authors): 

      Discussion, page 28. The argument that the authors put forward justifying the (small) size of the spontaneous EPSCs seems reasonable. Nonetheless, it would be good to have an amplitude distribution constructed with voltage-evoked EPSCs to compare with that of spontaneous EPSCs. Not the large initial EPSC, obtained upon IHC depolarization, but rather EPSCs occurring later during the longer pulses (figure 4). The authors made the claim that upon IHC depolarization, EPSC sizes increased, but this is not backed with data.

      Following the reviewer recommendation, we have analyzed the voltage-evoked EPSCs occurring during the last 20 ms of the Masker stimulus. We compared the cumulative distribution of the amplitude of these eEPSCs to the cumulative distribution of the amplitude of the sEPSCs (Figure 1-figure supplement 1, panel G) from the same synapses. The two distributions are significantly different (p < 0.0001, Kolmogorov-Smirnov test), with evoked EPSCs having larger amplitudes (average sEPSC amplitude of -97.28 ± 2.22 pA [median 82.10 pA] vs average eEPSC amplitude of 135.8 ± 3.24 pA [median 120.0 pA]).
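
      The comparison of two cumulative amplitude distributions described above can be illustrated with a minimal sketch of the two-sample Kolmogorov-Smirnov statistic, i.e. the largest vertical gap between the two empirical cumulative distributions. This is not the analysis code used for the figure supplement: the function name `ks_statistic` and the amplitude values below are hypothetical, and the samples are synthetic placeholders rather than recorded EPSCs.

```python
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples so that
        # ties are handled before the two CDFs are compared.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


if __name__ == "__main__":
    # Synthetic amplitude magnitudes in pA (placeholders, not measured data).
    random.seed(0)
    spontaneous = [abs(random.gauss(100.0, 30.0)) for _ in range(500)]
    evoked = [abs(random.gauss(135.0, 40.0)) for _ in range(500)]
    print("KS statistic D =", round(ks_statistic(spontaneous, evoked), 3))
```

      The sketch only returns the statistic D. Obtaining a p-value requires the Kolmogorov-Smirnov distribution, which a statistics library such as SciPy (`scipy.stats.ks_2samp`) provides.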

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study investigates protein-protein interactions (PPIs) within the nuage, a germline-specific organelle essential for piRNA biogenesis in Drosophila melanogaster, using AlphaFold2 to predict interactions among 20 nuage-localizing proteins. The authors identify five novel interaction candidates and experimentally validate three of them, including Spindle-E and Squash, through co-immunoprecipitation assays. They confirm the functional significance of these interactions by disrupting salt bridges at the Spn-E_Squ interface. The study further expands its scope to analyze approximately 430 oogenesis-related proteins, validating three additional interaction pairs. A comprehensive screen of around 12,000 Drosophila proteins for interactions with the key piRNA pathway player, Piwi, identifies 164 potential binding partners. Overall, the research demonstrates that in silico approaches using AlphaFold2 can link bioinformatics predictions with experimental validation, streamlining the identification of novel protein interactions and reducing the reliance on extensive experimental efforts. The manuscript is commendably clear and easy to follow; however, areas for improvement should be addressed to enhance its clarity and rigor.

      Major Concerns:

      (1) While AlphaFold2 was developed and trained primarily for predicting protein structures and their interactions, applying it to predict protein-protein interactions is an extrapolation of its intended use. This introduces several important considerations and risks. First, it assumes that AlphaFold's accuracy in structure prediction extends to interactions, despite not being explicitly trained for this task. Additionally, the assumption that high-scoring models with structural complementarity imply biologically relevant interactions is not always valid. Experimental validation is essential to address these uncertainties, as over-reliance on computational predictions without such validation can lead to false positives and inaccurate conclusions. The authors should expand on the assumptions, limitations, and risks associated with using AlphaFold2 for predicting protein-protein interactions.

      We appreciate the reviewer's point. The prediction of protein-protein interactions using AlphaFold2 relies on the number of conserved homologous sequences and previous conformational data. We shall add limitations and risks to the AlphaFold2 prediction method in the revised manuscript.

      (2) The authors experimentally validated three interactions, out of five predicted interactions, using co-immunoprecipitation (co-IP). They attributed the lack of validation for the other two predictions to the limitations of the co-IP method. However, further clarification on the potential limitations of the co-immunoprecipitation behind the negative results would strengthen the conclusions. While co-IP is a widely used technique, it may not detect weak or transient interactions, which could explain the failure to validate some predictions. Suggesting alternative validation methods such as FRET or mass spectrometry could further substantiate the results. On the other hand, AlphaFold2 predictions are not infallible and may generate false positives, particularly when dealing with structurally plausible but biologically irrelevant interactions. By acknowledging both the potential limitations of co-IP and the possibility of false positives from AlphaFold2, the authors can provide a more balanced interpretation of their findings.

      We appreciate the reviewer's point of view. We have used the co-IP method to detect interactions in this study. However, as the reviewer pointed out, it is likely that weak and transient interactions may not be detected. We plan to add a note on the detection limits of the co-IP method and the possibility that the AlphaFold2 method produces false positives in the revised manuscript.

      (3) In line 143, the authors state that "This approach identified 13 pairs; seven of these were already known to form complexes, confirming the effectiveness of AlphaFold2 in predicting complex formations (Table 2). The highest pcScore pair was the Zuc homodimer, possibly because AlphaFold2 had learned from Zuc homodimer's crystal structure registered in the database." While the authors mentioned the presence of the Zuc homodimer's crystal structure, they do not provide a systematic bioinformatics analysis to evaluate pairwise sequence identity or check for the presence of existing structures for all the proteins or protein pairs (or their homologs) in databases such as the Protein Data Bank (PDB) or Swiss-Model. Conducting such an analysis is critical, as it significantly impacts the novelty and reliability of AlphaFold2 predictions. For instance, high sequence identity between the query proteins could lead to high-scoring models for biologically irrelevant interactions. Including this information would strengthen the conclusions regarding the accuracy and utility of the predictions.

      We appreciate the reviewer's critical point. The AlphaFold2 method generates a high confidence score when the 3D structure of the protein of interest, or of proteins with very similar sequences, is solved. We will investigate whether the proteins used in this study are included in the 3D structure database and add the information to the revised manuscript.
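
      The pairwise sequence identity check raised in point (3) could be screened along the following lines. This is only a sketch under stated assumptions: the sequences are assumed to be already aligned (equal length, with "-" marking gaps), identity is computed over columns where neither sequence has a gap (other denominators are also in use), and the names `percent_identity`, `flag_similar_pairs` and the toy sequences are hypothetical, not taken from the manuscript.

```python
from itertools import combinations


def percent_identity(aln_a, aln_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def flag_similar_pairs(aligned, threshold=30.0):
    """Return (name_a, name_b, identity) for pairs at or above the threshold."""
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
        ident = percent_identity(seq_a, seq_b)
        if ident >= threshold:
            flagged.append((name_a, name_b, ident))
    return flagged


# Toy aligned sequences, for illustration only.
toy_alignment = {
    "protA": "MKT-LLVA",
    "protB": "MKTALLVG",
    "protC": "QRSAPPWY",
}
print(flag_similar_pairs(toy_alignment))
```

      The 30% threshold is an arbitrary placeholder; a real screen would choose it together with a search for existing structures of the proteins or their homologs.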

      (4) While the manuscript successfully identifies novel protein interactions, the broader biological significance of these interactions remains underexplored. The manuscript could benefit from elaborating on how these findings may contribute to understanding the piRNA pathway and its implications on germline development, transposon repression, and oogenesis.

      We plan to add to the revised manuscript a discussion of the potential biological significance of the novel protein-protein interactions presented here.

      Reviewer #2 (Public review):

      Summary:

      In this paper, the authors use AlphaFold2 to identify potential binding partners of nuage localizing proteins.

      Strengths:

      The main strength of the paper is that the authors experimentally verify a subset of the predicted interactions.

      Many studies have been performed to predict protein-protein interactions in various subsets of proteins. The interesting story here is that the authors (i) focus on an organelle that contains quite some intrinsically disordered proteins and (ii) experimentally verify some (but not all) predictions.

      Weaknesses:

      Identification of pairwise interactions is only a first step towards understanding complex interactions. It is pretty clear from the predictions that some (but certainly not all) of the pairs could be used to build larger complexes. AlphaFold easily handles proteins up to 4-5000 residues, so this should be possible. I suggest that the authors do this to provide more biological insights.

      We thank the reviewer for their kind suggestions. Although only dimer structure predictions were made in this manuscript, if a protein is predicted to interact with two other proteins, it is possible that the three proteins interact together. We plan to add such trimer predictions to the revised manuscript.

      Another weakness is the use of a non-standard name for "ranking confidence" - the author calls it the pcScore - while the name used in AlphaFold (and many other publications) is ranking confidence.

      We take the reviewer’s point and will revise the text accordingly.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      This study addresses a question in sensory ethology and active sensing in particular. It links the production of a specific signal - electrosensory chirps - to various contexts and conditions to argue that the main function is to enhance conspecific localization rather than communication as previously believed. The study provides a lot of valuable data, but the methods section is incomplete making it difficult to evaluate the claims.

      We have now added to the methods a new paragraph describing in better detail the analysis done to prepare the data used in figure 7. The figure itself has been substantially changed: we now show EOD fields and electric images using voltage instead of current, and we have better illustrated the comparisons between chirps and beats using statistical analysis.

      Finally, we are equally grateful to all Reviewers for the constructive criticism and for the time spent in evaluating our manuscript. It certainly helped to improve both the quality of the data presented and the readability of the text.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      The study provides a wealth of interesting observations of behavior and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raises interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth being considered and explored further.

      After the initial reviewers' comments, the authors performed a welcome revision of the way the results are presented. Overall the study has been improved by the revision. However, one piece of new data is perplexing to me. The new figure 7 presents the results of a model analysis of the strength of the EI caused by a second fish to localize when the focal fish is chirping. From my understanding of this type of model, EOD frequency is not a parameter in the model since it evaluates the strength of the field at a given point in time. Therefore the only thing that matters is the phase relationship and strength of the EOD. Assuming that the second fish's EOD is kept constant and the phase relationship is also the same, the only difference during a chirp that could affect the result of the calculation is the potential decrease in EOD amplitude during the chirp. It is indeed logical that if the focal fish decreased its EOD amplitude the target fish's EOD becomes relatively stronger. Where things are harder to understand is why the different types of chirps (e.g. type 1 vs type 2) lead to the same increase in signal even though they are typically associated with different levels of amplitude modulations. Also, it is hard to imagine that a type 2 chirp that is barely associated with any decrease in EOD amplitude (0-10% maybe), would cause a doubling of the EI strength. There might be something I don't understand but the authors should provide a lot more details on how this result is obtained and convince us that it makes sense.

      We hope we have now resolved the Reviewer’s concerns by applying major edits to Figure 7. We now use voltage - not current - to quantify the impact of chirps on electric images. The effect of chirps is here estimated using the integral of the beat AM, as a broad measure of the potential effects chirping may have on electroreceptors. We underline in the text that this analysis does not represent proof for any type of processing occurring in the fish brain, but we only express in hypothetical terms that - based on the beat perturbations measured - additional spatial information may potentially be available in electric images, as a consequence of chirping. Whether the fish uses this information, or not, needs to be assessed through electrophysiology in future studies.

      Finally, the reviewer is concerned about this sentence in the rebuttal - "The methods section has been edited to clarify the approach (not yet)". This section is unfinished, which suggests that it is difficult to explain the modeling results from a logical point of view. Thus the reviewer's major concern from the previous review remains unresolved. To summarize, the model calculates field strengths at an instant in time and integrates over time with a 500 ms window. This window is 10 times longer than the small chirps, while the longer chirps cover a much larger proportion of the window. Yet, the small chirps have a bigger impact on discriminability than the longer chirps. The authors should attempt to explain this seemingly contradictory result. This remains a major issue because this analysis was the most direct evidence that chirping could impact localization accuracy.

      We added a new methods section describing the new figure, which we hope explains more clearly how the effect of chirps is calculated. Since most p-units are affected by the cyclic AMs of the beat, any change in the electric image caused by a chirp will result in changes in transcutaneous voltage - i.e. the voltage measurable at the receptor level. Overall, this added analysis is not a central point of the manuscript; it is part of an attempt to hint at the physiological mechanisms involved, which cannot be explored in the current study. We do not mean to propose that these estimates represent alternatives to electrophysiological recordings, but rather theoretical evidence that could in fact support this type of investigation.
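
      To make concrete what integrating a beat AM over a 500 ms window can mean, a toy calculation is sketched below. It is not the BEM model or the analysis behind Figure 7. It assumes two superimposed sinusoidal EODs, models a chirp as a Gaussian frequency excursion that advances the beat phase, and integrates the beat envelope with the trapezoid rule. All function names and parameter values are hypothetical.

```python
import math


def beat_envelope(t, a1, a2, df_hz, extra_phase=0.0):
    """Envelope of two summed sinusoids whose frequencies differ by df_hz."""
    phase = 2.0 * math.pi * df_hz * t + extra_phase
    value = a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * math.cos(phase)
    return math.sqrt(max(0.0, value))  # guard against tiny negative round-off


def chirp_phase(t, t0, size_hz, width_s):
    """Beat phase (rad) accumulated up to time t by a Gaussian frequency
    excursion of peak size_hz and standard deviation width_s centred at t0."""
    area = size_hz * width_s * math.sqrt(math.pi / 2.0) * (
        1.0 + math.erf((t - t0) / (width_s * math.sqrt(2.0)))
    )
    return 2.0 * math.pi * area


def integrate_envelope(a1, a2, df_hz, window_s=0.5, dt=1e-4, chirp=None):
    """Trapezoidal integral of the beat envelope over the window.

    chirp, if given, is a (t0, size_hz, width_s) tuple.
    """
    n = int(round(window_s / dt))
    total = 0.0
    prev = None
    for k in range(n + 1):
        t = k * dt
        extra = chirp_phase(t, *chirp) if chirp else 0.0
        cur = beat_envelope(t, a1, a2, df_hz, extra)
        if prev is not None:
            total += 0.5 * (prev + cur) * dt
        prev = cur
    return total


# Hypothetical values: own EOD amplitude 1.0, neighbour 0.2, 10 Hz beat.
no_chirp = integrate_envelope(1.0, 0.2, 10.0)
with_chirp = integrate_envelope(1.0, 0.2, 10.0, chirp=(0.25, 100.0, 0.005))
print(no_chirp, with_chirp)
```

      In this toy setting a brief chirp changes the windowed integral only through the phase shift it imposes on the beat, which is one way a short event can still influence a measure computed over a much longer window.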

      Reviewer #2 (Public Review):

      Studying Apteronotus leptorhynchus (the weakly electric brown ghost knifefish), the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing wave-like electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. Chirping is a behavior that has been well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation. Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that should have a great impact on the field.

      The authors provide convincing evidence that chirps may function in homeoactive sensing. In particular, the evidence showing increased chirping in more cluttered environments and a relationship between chirping and movement are especially strong and suggestive. Their evidence arguing against a role for chirps in communication is not as strong. However, based on an extensive review of the literature, the authors conclude, I think fairly, that the evidence arguing in favor of a communication function is limited and inconclusive. Thus, the real strength of this study is not that it conclusively refutes the communication hypothesis, but that it calls this hypothesis into question while also providing compelling evidence in favor of an alternative function.

      In summary, although the evidence against a role for chirps in communication is not as strong as the evidence for a role in active sensing, this study presents very interesting data that is sure to stimulate discussion and follow-up studies. The authors acknowledge that chirps could function as both a communication and homeoactive sensing signal, and the language arguing against a communication function is appropriately measured. A given electrical behavior could serve both communication and homeoactive sensing. I suspect this is quite common in electric fish (not just in gymnotiforms such as the species studied here, but also in the distantly related mormyrids), and perhaps in other actively sensing species such as echolocating animals.

      We are grateful to the Reviewer for the kind assessment.

      Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, without and with playback experiments. It applies state-of-the-art methods for reducing the dimensionality of the data and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that the traditionally assumed communication function of chirps may be secondary to its role in environmental assessment and exploration that takes social context into account. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats caused by other fish as well as objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry. The BEM modelling also convincingly predicts how the electric image of a receiver conspecific on a sending fish is enhanced by a chirp.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a primary communication goal for most chirps. Rather, the key determinants of chirping are the difference in frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. The paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-receiver chirp transitions beyond the known increase in chirp frequency during an interaction. The authors carefully submit that the new putative echolocation function of chirps is not mutually exclusive with a possible communication function.

      These conclusions by themselves will be very useful to the field. They will also allow scientists working on other "communication" systems to perhaps reconsider and expand the goals of the probes used in those senses. A lot of data are summarized in this paper, with thorough referencing to past work.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization, and in this sense are self-directed signals. This led to their prediction that environmental complexity ("clutter") should increase chirp rate, which in fact was revealed by their new experiments. The authors also argue that waveform EODs have less power across high spatial frequencies compared to pulse-type fish, with a resulting relatively impoverished power of resolution. Chirping in wave-type fish could temporarily compensate for the lower frequency resolution while still being able to resolve EOD perturbations with a good temporal definition (which pulse-type fish lack due to low pulse rates).

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water. The paper provides a number of experimental avenues to pursue in order to validate the non-communication role of chirps.

      We are grateful to the Reviewer for the kind assessment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The manuscript by Poltavski and colleagues describes the discovery of previously unreported enteric neural crest-derived cells (ENCDC) which are marked by Pax2 and originating from the Placodes. By creating multiple conditional mouse mutants, the authors demonstrate these cells are a distinct population from the previously reported ENCDCs which originate from the Vagal neural crest cells and express Wnt1.

      These Pax2-positive ENCDCs are affected due to the loss of both Ret and Ednrb highlighting that these cells are also ultimately part of the canonical processes governing ENCDC and enteric nervous system (ENS) development. The authors also make explant cultures from the mouse GI tract to detect how Ednrb signaling is important for Ret signaling pathways in these cells and rediscovers the interactions between these 2 pathways. One important observation the authors make is that CGRP-positive neurons in the adult distal colon seem to be primarily derived from these Pax2-positive ENCDCs, which are significantly reduced in the Ednrb mutants, thus highlighting the role of Ednrb in maintaining this neuronal type.

      I appreciate the amount of work the authors have put into generating the mouse models to detect these cells, but there isn't any new insight on either the nature of ENCDC development or the role of Ret and Ednrb. Also, there are sophisticated single-cell genomics methods to detect rare cell type/states these days and the authors should either employ some of those themselves in these mouse models or look at extensively publicly available single-cell datasets of the developing wildtype and mutant mouse and human ENS to map out the global transcriptional profile of these cells. A more detailed analysis of these Pax2-positive cells would be really helpful to both the ENS community as well as researchers studying gut motility disorders.

      We would like to point out that the reviewer’s comments in both Public Review and in some cases reiterated in Recommendations for the Authors are rooted in several misunderstandings. The reviewer writes “Pax2-positive ENCDCs”, as if the Pax2 lineage (properly, the Pax2Cre-labeled lineage) of the ENS is a subset of neural crest, and states that “there isn’t any new insight” from our study on ENS development. Our conclusion is quite different, that the Pax2Cre lineage (placode-derived) is distinct from the neural crest-derived cell lineage. The reviewer may not have appreciated that our study establishes a fundamental reinterpretation of the very long-standing dogma that the ENS is derived solely from neural crest. We believe that finding and characterizing the unique contribution of an independent cell lineage to the ENS provides critical new perspectives into ENS development and the etiology of Hirschsprung disease. One feature of the Pax2Cre (placodal) lineage is as the source of CGRP-positive mechanosensory neurons in the colon (as the reviewer mentioned), but this is one feature of the larger conceptual discovery of the existence of a separate lineage contribution to the ENS, not the most important observation in and of itself.

      The reviewer continues by saying that we “rediscovered” the interaction between Ednrb and Ret in ENS development. In our study we show that the two lineages (placode-derived and neural crest-derived) employ Ednrb and Ret signaling in distinct ways. This isn’t simply rediscovery, this is new insight. To the extent that both lineages utilize both signaling axes (albeit with mechanistic differences) is a primary reason why the unique placodal lineage contribution to the ENS remained unsuspected until now. We have revised the text to make these points more clear in our revised manuscript.

      The reviewer also suggests single cell genomic methods, which is addressed below in our response to the reviewer’s first recommendation.

      Reviewer #2 (Public Review):

      This manuscript by Poltavski and colleagues explores the relative contributions of Pax2- and Wnt1- lineage-derived cells in the enteric nervous system (ENS) and how they are each affected by disruptions in Ret and Ednrb signaling. The current understanding of ENS development in mice is that vagal neural crest progenitors derived from a Wnt1+ lineage migrate into and colonize the developing gut. The sacral neural crest was thought to make a small contribution to the hindgut in addition but recent work has questioned that contribution and shown that the ENS is entirely populated by the vagal crest (PMID: 38452824). GDNF-Ret and Endothelin3-Ednrb signaling are both known to be essential for normal ENS development and loss of function mutations are associated with a congenital disorder called Hirschsprung's disease. The transcription factor Pax2 has been studied in CNS and cranial placode development but has not been previously implicated in ENS development. In this work, the authors begin with the unexpected observation that conditional knockout of Ednrb in Pax2-expressing cells causes a similar aganglionosis, growth retardation, and obstructed defecation as conditional knockout of Ednrb in Wnt1-expressing cells. The investigators then use the Pax2 and Wnt1 Cre transgenic lines to lineage-trace ENS derivatives and assess the effects of loss of Ret or Ednrb during embryonic development in these lineages. Finally, they use explants from the corresponding embryos to examine the effects of GDNF on progenitor outgrowth and differentiation.

      Strengths:

      -  The manuscript is overall very well illustrated with high-resolution images and figures. Extensive data are presented.

      -  The identification of Pax2 expression as a lineage marker that distinguishes a subset of cells in the ENS that may be distinct from cells derived from Wnt1+ progenitors is an interesting new observation that challenges the current understanding of ENS development.

      -  Pax2 has not been previously implicated in ENS development - this manuscript does not directly test that role but hints at the possibility.

      -  Interrogation of two distinct signaling pathways involved in ENS development and their relative effects on the two purported lineages.

      The reviewer provided a succinct and accurate summary of our analysis. We correct just the one statement that the ENS is entirely populated by vagal crest. The paper cited by the reviewer (PMID: 38452824) used Wnt1DreERT2 to lineage label the NC population, so of course only looked at neural crest (comparing vagal vs. sacral NC). The advance in our study is to newly document the independent contribution of the placodal lineage.

      Weaknesses:

      -  The major challenge with interpreting this work is the use of two transgenic lines, rather than knock-ins, Wnt1Cre and Pax2-Cre, which are not well characterized in terms of fidelity to native gene expression and recombination efficiency in the ENS. If 100% of cells that express Wnt1 do not express this transgene or if the Pax2 transgene is expressed in cells that do not normally express Pax2, then these observations would have very different interpretations and not support the conclusions made. The two lineages are never compared in the same embryo, which also makes it difficult to assess relative contributions and renders the evidence more circumstantial than definitive.

      We do not agree that the Cre lines being transgenics rather than knock-ins changes the utility of these reagents or the interpretation of the results; there are also potential problems with knock-in alleles. Wnt1Cre has been in use for 25 years as a pan-neural crest lineage cell marker with exceptional efficiency and specificity (including numerous studies of the ENS), so we disagree that it is not well characterized. Pax2Cre of course has not previously been studied in the ENS, but it has been broadly used in other contexts (e.g., craniofacial, kidney). That said, and as noted in our original manuscript, we are aware that an issue of this study is the uniqueness of the recombination domains of the two Cre lines. As we wrote, Wnt1Cre and Pax2Cre cannot be combined into the same embryo because they are both Cre lines, and we do not have a suitable non-Cre recombinase line to substitute for either. Instead, we demonstrate that the two lines recombine in distinct territories of the early embryonic ectoderm, and that the two lineages thus labeled are distinct in marker expression at the initial onset of their delamination, utilize Edn3-Ednrb and GDNF-Ret in distinct ways during their migration to the hindgut, and contribute to different terminal cell fates in the colon. We think this evidence of the distinct nature of the two lineages from start to finish is compelling rather than merely circumstantial.

      -  Visualization of the Pax2-Cre and Wnt-1Cre induced recombination in cross-sections at postnatal ages would help with data interpretation. If there is recombination induced in the mesenchyme, this would particularly alter the interpretation of Ednrb mutant experiments, since that pathway has been shown to alter gut mesenchyme and ECM, which could indirectly alter ENS colonization.

      We have several thoughts about this comment. First, we are uncertain why postnatal analysis would be informative, as ENS colonization occurs (or fails to occur in mutants) during embryogenesis. The reviewer might be thinking of a juvenile-stage additional contribution to the ENS, which is addressed below (responses to Recommendations for the Authors) but, as we discuss there, is not relevant to our analysis. Second, we did examine recombination in the distal hindgut at E12.5 during ENS colonization (Fig. 1f and 1h) and did not see overlap between either Cre recombination domain and Edn3 mRNA expression (which is expressed by the non-ENS mesenchyme). Furthermore, Ednrb is not expressed in the gut mesenchyme during ENS colonization (Fig. 7-figure supplement 1), thus ectopic mesenchymal Cre expression, if any, by either line would have no impact in Cre/Ednrb mutants. Lastly, the reviewer’s idea could have been a plausible hypothesis at the onset of the project, but here we show positive evidence for a different explanation. We do not rigorously exclude the reviewer’s hypothesis, nor other theoretically possible models, but we think we have provided a strong case to support the direct involvement of Ret and Ednrb in ENS progenitors rather than in surrounding non-neural mesenchyme.

      -  No consideration of glia - are these derived from both lineages?

      To properly address this question would require new reagents and analyses that we have not yet initiated. While an interesting question from a developmental biology standpoint, we don’t think that this investigation would change any of the interpretations that we make in the manuscript.

      -  No discussion of how these observations may fit in with recent work that suggests a mesenchymal contribution of enteric neurons (PMID: 38108810).

      The recent paper cited by the reviewer is very explicit in describing this mesenchymal contribution to the ENS as occurring after postnatal day P11. Other than the terminal Hirschsprung phenotype, all of our analysis of cell lineage migration and fate and colonic aganglionosis was conducted at embryonic or early (P9) postnatal stages. We therefore do not see a relation of our work to this study. In light of this paper, however, we do agree that it would be worthwhile in a future study to explore Wnt1Cre and Pax2Cre lineage dynamics in the ENS of older mice.

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors should reanalyze multiple single-cell RNA-seq datasets available now, to see if these cells are detected in those studies and then look at the global transcriptional profile of these Pax2-positive cells compared to the other vagal neural crest-derived ENCDCs. Some of these datasets can be found here - PMIDs: 33288908, 37585461, and https://www.gutcellatlas.org/.

      We disagree that the datasets from previous studies provide additional insights that are relevant to the current study. It must be appreciated that Wnt1Cre and Pax2Cre are genetic lineage tracers and that migratory ENS progenitor cells labeled with these reagents do not maintain expression of Wnt1 and Pax2 mRNA or protein. The Wnt1 and Pax2 genes are only transiently expressed within their distinct regions of the ectoderm, and their expression turns off as cells delaminate and begin migration. Thus, Pax2Cre-labeled ENS progenitor cells are not Pax2-positive thereafter. The single cell RNA-Seq studies suggested by the reviewer were collected from older embryos and postnatal mice, and do not represent the E10.5-E11.5 period that accounts for genesis of Ret-mediated and Ednrb-mediated Hirschsprung disease pathology. Even with the most recent work by Zhou et al (Dev Cell, 2024) that included E10.5 cells, this analysis only evaluated neural crest-derived Sox10Cre lineage cells, which does not include the placode-derived Pax2Cre lineage (as we show explicitly in Fig. 2-figure supplement 2).  Consequently, it would not be possible to find the “Pax2-positive cells” in these datasets. Performing a new transcriptomic analysis by isolating Pax2Cre-lineage and Wnt1Cre-lineage cells at the appropriate developmental time points could be the basis of future studies, but we think these are beyond the scope of the present paper. 

      (2) Even in their current quantification method of using immunofluorescent cells in a microscopic field, the authors count very few cells. The quantification in Figures 2v-2z is only from 4 embryos and is in the hundreds. This leads to misrepresentation of cell numbers and is best reflected in Figure 2x, where Wnt1Cre/Ret GI tracts have 0 Ret +ve cells, which we now know is not true even in ubiquitous Ret null embryos, where Ret null cells are detected as late as E14.5 (PMID 37585461)

      Because of the reviewer’s comment, we recognize that the specific detail about cell numbers wasn’t properly written. We didn’t count a few hundred cells in total; we counted a few hundred cells per embryo. Exact numbers are provided in the revised figure legend where “cells/embryo” is now explicitly stated. Multiplied by the number of embryos, this means that we evaluated approx. 1000 total cells per genotype and time point in cases where Ret+ and/or GFP+ (lineage+) cells were found. The total absence of such cells in Wnt1Cre/Ret mutants is a rigorous conclusion. Our results neither misrepresent nor contradict the study by Vincent et al (PMID 37585461). Our analyses were performed on gut tissue isolated at E10.5 and E11.5 stages, which is long before Schwann cell precursors (SCPs, the primary focus of the Vincent et al study) colonize the gut (E14.5; Uesaka et al, 2015. PMID: 26156989). Indeed, as the reviewer notes, SCPs migrate into the gut in a Ret-independent manner. Because our analysis addresses a much earlier time point, our focus is on the cranial ectoderm sources of ENS progenitors. We have adjusted the text associated with Fig. 2 to make this more clear.

      (3) There are multiple sections in the manuscript that rehash already known facts, like the whole section about Wnt1 conditional Ret null mice which show failure of migration of ENCDCs. This has been shown multiple times and doesn't add anything to the author's story.

      We think this comment stems from the reviewer’s perception that the Pax2Cre lineage is a subset of neural crest. The Wnt1Cre data (including Ret-deficient and Ednrb-deficient embryos) presented in the manuscript are not intended to rehash what is already known but to establish important similarities and differences between the newly identified placode-derived and the well-established neural crest-derived ENS progenitor cells. In light of the reviewer’s suggestion #8 below, to move the Wnt1Cre lineage analysis to a supplement, this information remains in the main text to provide proper comparison to the Pax2Cre-lineage profile. We think we were fair in the text to the legacy of work on neural crest and ENS development and were explicit in using our Wnt1Cre analysis to compare to the Pax2Cre lineage. Finally, we point out that our analysis was conducted on a different genetic background (outbred ICR) compared to previous studies, and there are strain-specific differences in Hirschsprung-associated lethality between our background and previous studies, so it was not impossible that the behavior of the neural crest cell lineage in the ICR background could be different from past observations on different backgrounds. Although we did not identify any major differences, it is important that the information on NC behavior in this background be presented. 

      (4) Also, the conclusion drawn for Figure 5C "this indicates that the Wnt1Cre-derived cells do not harbor a cell-autonomous response to GDNF" seems to suggest the authors are not very well versed with the ENS literature. GDNF as well as EDN3 are expressed from surrounding mesenchyme and are cell non-autonomous.

      The reviewer seems to have misread or misunderstood the specific statement as well as the more important broader conclusion of the experiment. First, of course the source of GDNF ligand in vivo is the mesenchyme. The explant assay was designed to eliminate this and then to substitute GDNF as provided experimentally. The focus of the experiment was to address the response to GDNF, not the source of GDNF. But more importantly, the experiment revealed a surprising outcome that the reviewer did not appreciate. In Pax2Cre/Ret mutants, the Wnt1Cre lineage still expresses Ret, yet does not grow out from the gut explant when provided with GDNF. This shows that the neural crest lineage requires Ret function in placode-derived cells in order to respond to GDNF. In other words, despite expressing Ret, the NC lineage does not harbor a cell-autonomous response to GDNF, as we wrote. Because this might be confusing to some readers, we have revised the description of this analysis to hopefully be more clear.

      (5) The fact that Ret and Ednrb signaling pathways interact is not a novel finding and has been reported multiple times in Ret and Ednrb mutant mice and cell lines (PMID: 12355085, 12574515, 27693352, 31818953), potentially through shared transcription factors (PMID: 31313802). It would have been more relevant if the authors could show how the specific tyrosine residue (Y1015) in Ret is phosphorylated in the presence of Ednrb.

      The observation that human mutations in RET and EDNRB both cause Hirschsprung disease is decades old, and of course numerous studies in human, mouse, and cells have addressed the relation between the two signaling pathways. We did not mean to imply that we were the first to discover that Ret and Ednrb signaling pathways interact. The reviewer cites a number of papers all from the Chakravarti lab that address this phenomenon; while these are a valuable contribution to the field, there is still more to be learned. The model elaborated in PMID: 31313802, in which Ret and Ednrb are both enmeshed in a common gene regulatory network, does not readily explain why each has a different phenotypic manifestation and doesn’t take into account the importance of the placodal lineage. The main new contributions of our paper are the existence of a new cell lineage that contributes to the ENS, and that the placodal and neural crest lineages utilize Ret and Ednrb signaling differently. The clarification of how these elements are differentially used by the two lineages explains long-segment and short-segment Hirschsprung disease (Ret and Ednrb mutants, respectively) far better than in past studies. The reviewer unfortunately dismisses these insights and seems to feel that a biochemical exploration of one specific component of the signaling interaction (Y1015 phosphorylation) would be more relevant. Such an exploration should be the basis of future studies and is beyond the scope of the new findings reported in the present paper.

      (6) What is the mechanism of the presence of Y1015 phosphorylation in 33% of Ednrb deficient Pax2Cre cells? It appears to me what the authors report as absent phosphorylation in the 67% of cells could be just weak staining or cells missing in prep.

      The reviewer, referring to Fig. 7q, presumably meant to say Wnt1Cre rather than Pax2Cre. The reviewer overlooked that we provided an explanation for this observation in our original manuscript. This sentence reads “Because Ednrb is expressed only in a subset of Wnt1Cre-derived enteric progenitor cells (Figure 7 – figure supplement 1), the residual Y1015 phosphorylation observed in Wnt1Cre/Ednrb mutant cells is likely to occur in the Ednrb-negative Wnt1Cre-derived cell population”. The sentence is retained unchanged in the revised manuscript. The explanation is not because of weak staining or problems with tissue preparation.

      (7) The references the authors cite regarding the previous discovery of Ret expression in the nucleus are incorrect. The review articles the authors cite do not mention anything about Ret expression in the nucleus. The evidence of nuclear localization of Ret previously comes from overexpression studies in HEK293 cells (PMID: 25795775). Such overexpression studies are fraught with generating noisy data for well-documented reasons. But if this observation is correct, the authors miss a great opportunity to identify what the Ret protein is doing in the nucleus. Is it in direct contact with its known transcription factors like Sox10 and Rarb? This would shed a lot of light on the possible mechanism of Ret LoF observed in Ret mutant mice

      The reviewer overlooked that one of the review articles that we cited (Chen, Hsu, & Hung, 2020) has a dedicated paragraph for RET (section 3.14), which summarizes the work by Bagheri-Yarmand et al (PMID: 25795775), the very paper noted by the reviewer in the comment above. The reviewer also somewhat misstated the results of the Bagheri-Yarmand et al study. By immunostaining, this paper showed nuclear localization of endogenous Ret, albeit a version of Ret with a disease-associated mutation that makes it constitutively active by constitutive autophosphorylation. Nonetheless, this was endogenous Ret. The paper also used overexpression of GFP-tagged RET in HEK293 cells to show that wildtype RET can behave in a similar manner, at least under these circumstances. Our point is simply that Ret (and other receptor tyrosine kinases) can be found in the nucleus in certain biological contexts, and our observations are consistent with this precedent.

      The reviewer also suggests a biochemical follow-up analysis related to this observation, which we agree would be of interest. Such an investigation however is beyond the scope of the present study.

      (8) The manuscript could benefit from a major rewrite by reorganizing sections to make it easy for the readers to follow the narrative.

      Many sections about the role of Ret and Ednrb in Wnt1cre-derived ENCDCs can be moved to a supplement. These facts are well-documented and have been proven before.

      This was addressed in our response to comment #3 of this reviewer. The figures have been kept as main figures in the revised manuscript to allow side-by-side comparison to parallel analysis of the Pax2Cre lineage.

      - The observation that only a handful of Pax2Cre cells at E10.5 express Ret and the observation that conditional Ret null abrogates these cells at E11.5, are not presented together and makes connecting these two facts difficult.

      Ret expression at E10.5 and E11.5 are both shown in the same figure (Fig. 2). In the presentation of these results, we first describe in normal development that Ret is expressed differently in E10.5 ENS progenitors between the Pax2Cre and Wnt1Cre lineages. This is additional support for the argument that the two lineages are molecularly distinct. Then comes evaluation of postnatal fates with different markers before we return to embryonic Ret expression. We acknowledge that this can make it difficult to connect these observations. We decided to retain the original organization in order to not lose this important conclusion. However, we have revised the text to hopefully make this connection between the sections more congruent.

      Reviewer #2 (Recommendations For The Authors):

      - The labeling of some as "figure supplements" is really hard to follow in the text and confusing to interpret when a main figure or supplemental figure is being referenced, and which one.

      We understand this comment, but this is journal style and outside of our control. We have kept the journal format in the revised manuscript.

      - The data in Figures 3b-c is well established in the field and somewhat misinterpreted. NOS1 neurons in the mouse ENS and their projections have been well described (Sang and Young, 1996, and other studies). CGRP immunoreactivity would reflect both ENS CGRP-expressing neurons and visceral afferents from DRG.

      There of course is a history of analysis of NOS1, CGRP, and other markers in the ENS. The focus of the analysis in Fig. 3 is to demonstrate how the cells that express these markers are impacted by gene manipulation in the Wnt1Cre and Pax2Cre lineages. For the giant migrating contractions that are associated with defecation, ample past electrophysiological studies have established that mechanosensory CGRP+ neurons trigger NOS+ inhibitory neurons (and ACh+ excitatory neurons) of the myenteric plexus to propel colonic contents. Thus, these are the relevant markers to explain the lack of colonic peristalsis in Ednrb-deficient mice. To our awareness, our results with NOS1 do not contradict any past study, including the Sang and Young 1996 description. Regarding CGRP, indeed the reviewer is correct that this marker is expressed by both neuronal subtypes. Two arguments support the specific derivation of ENS mechanosensory neurons from the Pax2 lineage. First, the ENS and DRG neurons can be distinguished by the location of their cell bodies and their axon extensions in the gut wall; only the ENS neurons are deficient in Pax2Cre/Ednrb mutants (as documented in Fig. 3). Second, the DRG population is derived from neural crest and is not labeled by Pax2Cre. If this population of CGRP+ neurons had functional relevance to colonic peristalsis, this would not be altered in Pax2Cre/Ednrb mutants. Indeed, the CGRP+ afferent nerve endings of DRG origin in the distal colon are mechanical distension sensors but do not modulate either ENS or autonomic nervous system activity (PMID: 37541195). We believe that our interpretation is correct.

      - The evidence in Figure 3 supporting the claim that NOS1 and CGRP-expressing enteric neurons come from distinct lineages is weak. IHC for CGRP is notoriously poor at labeling soma in the ENS. IHC for tdTomato to ensure the detection of low levels of Tomato expression and quantification of observations would strengthen this claim.

      CGRP is a vesicular peptide which is stored and transported in vesicles, therefore the antibody against CGRP labels vesicular particles of soma and synaptic vesicles along the axons of those CGRP-producing neurons.

      It is not expected to label the entire cytoplasm (or the range of subcellular organelles) as NOS antibody does. We did include quantification of data in Figure 3-figure supplement 1 in the manuscript to support the claim of lineage derivation. As described in the Methods section of the manuscript, we used binary threshold selection for Tomato+ cell count using Fiji-ImageJ, which detects both TomatoHigh and TomatoLow cells as Tomato+; we feel this is equal to or even superior to IHC for this analysis.

      - IHC panels in Figures 3h-o are largely uninterpretable. Most of the signal seems to be non-specific background staining in the mucosa and quantification of mucosal signal in this context does not seem meaningful.  

      We disagree with the reviewer’s comment. As described in the response above, CGRP+ mechanosensory neurons send their peripheral axon projections to innervate mucosa (sensory epithelial cells), and NOS+ inhibitory motor axons innervate the circular muscle. Thus, panels h-o of Fig. 3 focus on the axonal profile and are not intended to visualize soma, which is why sagittal views are presented instead of flatmount views. All of the controls were performed side-by-side to confirm that the signal is real and interpretable.

      Note also that the colon does not have villi so this annotation should be revised.

      We appreciate that the reviewer brought this misstatement to our attention. We corrected this error in the revised manuscript.

      - Phospho-RET staining in Figure 7 is difficult to discern and interpret with high background. Positive and negative controls would strengthen these data.

      Fig. 7 shows phospho Ret-Y1015 staining in lineage-labeled Wnt1Cre/Ednrb/R26nTnG mutants. The strength of the signal to noise in the figure is a matter of Ret expression level and the quality of the anti-pY1015 antibody. We are not aware of a meaningful positive control that has been validated in the literature that we could use for comparison. The ideal negative control would be to perform the same analysis in Wnt1Cre/Ret/R26nTnG mutants, but because this manipulation eliminates the entire NC cell lineage from the colon, there would be no NC cells in which to visualize background staining in this lineage with this antibody when Ret protein is not present. We note that anti-pY1096 did not show a difference in staining between control and mutant, which supports the interpretation of a specific impact on pY1015. We also point out here, as in the text, that we do not yet have any validation that phosphorylation of Y1015 is functionally important in NC migration to the distal colon. Clearly, more work to address this role and to demonstrate the mechanism of phosphorylation of this specific residue in response to Edn3-Ednrb signaling will be needed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary:

      The authors set out to measure the diffusion of small drug molecules inside live cells. To do this, they selected a range of fluorescent drugs, as well as some commonly used dyes, and used FRAP to quantify their diffusion. The authors find that drugs diffuse and localize within the cell in a way that is weakly correlated with their charge, with positively charged molecules displaying dramatically slower diffusion and a high degree of subcellular localization. The study is important because it points at an important issue related to the way drugs behave inside cells beyond the simple "IC50" metric (a decidedly mesoscopic/systemic value). The authors conclude, and I agree, that their results point to nuanced effects that are governed by drug chemistry that could be optimized to make them more effective.

      We are grateful to the reviewer for summarizing the work and appreciate him/her pointing out that it is high time to consider the drug aggregation and high degree of subcellular localization while optimizing to make them more effective beyond the mesoscopic value like "IC50".

      Strengths: 

      The work examines an understudied aspect of drug delivery. 

      The work uses well-established methodologies to measure diffusion in cells 

      The work provides an extensive dataset, covering a range of chemistries that are common in small molecule drug design 

      The authors consider several explanations as to the origin of changes in cellular diffusion

      We are grateful to the reviewer for pointing out the strengths of the manuscript.

      Weaknesses: 

      The results are described qualitatively, despite quantitative data that can be used to infer the strength of the proposed correlations. 

      The statistical treatment of the data is not rigorous and not visualized according to best practices, making it difficult for readers to assess the significance of the findings. 

      Some important aspects of drug behavior are not discussed quantitatively, such as the cell-to-cell or subcellular variability in concentration. 

      It is unclear if the observed behavior of each drug in the cell actually relates to its efficacy - though this is clearly beyond the scope of this specific work.

      We have addressed the weaknesses found by the reviewer (see below in Reviewer #1 Recommendations For The Authors). Concerning the last point, it would indeed have been very valuable to find a relation between a drug's observable behavior and its efficacy, but as the reviewer indicates, it is beyond the scope of this work.

      Reviewer #2 (Public Review): 

      Summary:

      Blocking a weak base compound's protonation increased intracellular diffusion and fractional recovery in the cytoplasm, which may improve the intracellular availability and distribution of weakly basic, small molecule drugs and be impactful in future drug development. 

      We are thankful to the reviewer for summarizing our work and acknowledging that the points raised above can be impactful in future drug development.

      Strengths: 

      (1) The intracellular distribution of drugs and the chemical properties that drive their distribution are much needed in the literature. Thus, the idea behind this paper is of relevance. 

      (2) The study used common compounds that were relevant to others. 

      (3) Altering a compound's pKa value and measuring cytosolic diffusion rates certainly is insightful on how weak base drugs and their relatively high pKa values affect distribution and pharmacokinetics. This particular experiment demonstrated relevance to drug targeting and drug development.

      (4) The manuscript was fairly well written. 

      We are thankful to the reviewer for pointing out the strengths of the manuscript like the intracellular distribution of drugs and properties that drive it, which are missing in the literature.

      Weaknesses: 

      (1) Small sample sizes. 2 acids and 1 neutral compound vs 6 weak bases (Figure 1). 

      We fully agree with the reviewer on this point. However, the major limitation we have faced here is the small number of drug/drug-like molecules that fluoresce with sufficiently high quantum yields. For this study, we initially screened 1600 drugs for their fluorescence in the visible spectrum and their penetration into cells, resulting in 16 drugs. Of those, only a small number were suitable for FRAP, because the others had low quantum yields. For some of the molecules (Mitoxantrone, Primaquine), recovery was minimal, making them challenging to study. We added this information to the Materials and Methods section under “Selection of drugs used in this study” (p. 10).

      (2) A comparison between the percentage of neutral and weak base drug accumulation in lysosomes would have helped indicate weak base ion trapping. Such a comparison would have strengthened this study. 

      For weakly basic compounds, the ionic and non-ionic forms of the molecule always remain in equilibrium. The direction of the equilibrium depends on the pH of the medium, which determines the major form of the drug molecules in solution. Our example of the GSK3 inhibitor (neutral compound, pKa ~7.0, as predicted by Chemaxon) shows behaviour inside the cells that is very similar to that of the other basic drugs (pKa > 8). As the lysosomal pH is about 5.0, the neutral drug also gets protonated inside the lysosomes, as the colocalization study reveals (Figure 4). We added Fig. S16C-D, which shows that all three weak-base drugs co-localize with acidic lysosomes, to degrees ranging from moderate to extensive. See also p. 11 under the “Confocal microscopy and FRAP Analysis” section.
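
The pH dependence described in this reply can be illustrated with the Henderson-Hasselbalch relation. The following is a minimal sketch, not the authors' analysis: it assumes a single protonation site per molecule, uses the lysosomal pH of 5.0 and the predicted pKa of 7.0 quoted above, and adds two illustrative assumptions that are not from the response (a cytosolic pH of 7.2 and a generic base with pKa 8.0).

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a monoprotic weak base present in its protonated (cationic) form.

    Henderson-Hasselbalch for a base B + H+ <-> BH+:
        [BH+] / ([B] + [BH+]) = 1 / (1 + 10**(pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


LYSOSOME_PH = 5.0  # value quoted in the response
CYTOSOL_PH = 7.2   # assumed typical value, not taken from the response

compounds = {
    "GSK3 inhibitor (predicted pKa 7.0)": 7.0,  # pKa quoted in the response
    "illustrative weak base (pKa 8.0)": 8.0,    # hypothetical example compound
}

for name, pka in compounds.items():
    in_cytosol = protonated_fraction(pka, CYTOSOL_PH)
    in_lysosome = protonated_fraction(pka, LYSOSOME_PH)
    print(f"{name}: {in_cytosol:.1%} protonated in cytosol, "
          f"{in_lysosome:.1%} protonated in lysosome")
```

Under these assumptions, even a compound with a pKa near 7 is almost fully protonated at lysosomal pH, which is consistent with the argument that a nominally neutral compound can still be trapped in acidic compartments.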

      (3) When cytosolic diffusion rates of compounds were measured, were the lysosomes extracted from the image using Imaris to determine a realistic cytosolic value? In real-time, lysosomes move through the cytosol at different rates. Because weak base drugs get trapped, it is likely the movement of a weak base in the lysosome being measured rather than the movement of a weak base itself throughout the cytosol. This was unclear in the methods. Please explain.

      We want to thank the reviewer for pointing this out. To clarify the point, we added the following text to the Materials and Methods section on p. 13: “When the areas of bleach were selected in the drug-treated cell cytoplasm, we avoided the lysosomes as much as possible, within the resolution limits of the confocal microscope. Lysosomes themselves were measured to move within the cytoplasm with a diffusion coefficient of 0.03-0.071 µm² s⁻¹ (Bandyopadhyay et al., 2014), which is much slower than the diffusion measured for even the slowest compounds using fast Line FRAP, further validating that we did not measure lysosome diffusion.” In addition, we show that in cells after Bafilomycin A1 or Na-Azide treatments the number of lysosomes was reduced drastically (Figures S8 & S9, and Figure 7), while the rates of diffusion remain very slow, similar to those measured without lysosomal inhibitors.

      (4) Because weak base drugs can be protonated in the cytoplasm, the authors need to elaborate on why they thought that inhibiting lysosome accumulation of weak bases would increase cytosolic diffusion rates. Ion trapping is different than "micrometers per second" in the cytosol. Moreover, treating cells with sodium azide de-acidifies lysosomes and acidifies the cytosol; thus, more protons in the cytosol means more protonation of weak base drugs. The diffusion rates were slowed down in the presence of lysosome inhibition (Figure 7), which is more fitting of the story about blocking protonation increases diffusion rates, but in this case, increasing cytosolic protonation via lysosome de-acidification agents decreases diffusion rates. Please elaborate.

      We thank the reviewer for the comment. We added the following to the results on p. 7 (top): “While we selected bleach spots to be small and located outside of lysosomes, this does not assure that some of the bleached area does not include smaller lysosomes. Therefore we investigated whether inhibiting lysosomal trapping would eliminate slow diffusion of cationic drugs.” In addition, we added the following to the results on p. 7-8: “Comparative FRAP profiles and diffusion coefficients (Figure 7B-D and 7F-H) were slow, but conversely to Bafilomycin, sodium azide treatment did cause a further reduction in rates from Dconfocal 2.4±0.1 µm² s⁻¹ to 1.8±0.1 µm² s⁻¹ for quinacrine and from 0.6 to 0.45 µm² s⁻¹ for the GSK3 inhibitor (Figure 7C and G). Both Bafilomycin and sodium azide treatments resulted in elimination of drug confinement in the lysosome, and the small difference in diffusion rates may be a result of the de-acidification of the lysosomes by sodium azide, which may increase the protons in the cytosol upon treatment.”

      Reviewer : A discussion of the likely impact: 

      The manuscript certainly adds another dimension to the field of intracellular drug distribution, but the manuscript needs to be strengthened in its current form. Additional experiments need to be included, and there are clarifications in the manuscript that need to be addressed. Once these issues are resolved, then the manuscript, if the conclusions are further strengthened, is much needed and would be insightful to drug development.

      Reviewer #1 (Recommendations For The Authors):

      Major issues: 

      The paper suffers from poor statistical treatment of the data. FRAP recovery curves should be shown for each repeat, overlaid by an average with SDs as errorbars or shaded regions shown. In bar plots, SEMs should be eliminated in favor of StdDevs. All datapoints should be shown for each bar in Figs. 3-8. To show differences in D_confocal appropriate statistical tests should be conducted. In addition it is unclear what an "independent repeat" is. Does this mean 30 separate imaging sessions/drug treatments/etc? Is it 30 cells on the same coverslip? Is it a combination of both? All reported errors, SD or SEM, should have a single significant digit. Guidelines and best practices for representing quantitative imaging data are all described and visualized in detail in Lord et al. JBS 2020. 

      We improved the statistics, added the individual progression curves, and performed the statistics on them as requested. See Figure S2 for individual FRAP curves of fluorescein, GSK3 inhibitor and quinacrine. Statistical analysis of the individual FRAP curves is in Figures 3B, 4B, 5B, 7C and G. For details see the figure legends and Materials and Methods p. 13 in “Determination of Dconfocal from FRAP results”. Line FRAP was done on cells taken from different plates, treated independently (see text p. 13).

      The extensive (and commendable!) dataset the authors have collected can be put to better use than what is currently done. The main text figures in the current form of the preprint are mostly descriptive and their discussion is qualitative, to the point where the author's conclusions are supported only anecdotally. Instead, I would much rather see panels that collate the entire dataset (both protein and drugs) numerically, comparing diffusion values in buffer/cytoplasm/nucleus for all drugs (Like Fig. S6, which is in my opinion the most important in the paper but for some reason relegated to the SI). In addition I would like to see correlations within the dataset, such as D_confocal vs. pKa, vs. concentration (as measured by overall fluorescence signal, see my comment below), vs. mw, or vs. specific chemical moieties (number of charges, aromatic rings, etc). Such correlations should be discussed in terms of a correlation coefficient if conclusions were to be drawn from them, and include errors if available. 

      We want to thank the reviewer for these suggestions. We now made new Figures 9, and S16 to compare multiple parameters. Figure 9C shows a clear relation between pKa and Dconfocal, but no relation was found between logP, MW or number of aromatic rings and Dconfocal. Fig. S3 also shows the relation between drug concentration and Dconfocal values. These data are now discussed in the discussion section in p. 9 (bottom). 

      The drug sequestration hypothesis and other conclusions brought forth by the authors could be further tested by looking at the concentration dependence of the drugs inside each cell and/or its partitioning between different subcellular compartments. The concentration dependence of these drugs is discussed in a very anecdotal fashion using two concentrations - and despite some cases showing an effect no further studies were done. Drug concentrations in this experiment can vary between cells between repeats or even within a single repeat as a result of drug chemistry and delivery methods (microinjection/passive permeability). This is especially important since it is unclear what clinically-relevant concentrations are for each drug (or at least an IC50 for the cell types tested here). I would like to see a quantitative measure of concentrations as another metric to compare diffusion behavior (see my comment above as well).

      And maybe one thing to consider in addition would be some discussion in the paper about what sub-cellular distributions might actually mean in the context of drug efficacy (asking for myself as well!) - a paragraph describing recent works on the topic with some references could be instructive. 

      We want to thank the reviewer for the suggestion. We have now added Figure S3, showing the relation between fluorescence intensity in each cell (which is directly related to the concentration of the compound) and FRAP rates and percent recovery for fluorescein, GSK3 inhibitor and quinacrine. The results show no relation between drug concentration and FRAP rates, and some relation to percent recovery. These data are now discussed in the main text (p. 4 bottom and p. 6) and in the discussion (p. 9, bottom).

      Minor issues: 

      Readers could benefit from a schematic showing the line FRAP method. It is difficult to understand from the text.

      We show now in Figure 2 the line-FRAP method, and discuss it in the introduction (p. 3 top).

      Have the authors considered enrichment in the cell membrane? Summed intensity projections or co-labeling with membrane dyes could prove useful to identify if the membrane is enriched in fluorescence.

      The microscopy slides, including the super-resolution image in Figure S15, do not show enrichment in membranes.

      Cell extracts obtained by chemical lysis are problematic because they contain surfactants. This comparison might not be meaningful. 

      The reviewer is correct about surfactants; However, this is only for illustration to show the crowd density of the cell extracts compared to live cells.

      Unclear why "Bleach size" plots are shown. They are not discussed in the main text. 

      We show now a bleach size plot in Figure 2, where we explain the method. We removed them from the other figures.

      Some figure panels have a strange aspect ratio, causing text to look distorted. 

      We corrected the figure distortion in the revised manuscript.

      How are the values of D_confocal in buffer compared with past literature? Should these not all be diffusion limited? BCECF - larger than many of the drugs used here - shows ~ 100 μm^2/s in buffer (Verkman TiBS 2002).

      We discussed this in our previous work (Ref. 13, iScience 2022, Dey et al.). Dconfocal is a relative diffusion rate and should not be confused with single-molecule diffusion coefficients. FRAP cannot measure diffusion faster than 100 μm^2/s in buffer. Moreover, comparing apparent FRAP rates between different fluorophores is not quantitative, because the bleach radius has a major effect on the measured rates. The proper way to compare is the rate constant normalized by bleach radius^2, i.e., our Dconfocal (Ref. JMB 2021, iScience 2022 by Dey et al.).
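
      As a rough illustration of why the bleach radius has to enter the comparison, the sketch below uses one commonly used simplified expression for confocal FRAP, D = (r_e^2 + r_n^2) / (8 * t_half), where r_e is the effective (measured) bleach radius and r_n the nominal one. Whether this is the exact expression behind Dconfocal is an assumption here, since the response only states that the rate is normalized by the squared bleach radius. The function name and all numbers are hypothetical.

```python
def d_confocal(t_half_s: float, r_effective_um: float, r_nominal_um: float) -> float:
    """Apparent diffusion rate (um^2/s) from a FRAP half-time and bleach radii.

    Assumed form: D = (r_e^2 + r_n^2) / (8 * t_half). This is an illustrative
    choice, not necessarily the authors' exact definition of Dconfocal.
    """
    if t_half_s <= 0:
        raise ValueError("half-time must be positive")
    return (r_effective_um ** 2 + r_nominal_um ** 2) / (8.0 * t_half_s)


# Hypothetical values: identical recovery half-time, different effective bleach radius.
narrow = d_confocal(t_half_s=0.5, r_effective_um=1.0, r_nominal_um=1.0)
wide = d_confocal(t_half_s=0.5, r_effective_um=3.0, r_nominal_um=1.0)
print(narrow, wide)
```

      Under this assumed form, two fluorophores with the same recovery half-time but different bleach radii yield different apparent rates, which is why raw FRAP rates alone are not comparable.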

      Reviewer #2 (Recommendations For The Authors): 

      Recommendations: 

      (1) Page 3 at the bottom of the Introduction states, "...sodium azide (Hiruma et al., 2007) inhibited accumulation in lysosomes, cellular diffusion...increased only slightly." However, Figure 7C, F shows a sodium azide-induced decrease in the Dconfocal cellular diffusion. Please clarify.

      Thank you for pointing this out; we corrected it in the revised version, including adding statistics.

      (2) Page 6 states, "Quinacrine accumulation in the lysosome was observed also immediately after micro-injection, with aggregation increasing over time. Dconfocal of 4.2±0.2 µm2 s-1 was calculated from line-FRAP immediately after micro-injection, slowing to 2.2±0.1 µm2 s-1 following 2 hours incubations, with fractional recoveries of 0.63 and 0.57 respectively." If lysosome sequestration does not have an effect on cytosolic diffusion rates as the manuscript concludes, why do the authors think the diffusion rate decreased here within 2 hours? A solid conclusion would strengthen the conclusions of this manuscript rather than passing over it.

      Thank you for pointing this out. We added the following text to page 7: “It is notable that the Dconfocal for Quinacrine remained consistent regardless of Bafilomycin treatment, 2 hours after incubation (Fig. S9D, 2.4±0.1 µm2s-1). However, when measured immediately after injection, the diffusion coefficient was higher at 4.2 µm2s-1 (Fig. S5D). This result does not support the notion that the faster diffusion measured immediately after cellular injection relates to lysosomal aggregation, and would better support self-aggregation, or aggregation with other molecules in the cell, which increases over time. This notion is further supported by the almost complete lack in FRAP observed 24 hours after injection (Fig. S5C).”

      (3) In the Results section, the subheading states, "Inhibition of lysosomal sequestration is only slightly increasing diffusion in cells", but the conclusion for bafilomycin was "...Dconfocal values were not altered by Bafilomycin A1", and the conclusion for sodium azide was "diffusion coefficients (Figure 7B-C and 7E-F) were not much changed for the two drugs and stayed low... similarly to what was observed with Bafilomycin." The clear question is: what is the result? Did the treatments slightly increase diffusion, decrease diffusion, or have no significant effect at all? Please clarify the wording in the manuscript to accurately describe the results.

      Indeed, a small difference is observed between the two treatments. We have now added statistical significance to Fig. 7D and H and to Fig. S8 and S9. In addition, we clarified this point in the text on p. 7-8: “Comparative FRAP profiles and diffusion coefficients (Figure 7B-D and 7F-H) were slow, but conversely to Bafilomycin, sodium azide treatment did cause a further reduction in rates from Dconfocal 2.4±0.1 µm2s-1 to 1.8±0.1 µm2s-1 for quinacrine and from 0.6 to 0.45 µm2s-1 for the GSK3 inhibitor (Figure 7C and G). Both Bafilomycin and sodium azide treatments resulted in elimination of drug confinement in the lysosome, and the small difference in diffusion rates may be a result of the de-acidification of the lysosomes by sodium azide, which may increase the protons in the cytosol upon treatment.”

      (4) In Figure 8B, why was the Dconfocal for AM-fluorescein with or without sodium azide not included here? Besides consistency, the results might demonstrate significance. Please elaborate on the occlusion of this data. 

      Fraction recovery after FRAP of AM-fluorescein was very low. Calculating Dconfocal rates with such low fraction recovery is meaningless, as in the time of measurement only a small fraction recovered. Therefore, we calculated Dconfocal only when fraction recovery was at least 0.5.
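
      The gating rule described above can be sketched as follows. The 0.5 threshold comes from the response; the definition of fraction recovery used here (recovered signal divided by the bleached depth) is a standard one but is an assumption about the authors' pipeline, and the function names and intensities are hypothetical.

```python
from typing import Optional

MIN_FRACTION_RECOVERY = 0.5  # threshold stated in the response


def fraction_recovery(pre_bleach: float, post_bleach: float, plateau: float) -> float:
    """Fraction of the bleached signal that has recovered at the plateau.

    Assumed definition: (plateau - post_bleach) / (pre_bleach - post_bleach).
    """
    if pre_bleach <= post_bleach:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    return (plateau - post_bleach) / (pre_bleach - post_bleach)


def gated_rate(rate: float, pre_bleach: float, post_bleach: float,
               plateau: float) -> Optional[float]:
    """Return the fitted rate only if enough of the signal recovered, else None."""
    recovered = fraction_recovery(pre_bleach, post_bleach, plateau)
    return rate if recovered >= MIN_FRACTION_RECOVERY else None


# Hypothetical intensities (arbitrary units).
print(gated_rate(2.4, pre_bleach=100, post_bleach=20, plateau=70))  # recovery 0.625, rate kept
print(gated_rate(2.4, pre_bleach=100, post_bleach=20, plateau=30))  # recovery 0.125, rate dropped
```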

      (5) Throughout the Results section, the ideas and experiments are of relevance, but the suggestions/conclusions at the end of each paragraph of this section seem lightly thought out. For example, as stated on Page 8, "...however, this did not contribute new information to the puzzle." For a chemistry paper, a chemical suggestion strengthens the manuscript. 

      We thank the reviewer for these suggestions. We have now made new Figures 9 and S16 to compare multiple parameters. Figure 9C shows a clear relation between pKa and Dconfocal, but no relation was found between logP, MW or number of aromatic rings and Dconfocal. Fig. S16 also shows the relation between drug concentration and Dconfocal values. We revised the discussion section to give more weight to these quantitative assessments. These data are now discussed on p. 9.

      In conclusion, the manuscript's ideas are needed, but the conclusions drawn from the experiments need to be strengthened, more explanatory, and consistent with the main conclusion of the manuscript.

      See answer to point 5.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Mehmet Mahsum Kaplan et al. demonstrate that Meis2 expression in neural crest-derived mesenchymal cells is crucial for whisker follicle (WF) development, as WF fails to develop in wnt1-Cre;Meis2 cKO mice. Advanced imaging techniques effectively support the idea that Meis2 is essential for proper WF development and that nerves, while affected in Meis2 cKO, are dispensable for WF development and not the primary cause of WF developmental failure. The study also reveals that although Meis2 significantly downregulates Foxd1 in the mesenchyme, this is not the main reason for WF development failure. The paper presents valuable data on the role of mesenchymal Meis2 in WF development. However, further quantification and analysis of the WF developmental phenotype would be beneficial in strengthening the claim that Meis2 controls early WF development rather than causing a delay or arrest in development. A deeper sequencing data analysis could also help link Meis2 to its downstream targets that directly impact the epithelial compartment.

      Strengths:

      (1) The authors describe a novel molecular mechanism involving Mesenchymal Meis2 expression, which plays a crucial role in early WF development.

      (2) They employ multiple advanced imaging techniques to illustrate their findings beautifully.

      (3) The study clearly shows that nerves are not essential for WF development.

      We thank the reviewer for valuable comments that will help improve our study.

      Weaknesses:

      (1) The authors claim that Meis2 acts very early during development, as evidenced by a significant reduction in EDAR expression, one of the earliest markers of placode development. While EDAR is indeed absent from the lower panel in Figure 3C of the Meis2 cKO, multiple placodes still express EDAR in the upper two panels of the Meis2 cKO. The authors also present subsequent analysis at E13.3, showing one escaped follicle positive for SHH and Sox9 in Figures 1 and 3. Does this suggest that follicles are specified but fail to develop? Alternatively, could there be a delay in follicle formation? The increase in Foxd1 expression between E12.5 and E13.5 might also indicate delayed follicle development, or as the authors suggest, follicles that have escaped the phenotype. The paper would significantly benefit from robust quantification to accompany their visual data, specifically quantifying EDAR, Sox9, and Foxd1 at different developmental stages. Additionally, analyzing later developmental stages could help distinguish between a delay or arrest in WF development and a complete failure to specify placodes.

      The earliest DC (Foxd1) and placodal (EDAR, Lef1) markers tested in this study were observed only in the escaped WFs, whereas these markers were missing at the expected WF sites in mutants. This was also reflected in the loss of typical placodal morphology in the mutant epithelium. On the other hand, escaped WFs developed normally, as shown by the analysis in Supp Fig 1A-B showing their normal size. These data suggest that development of escaped WFs is not delayed, because delayed follicles would appear smaller in size. To strengthen this conclusion, we will analyze whiskers at E18.5 in Meis2 cKO mice by staining Edar, Foxd1, Sox9 and/or Lef1, and the results will be added to the revised manuscript. The two-week period for this provisional response is too short to gather all these data. As far as quantification is concerned, we have already quantified the number of whiskers in controls and mutants at E12.5 and E13.5 in all whole mount experiments we did, i.e. Shh ISH and Sox9 or EDAR whole mount IFC. We pooled all these numbers together and calculated the whisker number reduction to 5.7±2.0% at E12.5 and 17.1±5.9% at E13.5 (page 3, row 114). We will also quantify the whisker number at E15.5 and E18.5 in the revised manuscript.

      (2) The authors show that single-cell sequencing reveals a reduction in the pre-DC population, reduced proliferation, and changes in cell adhesion and ECM. However, these changes appear to affect most mesenchymal cells, not just pre-DCs. Moreover, since E12.5 already contains WFs at different stages of development, as well as pre-DCs and DCs, it becomes challenging to connect these mesenchymal changes directly to WF development. Did the authors attempt to re-cluster only Cluster 2 to determine if a specific subpopulation is missing in Meis2 cKO? Alternatively, focusing on additional secreted molecules whose expression is disrupted across different clusters in Meis2 cKO could provide insights, especially since mesenchymal-epithelial communication is often mediated through secreted molecules. Did the authors include epithelial cells in the single-cell sequencing, can they look for changes in mesenchyme-epithelial cell interactions (Cell Chat) to indicate a possible mechanism?

      We agree with the reviewer that the effects of Meis2 on cell proliferation and on expression of cell adhesion and ECM markers are more general, because they take place in the whole underlying mesenchyme. Our genetic tools did not allow specific targeting of DCs or pre-DCs. Nonetheless, we trust that our data show that mesenchymal Meis2 is required for the initial steps of WF development, including Pc formation. As far as bioinformatics data are concerned, this data set was taken from the large dataset GSE262468 covering the whole craniofacial region, which led to very limited cell numbers in cluster 2 (DC): WT_E12_2 --> 28, WT_E13_2 --> 131, MUT_E12_2 --> 19, MUT_E13_2 --> 28. Unfortunately, such small cell numbers did not allow further sub-clustering, efficient normalization, integration, or conclusions from their transcriptional profiles. Although a number of interesting differentially expressed genes were identified (see supplementary datasets), none of them convincingly pointed to a reasonable secreted-molecule candidate.
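
      The cell-number limitation can be made concrete with a small check. The counts are the ones reported above for cluster 2; the minimum of 200 cells per sample is a hypothetical threshold chosen only for illustration, not a value used by the authors.

```python
# Cluster 2 (DC) cell counts per sample, as reported in the response.
cluster2_cells = {
    "WT_E12_2": 28,
    "WT_E13_2": 131,
    "MUT_E12_2": 19,
    "MUT_E13_2": 28,
}

# Hypothetical minimum number of cells per sample for sub-clustering.
MIN_CELLS_FOR_SUBCLUSTERING = 200

too_small = sorted(
    sample for sample, n_cells in cluster2_cells.items()
    if n_cells < MIN_CELLS_FOR_SUBCLUSTERING
)
total_cells = sum(cluster2_cells.values())

print(f"total cells in cluster 2: {total_cells}")
print(f"samples below threshold: {too_small}")
```

      With these counts every sample falls below the illustrative threshold, and even the pooled cluster holds only about two hundred cells.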

      We agree with the reviewer that CellChat analysis could provide a robust indication of mesenchymal-epithelial communication; however, our datasets included only the mesenchymal cell population (Wnt1-Cre2 progeny), and epithelial cells were excluded by FACS prior to scRNA-seq (Hudacova et al. https://doi.org/10.1016/j.bone.2024.117297).

      (3) The authors aim to link Meis2 expression in the mesenchyme with epithelial Wnt signaling by analyzing Lef1, bat-gal, Axin1, and Wnt10b expression. However, the changes described in the figures are unclear, and the phenotype appears highly variable, making it difficult to establish a connection between Meis2 and Wnt signaling. For instance, some follicles and pre-condensates are Lef1 positive in Meis2 cKO. Including quantification or providing a clearer explanation could help clarify the relationship between mesenchymal Meis2 and Wnt signaling in both epidermal and mesenchymal cells. Did the authors include epithelial cells in the sequencing? Could they use single-cell analysis to demonstrate changes in Wnt signaling?

      We have now analyzed changes in Lef1 staining intensity in the epithelium and in the upper dermis. According to these quantifications, we observed a considerable decline in the number of Lef1+ placodes in the epithelium, which corresponds to the lower number of placodes. On the other hand, Lef1 intensity in the ‘escaped’ placodes was similar between controls and mutants. The Lef1 signal in the upper dermis is very strong overall, and its quantification did not reveal any changes in the DC and non-DC regions of the upper dermis. These data corroborate our conclusion that Meis2 in the mesenchyme is not crucial for dermal Wnt signaling but is required for induction of Lef1 expression in the epithelium. However, once ‘escaper’ placodes appear, they display normal Wnt signaling in Pc, DC and subsequent development. These quantification data will be added to the revised manuscript.

      (4) Existing literature, including studies on Neurog KO and NGF KO, as well as the references cited by the authors, suggest that nerves are unlikely to mediate WF development. While the authors conduct a thorough analysis of WF development in Neurog KO, further supporting this notion, this point may not be central to the current work. Additionally, the claim that Meis2 influences trigeminal nerve patterning requires further analysis and quantification for validation.

      We agree with the reviewer that analysis of the Neurogenin knockout mice should not be central to this report. Nonetheless, a thorough analysis of WF development in Neurog1 KO was needed to distinguish between two possible mechanisms: the whisker phenotype in Meis2 cKO results from (1) impaired nerve branching or (2) the function of Meis2 in the mesenchyme. We will modify the text accordingly to make this clearer to readers. We also agree that nerve branching was not extensively analyzed in the current study, but two samples from mutant mice were provided (Fig1 and Supp Videos), reflecting the consistency of the phenotype (see also Machon et al. 2015). This section was not central to this report either but led us to focus fully on the mesenchyme. We think that Meis2 function in cranial nerve development is very interesting and deserves a separate study.

      (5) Meis2 expression seems reduced but has not entirely disappeared from the mesenchyme. Can the authors provide quantification?

      In the revised manuscript, we will provide wt/mut quantification of Meis2 expression in the dermis.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Kaplan et al. study mesenchymal Meis2 in whisker formation and the links between whisker formation and sensory innervation. To this end, they used conditional deletion of Meis2 using the Wnt1 driver. Whisker development was arrested at the placode induction stage in Meis2 conditional knockouts leading to the absence of expression of placodal genes such as Edar, Lef1, and Shh. The authors also show that branching of trigeminal nerves innervating whisker follicles was severely affected but that whiskers did form in the complete absence of trigeminal nerves.

      Strengths:

      The analysis of Meis2 conditional knockouts convincingly shows a lack of whisker formation and all epithelial whisker/hair placode markers were analyzed. Using Neurog1 knockout mice, the authors show equally convincingly that whiskers and teeth develop in the complete absence of trigeminal nerves.

      We thank the reviewer for valuable comments that will help improve our study.

      Weaknesses:

      The manuscript does not provide much mechanistic insight as to why mesenchymal Meis2 leads to the absence of whisker placodes. Using a previously generated scRNA-seq dataset they show that two early markers of dermal condensates, Foxd1 and Sox2, are downregulated in Meis2 mutants. However, given that placodes and dermal condensates do not form in the mutants, this is not surprising and their absence in the mutants does not provide any direct link between Meis2 and Foxd1 or Sox2. (The absence of a structure evidently leads to the absence of its markers.)

      We apologize for the unclear explanation of our data. We meant that Meis2 is functionally upstream of Foxd1 because Foxd1 is reduced upon Meis2 deletion. This means that during WF formation, Meis2 operates before Foxd1 induction; it does not necessarily mean that Meis2 directly controls expression of Foxd1. We agree with the reviewer’s note that Foxd1 and Sox2, as known DC markers, decline because the number of WFs declines. We wanted to convince readers that Meis2 operates very early in the GRN hierarchy during WF development. We also admit that we provide limited mechanistic insight into Meis2 function as a transcription factor. We think that this weak point does not lower the value of the report, which shows the indispensable role of Meis2 in WFs and possibly all HFs.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review): 

      [...] Strengths: 

      The method the authors propose is a straightforward and inexpensive modification of an established split-pool single-cell RNA-seq protocol that greatly increases its utility, and should be of interest to a wide community working in the field of bacterial single-cell RNA-seq. 

      Weaknesses: 

      The manuscript is written in a very compressed style and many technical details of the evaluations conducted are unclear and processed data has not been made available for evaluation, limiting the ability of the reader to independently judge the merits of the method. 

      Thank you for your thoughtful and constructive review of our manuscript. We appreciate your recognition of the strengths of our work and the potential impact of our modified PETRI-seq protocol on the field of bacterial single-cell RNA-seq. We are grateful for the opportunity to address your concerns and improve the clarity and accessibility of our manuscript.

      We acknowledge your feedback regarding the compressed writing style and lack of technical details, which are constrained by the requirements of the Short Report format in eLife. We will address these issues in our revised manuscript as follows:

      (1) Expanded methodology section: We will provide a more comprehensive description of our experimental procedures, including detailed protocols for the ribosomal depletion step and data analysis pipeline. This will enable readers to better understand and potentially replicate our methods.

      (2) Clarification of technical evaluations: We will elaborate on the specifics of our evaluations, including the criteria used for assessing the efficiency of ribosomal depletion and the methods employed for identifying and characterizing subpopulations within the E. coli biofilm model.

      (3) Data availability: We apologize for the oversight in not making our processed data readily available. We have deposited all relevant datasets, including raw and source data, in appropriate public repositories (GEO number: GSE260458) and provide clear instructions for accessing this data in the revised manuscript.

      (4) Supplementary information: To maintain the concise nature of the main text while providing necessary details, we will include additional supplementary information. This will cover extended methodology, detailed statistical analyses, and comprehensive data tables to support our findings.

      (5) Discussion of limitations: We will include a more thorough discussion of the potential limitations of our modified protocol and areas for future improvement.

      We believe these changes will significantly improve the clarity and reproducibility of our work, allowing readers to better evaluate the merits of our method.

      Reviewer #2 (Public Review): 

      [...] Strengths: 

      The introduced rRNA depletion method is highly efficient, with the depletion for E.coli resulting in over 90% of reads containing mRNA. The method is ready to use with existing PETRI-seq libraries which is a large advantage, given that no other rRNA depletion methods were published for split-pool bacterial scRNA-seq methods. Therefore, the value of the method for the field is high. There is also evidence that a small number of cells at the bottom of a static biofilm express PdeI which is causing the elevated c-di-GMP levels that are associated with persister formation. Given that PdeI is a phosphodiesterase, which is supposed to promote hydrolysis of c-di-GMP, this finding is unexpected. 

      Weaknesses: 

      With the descriptions and writing of the manuscript, it is hard to place the findings about the PdeI into existing context (i.e. it is well known that c-di-GMP is involved in biofilm development and is heterogeneously distributed in several species' biofilms; it is also known that E.coli diesterases regulate this second messenger, i.e. https://journals.asm.org/doi/full/10.1128/jb.00604-15). There is also no explanation for the apparently contradictory upregulation of c-di-GMP in cells expressing higher PdeI levels. Perhaps the examination of the rest of the genes in cluster 2 of the biofilm sample could be useful to explain the observed association.

      Thank you for your thoughtful and constructive review of our manuscript. We are pleased that the reviewer recognizes the value and efficiency of our rRNA depletion method for PETRI-seq, as well as its potential impact on the field. We would like to address the points raised by the reviewer and provide additional context and clarification regarding the function of PdeI in c-di-GMP regulation.

      We acknowledge that c-di-GMP’s role in biofilm development and its heterogeneous distribution in bacterial biofilms are well studied. We appreciate the reviewer's observation regarding the seemingly contradictory relationship between increased PdeI expression and elevated c-di-GMP levels. This is indeed an intriguing finding that warrants further explanation.

      PdeI was predicted to be a phosphodiesterase responsible for c-di-GMP degradation. This prediction is based on sequence analysis where PdeI contains an intact EAL domain known for degrading c-di-GMP. However, it is noteworthy that PdeI also contains a divergent GGDEF domain, which is typically associated with c-di-GMP synthesis. This dual-domain architecture suggests a potential for complex regulatory roles. As reported, the knockout of the major phosphodiesterase PdeH in E. coli leads to the accumulation of c-di-GMP. Further, a point mutation on PdeI's divergent GGDEF domain (G412S) in this PdeH knockout strain resulted in decreased c-di-GMP levels, implying that the wild-type GGDEF domain in PdeI has a role in maintaining or increasing c-di-GMP levels in the cell. Additionally, PdeI contains a CHASE (cyclases/histidine kinase-associated sensory) domain. Combined with our experimental results demonstrating that PdeI is a membrane-associated protein, we predict that PdeI functions as a sensor that integrates environmental signals with c-di-GMP production under complex regulatory mechanisms. The experimental evidence, along with domain analysis, suggests that PdeI could contribute to c-di-GMP synthesis, rebutting the notion that it solely functions as a phosphodiesterase. Furthermore, our single-cell experiments showed a positive correlation between PdeI expression levels and c-di-GMP levels (Fig. 2J). HPLC LC-MS/MS analysis further confirmed that PdeI overexpression (induced by arabinose) upregulated c-di-GMP levels (Fig. 2K). Importantly, in our HPLC LC-MS/MS analysis, we compared the PdeI overexpression strain with the wild-type MG1655 strain, thereby excluding the influence of other genes in cluster 2. 
      In summary, while PdeI is predicted to be a phosphodiesterase based on its sequence and the presence of an EAL domain, the additional presence of a divergent GGDEF domain and experimental evidence suggests that PdeI has a function in upregulating c-di-GMP levels. These findings support the hypothesis that PdeI may have both synthetic and regulatory roles in c-di-GMP metabolism.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The work by Joseph et al "Impact of the clinically approved BTK inhibitors on the conformation of full-length BTK and analysis of the development of BTK resistance mutations in chronic lymphocytic leukemia" seeks to comparatively analyze the effect of a range of covalent and noncovalent clinical BTK inhibitors upon BTK conformation. The novel aspect of this manuscript is that it seeks to evaluate the differential resistance mutations that arise distinctly from each of the inhibitors.

      Strengths:

      This is an exciting study that builds upon the fundamental notion of ensemble behavior in solutions for enzymes such as BTK. The HDX-MS and NMR experiments are adequately and comprehensively presented.

      We thank the reviewer for this positive feedback.

      Weaknesses:

      While I commend the novelty of the study, the absence of important controls greatly tempers my enthusiasm for this work. As stated in the abstract, there are no broad takeaways for how resistance mutation bias operated from this study, although the mechanism of action of 2 common resistance mutations is useful. How these 2 resistance mutations connect to ensemble behavior, is not obvious. This is partly because BTK does not populate just binary "open"/"closed" conformations, but there are likely multiple intermediate conformations. Each inhibitor appears to preferentially "select" conformations by the authors' own assessment (line 236) and this carries implications for the emergence of resistance mutations. The most important control that would help is to use ADP or nonhydrolyzable and ATP as a baseline to establish the "inactive" and "active" conformations. All of the HDX-MS and NMR studies use protein that has no nucleotide present. A major question that remains is whether each of the inhibitors preferentially favors/blocks ADP or ATP binding. This then means it is not equivalent to correlate functional kinase assay conditions with either HDX-MS or NMR experiments.

      We thank the reviewer for raising this point. The BTK inhibitors studied here are active site inhibitors that completely prevent (block) nucleotide (both ATP and ADP) binding. We believe the other question being asked here is whether the different BTK inhibitors bind preferentially to the ADP or ATP bound kinase (do the conformational states favored by ADP versus ATP bound BTK affect drug binding). We agree this is an interesting question that deserves further study. Here we are focused on the ligand bound state itself rather than on the conformational state selection mechanism of each inhibitor. Thus, HDX-MS and NMR work to compare ligand bound to apo-, ADP, and ATP bound BTK is beyond the scope of this manuscript. That said, previous work (doi: 10.1038/s41598-017-17703-5) has shown that the related TEC kinase, ITK, preferentially binds ADP when the kinase is in the autoinhibited conformation. Since we have previously shown that BTK adopts the autoinhibited conformation in the nucleotide free form (https://doi.org/10.7554/eLife.89489.2), we suggest that the comparison we have carried out here between drug bound and apo-protein is valid. Future work will carefully address the conformational preferences of all three conditions, apo-, ADP- and ATP-bound.

      Reviewer #2 (Public Review):

      Summary:

      Previous NMR and HDX-MS studies on full-length (FL) BTK showed that the covalent BTKi, ibrutinib, causes long-range effects on the conformation of BTK consistent with disruption of the autoinhibited conformation, based on HDX deuterium uptake patterns and NMR chemical shift perturbations. This study extends the analyses to four new covalent BTKi, acalabrutinib, zanubrutinib, tirabrutinib/ONO4059, and a noncovalent ATP competitive BTKi, pirtobrutinib/LOXO405.

      The results show distinct conformational changes that occur upon binding each BTKi. The findings show consistent NMR and HDX changes with covalent inhibitors, which move helix aC to an 'out' position and disrupt SH3-kinase interactions, in agreement with X-ray structures of the BTKi complexed with the BTK kinase domain. In contrast, the solution measurements show that pirtobrutinib maintains and even stabilizes the helix aC-in and autoinhibited conformation, even though the BTK:pirtobrutinib complex crystallizes with helix aC-out. This and unexpected variations in NMR and HDX behavior between inhibitors highlight the need for solution measurements to understand drug interactions with the full-length BTK. Overall the findings present good evidence for allosteric effects by each BTKi that induce distal conformational changes which are sensitive to differences in inhibitor structure.

      The study goes on to examine BTK mutants T474I and L528W, which are known to confer resistance to pirtobrutinib, zanubrutinib, and tirabrutinib. T474I reduces and L528W eliminates BTK autophosphorylation at pY551, while both FL-BTK-WT and FL-BTK-L528W increase HCK autophosphorylation and PLCg phosphorylation. These show that mutants partially or completely inactivate BTK and that inactive FL-BTK can activate HCK, potentially by direct BTK-HCK interactions. But they do not explain drug resistance. However, HDX and NMR show that each mutant alters the effects of BTKi binding compared to WT. In particular, T474I alters the effects of all three inhibitors around W395 and the activation loop, while L528W alters interactions around W395 with tirabrutinib and pirtobrutinib, and does not appear to bind zanubrutinib at all. The study concludes that the mutations might block drug efficacy by reducing affinity or altering binding mode.

      Strengths:

      The work presents convincing evidence that BTK inhibitors alter the conformation of regions distal to their binding sites, including those involved in the SH3-kinase interface, the activation loop, and a substrate binding surface between helix aF and helix aG. The findings add to the growing understanding of allosteric effects of kinase inhibitors, and their potential regulation of interactions between kinase and binding proteins.

      We thank the reviewer for these positive comments.

      Weaknesses:

      The interpretation of HDX, NMR, and kinase assays is confusing in some places, due to ambiguity in quantifying how much kinase is bound to the inhibitor. It would be helpful to confirm binding occupancy, in order to clarify if mutants lower the amount of BTK complexed with BTKi as implied in certain places, or if they instead alter the binding mode. In addition, the interpretation of the mutant effects might benefit from a more detailed examination of how each inhibitor occupies the ATP pocket and how substitutions of T474 and L528 with Ile and Trp respectively might change the contacts with each inhibitor.

      We thank the reviewer for these suggestions. As requested we have now modified the manuscript to clearly state the effects of the mutations on inhibitor binding. Additionally, we have included a new figure to discuss the interaction of the inhibitors within the BTK kinase active site to provide a better explanation for the impact of the resistance mutations.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Comments:

      (1) What is the binding affinity of ATP/ADP to BTK? BTK is purified by the authors as an apoenzyme (by the final purification by SEC, all protein should be completely stripped of nucleotide)- but must toggle between ATP and ADP-bound states. Do the inhibitors completely sterically block nucleotide binding? Do they only block one or the other- ADP/ATP binding? Do they weaken ADP/ATP binding? The authors have an opportunity with NMR to establish a clear baseline to compare the inhibitors' effects on BTK. It is not clear if the authors' assumption is that all BTKi share a common mode of action (Line 114).

      All BTK inhibitors studied in this work (Ibrutinib, Acalabrutinib, Zanubrutinib, Tirabrutinib and Pirtobrutinib) share a common mode of action. They are active site inhibitors that completely block nucleotide (ATP and ADP) binding. The introduction to the manuscript has been updated to add this information (lines 70-71, pg. 4).

      "The covalent BTK inhibitors (Ibrutinib, Acalabrutinib, Zanubrutinib and Tirabrutinib) and the non-covalent BTK inhibitor Pirtobrutinib bind tightly to the BTK active site (Kinact/KI or KD values in the nM range; DOI: 10.1056/NEJMoa2114110). In contrast, previous studies have reported nucleotide affinities for TEC kinases that are lower (KD in the µM range; doi: 10.1038/s41598-017-17703-5). Additionally, the same work has shown that the conformational state of TEC kinases can impact nucleotide binding. The TEC kinases have a higher affinity for ADP (KD ~ 20 µM) than for ATP (affinity ~15-fold lower than for ADP) when the full-length protein adopts the autoinhibited conformation. Disruption of the TEC kinase autoinhibited conformation (by mutation) decreases the affinity for ADP, allowing ATP to bind and enabling kinase activity. Nevertheless, regardless of the conformational state of BTK, all the BTK inhibitors studied here block both ADP and ATP binding to the active site."

      (2) Is there an effect of nucleotide binding bias on resistance mutation emergence? Is there a nucleotide binding bias in the resistance mutations characterized in this study? There likely is - BTK L528W is catalytically inactive. It is not clear if this mutant stays bound to ADP or to ATP and cannot transfer the phosphate to its substrate. How does BTK T474I interact with ADP/ATP? This is needed before concluding - in lines 289-291- that mutations cause only minor conformational changes. This needs a qualifier - in the nucleotide-free apo conformation.

      The BTK L528W mutation introduces a bulky sidechain into the BTK kinase active site that sterically impedes both ATP and ADP binding. In fact, previous studies (https://doi.org/10.1016/j.jbc.2022.102555) have confirmed the inability of the BTK L528W mutant to bind ATP.

      The BTK T474I mutation could alter nucleotide binding. However, the BTK T474I mutation lowers the overall activity of BTK, consistent with previous work showing the same (https://doi.org/10.1021/acschembio.6b00480). The decrease in overall kinase activity cannot account for the development of resistance (which typically requires increased kinase activity). Hence, a decrease in inhibitor binding is likely driving resistance.

      Lines 293 (pg. 14) have been modified to indicate that the conformational changes observed in the BTK mutants are in the absence of nucleotide as requested.

      (3) What is the half-life of BTK? And does inhibitor binding to BTK change the half-life of the inhibitor?

      BTK has a long half-life of 48-72 h (DOI: https://doi.org/10.1124/jpet.113.203489). Unbound covalent inhibitors are rapidly cleared from the body, with short half-lives on the order of < 4 h. Non-covalent BTK inhibitors typically have a longer half-life, on the order of 20 h. Once bound to BTK, the irreversible nature of binding by covalent inhibitors makes them unavailable to other molecules. CLL patients are typically treated with a once-daily or twice-daily dose of BTK inhibitor. Hence, inhibitor binding to BTK does not alter the half-life of free inhibitor.
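
The half-life figures quoted above can be put on a common footing with a first-order decay calculation. The following is a minimal sketch, assuming simple exponential decay and a 24 h dosing interval; both are assumptions made for illustration and are not taken from the manuscript, and `fraction_remaining` is a hypothetical helper name.

```python
def fraction_remaining(elapsed_h: float, half_life_h: float) -> float:
    """Fraction of a species left after elapsed_h hours of first-order decay."""
    return 0.5 ** (elapsed_h / half_life_h)


DOSING_INTERVAL_H = 24.0  # assumption: once-daily dosing

# Half-lives quoted in the response (hours). The 4 h value is the stated
# upper bound for free covalent inhibitors.
half_lives_h = {
    "free covalent inhibitor (upper bound)": 4.0,
    "free non-covalent inhibitor": 20.0,
    "BTK protein (lower bound)": 48.0,
    "BTK protein (upper bound)": 72.0,
}

for species, t_half in half_lives_h.items():
    left = fraction_remaining(DOSING_INTERVAL_H, t_half)
    print(f"{species}: {left:.1%} remaining after {DOSING_INTERVAL_H:.0f} h")
```

Under these assumptions, less than 2% of free covalent inhibitor remains after 24 h, whereas roughly 71-79% of the BTK molecules present at dosing are still there, which illustrates why free-drug clearance and target turnover operate on different time scales.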

      (4) Are there broad differences between covalent and single non-covalent inhibitors upon resistance mutation bias? And nucleotide binding?

      The biggest difference observed between BTK covalent and non-covalent inhibitors in the emergence of resistance mutations is the occurrence of the C481S mutation in patients treated with covalent inhibitors. This resistance mutation is absent in patients treated with non-covalent BTK inhibitors. Patients that develop mutations in BTK C481 can no longer be treated with any of the approved covalent BTK inhibitors (as they all use BTK C481 for covalent linkage). To ensure BTK inhibition, patients with mutations in C481 can be treated with non-covalent BTK active site inhibitors. All currently approved BTK inhibitors (covalent and non-covalent) are active site inhibitors that compete with nucleotide binding.

      (5) It's unclear why the authors chose to evaluate the impact of inhibitor binding on the linker kinase domain first. This seems unnecessary.

      NMR analysis is easier on the smaller BTK linker kinase domain (LKD) fragment compared to the full-length protein. Hence for practical reasons we used the BTK LKD fragment.

      (6) Line 508 - there seems to be a gap in understanding protein half-lives, inhibitor half-lives, and the emergence of resistance mutations in this manuscript itself. The manuscript falls short of a mechanistic descriptor of variable inhibitors and resistance mutation bias.

      The half-lives of the inhibitors assessed in this study are provided in Table 1 of this manuscript. The emergence of resistance mutations such as C481 is likely a direct consequence of differences in inhibitor half-life, as described in the discussion section of this manuscript (page 23).

      (7) HDX-MS reports the conformational average difference across the ensemble but does not distinguish between the number of intermediary conformations. The authors should clarify that this is a limitation of an average readout method such as HDX-MS. This is currently not addressed.

      A sentence describing this limitation has been added (lines 219-221, pg. 11) as requested.

      Minor Points:

      (1) Some of the qualitative descriptors are unnecessary - line 284 - "Slightly towards....". Line 286 - "Slight stabilizing effect on the conformation..." How slight is slight?

      Qualitative descriptors have been removed from the manuscript as requested.

      (2) The authors should provide SPR data with Kon and Koff values for Pirtobrutinib binding to BTK (in the presence of ATP and ADP).

      SPR analysis of Pirtobrutinib has previously been reported. Pirtobrutinib binds to BTK wild-type with a KD of 0.9 nM (DOI: 10.1056/NEJMoa2114110). As mentioned earlier in response to comment 1, Pirtobrutinib binds to the BTK kinase active site and is competitive with both nucleotides (ATP and ADP, which bind with lower affinity, KD in the µM range).

      (3) In Figure 2, the legend needs to describe the specific time point represented. Same with Figure 5.

      The HDX-MS changes that are mapped onto the structure represent the maximal changes observed at any time point. The figure legends have been modified as requested to clarify this.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure 7 is an amazing and impressive finding, but it could use two controls: First a blot of pY551 to show more rigorously that FL-BTK-WT and L528W autophosphorylation is unaffected by zanubrutinib binding, just to eliminate the possibility that elevated pY551 accounts for the enhanced HCK phosphorylation.

      Both BTK FL enzymes (WT and L528W) in this assay are catalytically inactive and do not contribute to autophosphorylation on BTK Y551 (BTK FL WT is inhibited by Zanubrutinib and BTK FL L528W is catalytically dead). Additionally, BTK FL WT and BTK FL L528W are both able to activate HCK. Hence differences in pY551 levels between these BTK proteins cannot explain how both proteins are able to activate HCK.

      Nevertheless, as requested, we probed for pY551 levels on BTK. While BTK cannot autophosphorylate itself on BTK Y551 in this assay, BTK Y551 is able to be phosphorylated by HCK. BTK Y551 phosphorylation levels were higher in BTK FL WT compared to BTK FL L528W likely due to Y551 on the activation loop being less accessible in the BTK L528W mutant (which is more stabilized in the autoinhibited conformation) compared to the WT protein. This data has been added as a new panel in Figure 7a.

      Additionally, we tested the ability of the BTK FL L528W/Y551F double mutant to activate HCK. The BTK FL L528W/Y551F double mutant is able to activate HCK similar to BTK FL L528W single mutant, demonstrating that phosphorylation on Y551 is not necessary for HCK activation by BTK FL L528W. This new data has been added as supplemental figure S2a. Taken together, pY551 levels on BTK do not contribute to enhanced HCK phosphorylation. The results section of the manuscript has been modified to include this additional data (Lines 319-335, pg. 15-16).

      Second, controls performed in the absence of Zanubrutinib are needed for the time courses with HCK alone, HCK + FL-BTK WT, and HCK + FL-BTK-L528W. This would help show that the ability of BTK to increase the phosphorylation of HCK and PLCg1 is (or isn't) dependent on drug interactions with BTK, HCK, or PLCg.

      BTK FL L528W can enhance phosphorylation on PLCg by HCK even in the absence of Zanubrutinib. We have added this data as a new supplemental figure S2b. We have not included BTK FL WT in this analysis as in the absence of Zanubrutinib, we would have two active enzymes (HCK and BTK) in the assay which would complicate the interpretation of the data. The results section of the manuscript has been modified to include this additional data (Lines 333-335, pg. 16).

      And please comment: in cells, does zanubrutinib treatment (or any other drug) increase pY phosphorylation of HCK or PLCg?

      All clinically approved BTK inhibitors (covalent and non-covalent) inhibit BTK WT activity and decrease PLCg phosphorylation in cells. There have been no reports, to our knowledge, of any clinically approved BTK inhibitor causing an increase in HCK activity.

      (2) Sections of the Results discussing Figures 8 and 9 are confusing to read because they variously propose that the mutants (i) reduce inhibitor occupancy, or (ii) alter the inhibitor binding mode. However, some of the results unambiguously show an altered binding mode instead of reduced inhibitor binding.

      a) For example, HDX clearly shows protection by tira, zanu, and pirto, therefore reduced inhibitor binding does not seem to be an option. Therefore, I recommend modifying lines 357-363. "The differences in deuterium exchange for drug binding to WT and mutant BTK suggest that the T474I mutation either causes a reduction in inhibitor binding or otherwise alters the mode of drug interaction in the active site. "

      While the HDX-MS data of BTK T474I shows protection by Tirabrutinib, Zanubrutinib and Pirtobrutinib, the magnitude of the protection is reduced in the BTK T474I mutant compared to WT BTK (Fig. 8e) suggesting a reduction in inhibitor binding. These results are consistent with previous SPR analysis of the BTK T474I mutant which also showed reduced binding to Zanubrutinib, Acalabrutinib and Pirtobrutinib (DOI: 10.1056/NEJMoa2114110). The manuscript (lines 381-383, pg. 18) has been modified to clearly state that the BTK T474I mutation causes a reduction in inhibitor binding.

      b) I recommend modifying lines 370-373.

      "In stark contrast to the BTK T474I mutant, the BTK L528W mutant does not show any change in deuterium incorporation in the presence of Zanubrutinib, Tirabrutinib or Pirtobrutinib, providing strong evidence that the BTK L528W mutant does not bind the inhibitors (Fig. 8d)."

      Lines 432-435: "Although the L528W mutation alters binding to both Tirabrutinib and Pirtobrutinib, the NMR data suggests that it retains partial binding unlike the HDX-MS data that suggests complete disruption of binding. The higher inhibitor concentrations used in the NMR experiments compared to the HDX-MS experiments likely explain this discrepancy."

      The discordance in the L528W mutant between the lack of any HDX protection by tira and pirto versus the clear chemical shift of W395 by NMR is worrisome. If the HDX experiments were really done under conditions where binding occupancy was too low, then it seems important to redo these experiments at higher drug concentrations.

      Alternatively, and perhaps more useful would be to report Kd for binding of these inhibitors to the two mutants. That would allow the authors to interpret these results more definitively.

      SPR analysis of inhibitor binding to full-length BTK WT, T474I and L528W has been previously reported (DOI: 10.1056/NEJMoa2114110). The covalent BTK inhibitors (Ibrutinib, Acalabrutinib, and Zanubrutinib) and the non-covalent BTK inhibitor Pirtobrutinib bind tightly to full-length WT BTK (Kinact/KI or KD values in the nM range). The BTK T474I mutation disrupts binding to Zanubrutinib, Acalabrutinib and Pirtobrutinib, but not Ibrutinib and Fenebrutinib. The BTK L528W mutation disrupts binding to Zanubrutinib, Acalabrutinib, Ibrutinib and Pirtobrutinib, but not Fenebrutinib. These previously published results are consistent with the HDX-MS and NMR data presented here. The manuscript has been modified to clearly state that the mutations reduce drug binding rather than alter the binding mode.

      c) Recommend adding data to confirm statements in lines 419-421:

      "Spectral overlays of the BTK L528W mutant with and without Zanubrutinib show no chemical shift changes (Fig. 9a, right panel) suggesting that the mutation completely disrupts inhibitor binding in complete agreement with the HDX-MS data (Fig. 8d).

      428-432: The Pirtobrutinib-bound BTK L528W spectrum (Fig. 9c) shows two resonance positions, one of which overlaps with the W395 resonance in the apo protein and the other that corresponds to that of the mutant protein bound to Pirtobrutinib. This data suggests a mixture of inhibitor bound and unbound BTK kinase domain in solution, likely due to a reduction in Pirtobrutinib affinity caused by the L528W mutation."

      Likewise, direct measurements of binding affinity to L528W would be helpful. It is not completely convincing that the effects of this mutant are due to the reduced binding of either inhibitor. The effects of pirtobrutinib may instead reflect a slow exchange of W395 instead of 50% occupancy. For example, what happened in the rest of the spectra? Were other chemical shifts apparent in either case, which might address binding stoichiometry? It would be useful to show the full spectra in Supplemental figures, as well as any titrations that may have been done to confirm that the inhibitors are added at saturating concentration.

      As requested, the full spectrum of Pirtobrutinib bound to BTK L528W has now been added as supplemental figure S1c. In the spectrum of BTK L528W bound to Pirtobrutinib, two cross peaks are visible for multiple resonances, one of which overlaps with that of the apo BTK L528W spectrum, suggesting that there is a mixture of apo and inhibitor-bound forms of BTK L528W.

      The clinically approved inhibitors that we are working with here (Ibrutinib, Acalabrutinib, Zanubrutinib, Tirabrutinib and Pirtobrutinib) have reported IC50 values in the nM range (0.5 nM, 3 nM, 0.3 nM, 6.8 nM and 3.68 nM, respectively). All the NMR work presented here was carried out at a 1:1.33 protein:inhibitor ratio (absolute concentration of the inhibitor was 200 µM). NMR titrations of BTK WT have been carried out with Ibrutinib (https://doi.org/10.7554/eLife.60470) and Tirabrutinib. Complete binding is observed at a 1:1 molar ratio of protein:inhibitor, consistent with the previously reported binding characteristics. Mass spec analysis also shows one covalent inhibitor bound to each BTK WT protein (Fig. 4a). The BTK T474I and L528W mutants were tested at the same protein:inhibitor ratio as WT BTK for ease of comparison.
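
The occupancy question raised by the reviewer can be illustrated with the standard single-site binding equation. The sketch below assumes a reversible 1:1 equilibrium, which strictly applies only to a non-covalent inhibitor such as Pirtobrutinib. The KD values are hypothetical and chosen only to span tight to weak binding; they are not measured affinities for the mutants, and an IC50 is not the same quantity as a KD. The protein concentration is derived from the stated 1:1.33 ratio and 200 µM inhibitor, and `fraction_protein_bound` is a hypothetical helper name.

```python
import math


def fraction_protein_bound(protein_total_um: float,
                           ligand_total_um: float,
                           kd_um: float) -> float:
    """Fraction of protein in complex for a reversible 1:1 binding model.

    Uses the exact quadratic solution, which remains valid when protein and
    ligand concentrations are comparable (ligand depletion is not ignored).
    """
    s = protein_total_um + ligand_total_um + kd_um
    complex_um = (s - math.sqrt(s * s - 4.0 * protein_total_um * ligand_total_um)) / 2.0
    return complex_um / protein_total_um


inhibitor_um = 200.0              # stated in the response
protein_um = inhibitor_um / 1.33  # about 150 µM, from the 1:1.33 ratio

# Hypothetical KD values (µM) for illustration only.
for kd_um in (0.001, 1.0, 100.0):
    occupancy = fraction_protein_bound(protein_um, inhibitor_um, kd_um)
    print(f"KD = {kd_um:g} µM -> fraction of protein bound = {occupancy:.3f}")
```

With these inputs, a nanomolar KD gives essentially complete occupancy, while a KD near the protein concentration leaves a substantial apo population. That is the kind of mixture the two sets of cross peaks in the Pirtobrutinib-bound L528W spectrum would be consistent with, although this calculation alone cannot distinguish reduced affinity from slow exchange.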

      (3) The Discussion could use a structural perspective on the likely effects of each mutation on inhibitor binding. Both residues occupy positions in beta7 and the hinge, which are commonly found to form hydrophobic and polar contacts with ATP competitive inhibitors in many kinases. This would be useful to discuss and show as a figure, in order to give the non-kinase expert a better understanding of why the mutations might affect inhibitor binding. The variations in structures of each inhibitor and how they contact these two positions might be useful to inspect, and ask why some inhibitors but not others are affected by mutation, and why some inhibitors but not others induce effects over long distances to W395 and the activation loop.

      As requested, we have added a new paragraph in the discussion and a new figure (Fig. 10) to expand on the likely effects of the mutations on inhibitor binding. The allosteric effects of some of the BTK inhibitors, on the other hand, are currently being investigated and are beyond the scope of the current manuscript.

      (4) The authors propose that small differences in Tm and stability of L528W account for its effect on resistance. Does this mutant show elevated expression in patient tumors over those with WT BTK?

      Preliminary data indicates that BTK L528W levels are elevated in one of two patients carrying this resistance mutation. However, due to the low number of patients tested, we have chosen to not include the data in this study but will continue to pursue this question in future work.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      The Editors have assessed your revised submission and rather than issuing a further decision letter we are writing to invite you to make a few small amendments to this version of the paper as listed below.

      We added a summary paragraph at the end of the introduction for clarity.

      (1) RMSD values in Fig 2-source data 1 (and possibly reflected in Fig 2C) appear to be improbably duplicated, specifically ACh runs 1/2, Ebx runs 1/3, and error values for Ebx vs. ACh.

      Thanks for bringing this to our attention. The values are now corrected.

      (2) Shaded area in Fig 2-supplement 5D is inaccurate for depicting loop C.

      The shaded area now reflects residues in loop C, residues 189-198.

      (3) In Fig 2-supplement 4 where an abrupt change in ligand RMSD is implied to represent a cis-trans flip, the accompanying figure showing snapshots misleadingly depicts a different simulation of CCh instead of ACh.

      The snapshot was from the correct ACh simulation. It was mislabeled as CCh in the legend, which now stands corrected.

      (4) Legend to Fig 3 seems misleading regarding colors in the porcupine plots.

      The color pattern indicated in the legend represents the FEL plot and not the porcupine plot. Description about the porcupine plot is not associated with any color.

      (5) Some shaded regions in Fig 6-supplement 2 do not correspond to intervals reported in Fig 4-source data 1.

      Thanks. This is now corrected to match the table.

      Given that some of the above points have remained unaddressed from the prior round of review, the authors should double check that they have addressed any other relevant prior comments not explicitly listed here.

      Finally, the revised first results section has removed the explanation as to why the authors opted to simulate a dimer (i.e., affinity being affected only by local perturbations). The authors should consider reincorporating this explanation for readers, as well as adding a reference to Wang et al. 1997 (PMID: 9222901) in regard to lines 116-119.

      The revised section now includes an added explanation of why a dimer was used in simulations. Gupta et al., J Gen Physiol. 2017 Jan; 149(1): 85–103 was added, as it includes residues not just from the M1 domain that Wang et al. cover, but from other TMD regions as well.

    1. Author response:

      eLife Assessment

      Zhang et al. present important findings that reveal a new role for TET2 in controlling glucose production in the liver, showing that both fasting and a high-fat diet increase TET2 levels, while its absence reduces glucose production. TET2 works with HNF4α to activate the FBP1 gene upon glucagon stimulation, while metformin disrupts TET2-HNF4α interaction, lowering FBP1 levels and improving glucose homeostasis. While the results are solid, more details about the mechanisms and methods are needed to strengthen the study's conclusions.

      Thanks for the positive evaluation and constructive comments, which will significantly improve the quality of the manuscript. We will provide more details about the mechanisms and methods in the revised version.

      Reviewer #1 (Public review):

      Summary:

      Zhang et al. describe a delicate relationship between Tet2 and FBP1 in the regulation of hepatic gluconeogenesis.

      Strengths:

      The studies are very mechanistic, indicating that this interaction occurs via demethylation of HNF4a. Phosphorylation of HNF4a at ser 313 induced by metformin also controls the interaction between Tet2 and FBP1.

      Weaknesses:

      The results are briefly described, and oftentimes, the necessary information is not provided to interpret the data. Similarly, the methods section is not well developed to inform the reader about how these experiments were performed. While the findings are interesting, the results section needs to be better developed to increase confidence in the interpretation of the results.

      We thank the reviewer for the positive evaluation and constructive comments. There is a factual error in the paragraph of “Strengths”. The comment that “The studies are very mechanistic, indicating that this interaction occurs via demethylation of HNF4a. Phosphorylation of HNF4a at ser 313 induced by metformin also controls the interaction between Tet2 and FBP1.” should be revised as follows: “The studies are very mechanistic, indicating that this interaction occurs via demethylation of FBP1. Phosphorylation of HNF4a at ser 313 induced by metformin also controls the interaction between Tet2 and HNF4a.”

      Following the reviewer’s suggestions, we will provide all the necessary information in the methods section to inform the reader about how these experiments were performed, and improve the description of the results in the revised version.

      Reviewer #2 (Public review):

      Summary:

      This study reveals a novel role of TET2 in regulating gluconeogenesis. It shows that fasting and a high-fat diet increase TET2 expression in mice, and TET2 knockout reduces glucose production. The findings highlight that TET2 positively regulates FBP1, a key enzyme in gluconeogenesis, by interacting with HNF4α to demethylate the FBP1 promoter in response to glucagon. Additionally, metformin reduces FBP1 expression by preventing TET2-HNF4α interaction. This identifies an HNF4α-TET2-FBP1 axis as a potential target for T2D treatment.

      Strengths:

      The authors use several methods in vivo (PTT, GTT, and ITT in fasted and HFD mice; and KO mice) and in vitro (in HepG2 and primary hepatocytes) to support the existence of the HNF4alpha-TET-2-FBP-1 axis in the control of gluconeogenesis. These findings uncovered a previously unknown function of TET2 in gluconeogenesis.

      Weaknesses:

      Although the authors provide evidence of an HNF4α-TET2-FBP1 axis in the control of gluconeogenesis, which contributes to the therapeutic effect of metformin on T2D, its role in the pathogenesis of T2D is less clear. The mechanisms by which TET2 is up-regulated by glucagon should be more explored.

      We thank the reviewer for the support and constructive comments, and agree with the reviewer that the current version mainly focuses on the function of the HNF4α-TET2-FBP1 axis in the control of gluconeogenesis. We will explore the pathogenesis of T2D and the mechanism by which TET2 is up-regulated by glucagon in the revised version.

      Both reviewers made positive comments and we will address all the reviewers’ concerns either by new experiments or clarifications. We thank editors and reviewers for the constructive comments, which will significantly improve the quality of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their overall positive evaluation of the manuscript and finding MChIP-C to be a valuable technological advance. To address the reviewer’s helpful comments and recommendations, we performed several additional analyses and improved the text and figures.

      Briefly, we extended and clarified the main text and methods, added analyses of interactions at consensus and method-specific CTCF/DHS sites (Figure S3), added additional comparison tracks to other methods in specific loci (Figure 4), added examples of MChIP-C E-P interactions at previously-verified loci (Figure S2a) and added extensive MChIP-C downsampling analysis (Figure S6).

      Recommendations for authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) Provide .HiC and .cool files for the community to explore the data.

      We thank the reviewer for this suggestion. We have uploaded both the raw and processed data to GEO. We note that .cool and .hic formats may be less useful for this type of data, since it includes only promoter-based interactions and thus the resulting interaction matrix is extremely sparse at the relevant resolutions. In addition, we provide an online genomic browser for our data.

      (2) Provide an R or bioconda package for future data processing.

      We thank the reviewer for this suggestion. We have organized and streamlined the relevant code for processing MChIP-C data and it is available as a github repository.

      (3) The authors should avoid using "mln" for "million".

      We thank the reviewer for this suggestion. We have corrected this in the text.

      Reviewer #3 (Recommendations For The Authors):

      (1) Figure 2- A handful of sites identified by MChIP-C should be verified by 3C or 4C to validate they are true interactions using an orthogonal approach.

      We thank the reviewer for this suggestion. As we show in the current manuscript (and supported by several papers using MNase-based C-methods), C-methods based on restriction enzymes are considerably less sensitive than those based on MNase, so using these methods for anecdotal validation may not be adequate. In addition, it is difficult to extract accurate quantitative measurements from 3C and 4C due to challenges in bias normalization. As a large-scale alternative, we analyzed a set of consensus promoter-CTCF and promoter-DHS interactions identified by all 3 methods (PLAC-seq/Micro-C/MChIP-C; Figure S3). We find that MChIP-C shows clearly superior resolution and sensitivity on these consensus sites. In fact, even for sites which were only called by one of the competing methods, we still see better signal in the MChIP-C data (suggesting that our simplistic MChIP-C peak-calling approach could be improved for further gain). However, as this analysis focuses on “easily detectable” consensus sites, we also emphasize the importance of inspecting interactions which are not detected clearly by alternative methods. To this end, we now show in our manuscript interaction profiles for 11 loci (MYC, PTGER3, CITED2, BTG1, ANTXR2, SEMA7A, LMO2, GATA1, HBG2, VEGFA, MYB), each showing high-resolution MChIP-C interactions which coincide with expected genomic features (p300, CTCF, H3K27ac, known enhancers) and are not clearly observable in Micro-C and PLAC-seq. We also note that the extended overlap of detected MChIP-C interactions with functionally validated enhancers (as measured by CRISPRi) provides an additional large-scale orthogonal validation.

      (2) A supplemental table indicating read pair depth, etc, similar to S02, should be added for the datasets used for comparison (HiChIP-etc). Given the age differences between some of the reference data used, it may represent simply an improvement by increasing sequencing depth rather than a true technical advantage.

      We thank the reviewer for this suggestion. We have added the sequencing depths of the relevant datasets in the methods section. We also performed extensive downsampling analyses as explained in response to the next point.

      (3) I would recommend performing a downsampling analysis to determine at what point the MChIP-C data reaches saturation in terms of the number of reads, with a comparison to the HiChIP reference data. This would allow a more objective measure of the sensitivity of the assays with reference to read depth.

      We thank the reviewer for this suggestion. First, we note that downsampling does not affect the high sensitivity and resolution results as shown in aggregate plots (e.g. Figure 2 and Figure S3). However, downsampling can affect individual peak calling. We thus downsampled our data to 50%, approximately matching the number of total informative reads of both PLAC-seq and Micro-C (i.e. ~20M). We also further downsampled our data to 25% and 10%. With respect to prediction of K562 functionally validated enhancer-promoter interactions (Figure S6b), even at 25% downsampling MChIP-C achieves both a higher recall and higher precision than the other methods, with a slightly higher false-positive rate. At 10% sampling, recall is slightly worse than Micro-C and PLAC-seq, but both the precision and false-positive rate are better than the alternatives. With respect to saturation, we plotted the number of unique distal cis read pairs versus the total number of reads (Figure S6c), and find that our MChIP-C data does not yet show saturation. We also show that downsampling our data to 50% maintains ~80% of the called interactions (Figure S6d).
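
The logic of such a downsampling benchmark can be sketched in a few lines. This is an illustrative outline on toy data, not the authors' pipeline: the function names (`downsample`, `unique_pairs`, `benchmark`), the representation of read pairs as (promoter, distal site) tuples, and the validated sets are all hypothetical, and peak calling on the downsampled reads is deliberately left out.

```python
import random


def downsample(read_pairs, fraction, seed=0):
    """Randomly keep a fixed fraction of read pairs (sampling without replacement)."""
    rng = random.Random(seed)
    keep = round(len(read_pairs) * fraction)
    return rng.sample(read_pairs, keep)


def unique_pairs(read_pairs):
    """Number of distinct read pairs, the quantity plotted in a saturation curve."""
    return len(set(read_pairs))


def benchmark(called, validated_true, validated_false):
    """Precision, recall and false-positive rate of called pairs against a
    functionally validated reference (e.g. a CRISPRi screen)."""
    tp = len(called & validated_true)
    fp = len(called & validated_false)
    fn = len(validated_true - called)
    tn = len(validated_false - called)
    return {
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "recall": tp / (tp + fn) if tp + fn else float("nan"),
        "false_positive_rate": fp / (fp + tn) if fp + tn else float("nan"),
    }


# Toy read pairs: (promoter, distal site), repeated to mimic coverage.
reads = ([("p1", "e1")] * 40 + [("p1", "e2")] * 30
         + [("p1", "e4")] * 20 + [("p2", "e3")] * 10)

for fraction in (1.0, 0.5, 0.25, 0.1):
    subset = downsample(reads, fraction)
    print(fraction, len(subset), unique_pairs(subset))

# Toy reference sets and a toy set of called interactions.
validated_true = {("p1", "e1"), ("p1", "e2"), ("p2", "e3")}
validated_false = {("p1", "e4"), ("p2", "e5")}
called = {("p1", "e1"), ("p1", "e2"), ("p1", "e4")}
print(benchmark(called, validated_true, validated_false))
```

In a real comparison, interactions would be re-called on each downsampled read set before computing the metrics, so that the effect of depth on peak calling is captured rather than only the effect on raw coverage.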

      (4) "our results suggest that MChIP-C achieves superior sensitivity and resolution compared to C-methods based on standard restriction enzymes." The sensitivity claims are supported by Figure 2, but not the resolution claims. This is particularly challenging when using histone marks since they can be broad. To directly compare the resolution of MChIP-C to other approaches such as ChIA-PET or HiChIP CTCF or a similar DNA binding protein is required.

      We thank the reviewer for this suggestion. We first note that both sensitivity and resolution are relevant for the results shown in Figure 2 and for the signal-to-noise calculations. This is because the low resolution of PLAC-seq can result in very broad peaks that cover the entire area of the interrogated window (5kb on each side), which could appear as low sensitivity. However, we believe that the new Figure S3 may show the higher resolution of MChIP-C more clearly, as do the 11 locus interaction profile tracks shown in Figure 2, Figure 4 and Figure S2.

      Public reviews:

      Reviewer #1:

      The authors presented a new MNase-based proximity ligation method called MChIP-C, allowing for the measurement of protein-mediated chromatin interactions at single-nucleosome resolution on a genome-wide scale. With improved resolution and sensitivity, they explored the spatial connectivity of active promoters and identified the potential candidates for establishing/maintaining E-P interactions. Finally, with published CRISPRi screens, they found that most functionally verified enhancers do physically interact with their cognate promoters, supporting the enhancer-promoter looping model.

      The study's experimental approach and findings are interesting. However, several issues need to be addressed.

      (1) The authors described that "the lack of interaction between experimentally-validated enhancers and their cognate promoters in some studies employing C-methods has raised doubts regarding the classical promoter-enhancer looping model", so it's intriguing to see whether the MChIP-C could indeed detect the E-P interactions which were not identified by C-methods as they mentioned (Benabdallah et al., 2019; Gupta et al., 2017). I agree that they identified more E-P interactions using MChIP-C, but specifically, they should show at least 2-3 cases. It's important since this is the main conclusion the authors want to draw.

      We thank the reviewer for this suggestion. As we show in the current manuscript (and supported by several papers using MNase-based C-methods), C-methods based on restriction enzymes are considerably less sensitive than those based on MNase, so using these methods for anecdotal validation may not be useful. In addition, it is difficult to extract accurate quantitative measurements from 3C and 4C due to challenges in bias normalization. As a large-scale alternative, we analyzed a set of consensus promoter-CTCF and promoter-DHS interactions identified by all 3 methods (PLAC-seq/Micro-C/MChIP-C; new Figure S3). We find that MChIP-C shows clearly superior resolution and sensitivity on these consensus sites. However, as this analysis focuses on “easily detectable” consensus sites, we also emphasize the importance of inspecting interactions which are not detected clearly by alternative methods. To this end, we now show in our manuscript interaction profiles for 11 loci (MYC, PTGER3, CITED2, BTG1, ANTXR2, SEMA7A, LMO2, GATA1, HBG2, VEGFA, MYB), each showing high-resolution MChIP-C interactions which coincide with expected genomic features (p300, CTCF, H3K27ac, known enhancers) and are not clearly observable in Micro-C and PLAC-seq. We also note that the extended overlap of detected MChIP-C interactions with functionally validated enhancers (as measured by CRISPRi) provides an additional large-scale orthogonal validation.

      (2) The authors compared their data to those of Chen et al. (Chen et al., 2022), who used PLAC-seq with anti-H3K4me3 antibodies in K562 cells and standard Micro-C data previously reported for K562, concluding that "MChIP-C achieves superior sensitivity and resolution compared to C-methods based on standard restriction enzymes.". This is not convincing since they only compared their data to one dataset. More datasets from other cell lines should be included.

      We thank the reviewer for this suggestion. We would like to clarify that all datasets in the paper are K562 datasets, and this cell line is unique in the availability of CRISPRi screens, PLAC-Seq, Micro-C, and hundreds of ChIP-Seq tracks for it. We would expect datasets from other cell types to have changes in their regulatory interactions, so they would be less adequate for direct comparison. In addition, the general resolution and sensitivity limitations (e.g. due to restriction fragment size) are not dependent on cell type and have been shown in other MNase-based method papers.

      (3) The reasons for choosing Chen's data (Chen et al., 2022) and CRISPRi screens (Fulco et al., 2019; Gasperini et al., 2019) should be provided since there are so many out there.

      We thank the reviewer for this comment. We selected these CRISPRi screen datasets since they match the cell type (K562) which we used for MChIP-C, and we selected the PLAC-seq data as it is the only PLAC-seq/HiChIP dataset which matches both the cell type (K562) and the antibody (H3K4me3).

      (4) The authors identify EP300 histone acetyltransferase and the SWI/SNF remodeling complex as potential candidates for establishing and/or maintaining enhancer-promoter interactions, but not RNA polymerase II, mediator complex, YY1, and BRD4. More explanation is needed for this point since they're previously suggested to be associated with E-P interactions.

      We thank the reviewer for this comment. We apologize for this point being unclear: as Figure S5 shows, we actually did identify Pol2, Mediator, YY1 and BRD4 as predictive features, but P300 and SWI/SNF show somewhat higher predictive power. We have now clarified this in the text.

      (5) The limitations of the method should be discussed.

      We thank the reviewer for this suggestion. We have now added to the text a discussion of what we view as the current main limitation of the method, namely its low fraction of informative reads.

      Reviewer #2:

      Summary:

      Golov et al performed the capture of MChIP-C using the H3K4me3 antibody. The new method significantly increases the resolution of Micro-C and can detect clear interactions which are not well described in the previous HiChIP/PLAC-seq method. Overall, the paper represents a significant technological advance that can be valuable to the 3D genomic field in the future.

      Strengths:

      (1) The authors established a novel method to profile promoter-centered genomic interactions based on the Micro-C method. Such a method could be very useful to dissect enhancer-promoter interactions, which have long been an issue for the popular HiC method.

      (2) With the MChIP-C method the authors are able to find new genomic interactions with promoter regions enriched in CTCF. The author has significantly increased the detection sensitivity of such methods as PLAC-seq, Micro-C, and HiChIP.

      (3) The authors identified a new type of interaction between the CTCF-less promoter and the CTCF binding site. This particular type of interaction could explain the CTCF's function in regulating gene transcription activity as observed in many studies. I personally think the second stripe model of P-CTCF interaction is more likely as this has been proposed for the super-enhancer stripe model before. The author should also discuss this part of the story more.

      Weaknesses:

      (1) The data presentation should include the contact heat map. The current data presentation makes it hard for the readers to have a comprehensive view of pair-wise interactions between promoters and the PIR. In particular, these maps may directly give answers to the proposed model of promoter-CTCF interactions by the authors in Figure 3a.

      We thank the reviewer for this suggestion. We note that since the data mainly includes promoter-based interactions, the resulting interaction matrix is extremely sparse at the relevant resolutions. Specifically with respect to promoter-CTCF interactions, without a good sampling of the entire interaction matrix it is difficult to confidently distinguish between the two models only based on MChIP-C data, as it would require data about interaction between non-promoter regions and CTCF.

      (2) In Fig 3D, there seems to be a very limited increase in the power to predict MChIP-C signal for DHS-promoter pairs beyond the addition of CTCF. This figure could be simplified with fewer factors.

      We thank the reviewer for this suggestion. We agree that the last factors do not add predictive power, but we do not think this overly complicates the figure and we prefer to leave these for the reader to evaluate.

      (3) The current method seems to have a big fraction of unusable reads. How the authors process the data should be included to allow for future reproduction. Ideally, the authors should generate a package on R or Bioconda for this processing.

      We thank the reviewer for this suggestion. We agree that the fraction of informative reads is small with respect to some other methods, and expect future versions of MChIP-C to address this limitation. We have organized and streamlined the relevant code for processing MChIP-C data and it is available as a GitHub repository.

      Reviewer #3:

      Summary:

      This manuscript represents a technological development, specifically a micrococcal nuclease chromatin capture approach, termed MChIP-C, to identify promoter-centered chromatin interactions at single nucleosome resolution via a specific protein, similar to HiChIP, ChIA-PET, etc. In general, the manuscript is technically well done. Two major issues raise concerns that need to be addressed. First, it does not appear that novel chromatin interactions identified by MChIP-C which were missed by other approaches such as HiChIP, were validated. This is central to the argument of "improved" sensitivity, which is one of the key factors to assess sensitivity. Second is the question of resolution. Because the authors focus on a histone mark (H3K4me3) it is unclear whether the resolution of the assay truly exceeds other approaches, especially microC. These two issues are not completely supported by the data provided.

      Strengths:

      The method appears to hold promise to improve both the sensitivity and resolution of protein-centered chromatin capture approaches.

      Weaknesses:

      (1) Specific validation experiments to demonstrate the identification of previously missed novel interactions are missing.

      We thank the reviewer for this suggestion. Given that such interactions are missed by Micro-C and PLAC-seq, it would not make sense to use these methods for validation. We thus propose that MChIP-C interactions can be validated by their overlap with expected genomic features. To this end, we now show in our manuscript interaction profiles for 11 loci (MYC, PTGER3, CITED2, BTG1, ANTXR2, SEMA7A, LMO2, GATA1, HBG2, VEGFA, MYB), each showing high-resolution MChIP-C interactions which coincide with expected genomic features (p300, CTCF, H3K27ac, known enhancers) and are not clearly observable in Micro-C and PLAC-seq. In addition, the higher overlap of MChIP-C interactions with functionally-validated K562 enhancer-promoter interactions (provided by CRISPRi screens) provides further functional validation for novel MChIP-C interactions.

      (2) It is unclear if the resolution is really superior based on the data provided.

      We thank the reviewer for this comment. We first note that both sensitivity and resolution are relevant for the results shown in Figure 2 and for the signal-to-noise calculations. This is because the low resolution of PLAC-seq peaks can result in very broad peaks that cover the entire area of the interrogated window (5kb on each side), which could seem like low sensitivity. However, we believe that the new Figure S3 may show the higher resolution of MChIP-C more clearly, as do the 11 locus interaction profile tracks shown in Figure 2, Figure 4 and Figure S2.

      (3) It is unclear how much advantage the approach has, especially compared to existing approaches such as HiChIP since sequencing depth as a variable is not adequately addressed.

      We thank the reviewer for this comment. First, we note that downsampling does not affect the high sensitivity and resolution results as shown in aggregate plots (e.g. Figure 2 and Figure S3). However, downsampling can affect individual peak calling. We thus downsampled our data to 50%, approximately matching the number of total informative reads of both PLAC-seq and Micro-C (i.e. ~20M). We also further downsampled our data to 25% and 10%. With respect to prediction of K562 functionally validated enhancer-promoter interactions (Figure S6b), even at 25% downsampling MChIP-C achieves both a higher recall and higher precision than the other methods, with a slightly higher false-positive rate. At 10% sampling, recall is slightly worse than Micro-C but both the precision and false-positive rate are better than the alternatives.
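
      The recall, precision, and false-positive rate quoted above can be computed from three sets of enhancer-promoter pairs, as in the hedged sketch below. The function name and the set-based inputs are assumptions for illustration; the actual benchmark in the manuscript may define tested pairs and positives differently.

```python
import math


def classification_metrics(called, validated_positive, tested):
    """Score called interactions against a functional benchmark.

    called             -- pairs reported as interacting by a method
    validated_positive -- pairs confirmed functionally (e.g. by CRISPRi)
    tested             -- all pairs assayed in the functional screen
    Pairs outside `tested` are ignored, since their status is unknown.
    """
    tested = set(tested)
    positives = set(validated_positive) & tested
    negatives = tested - positives
    called = set(called) & tested

    tp = len(called & positives)  # true positives
    fp = len(called & negatives)  # false positives

    precision = tp / (tp + fp) if (tp + fp) else math.nan
    recall = tp / len(positives) if positives else math.nan
    fpr = fp / len(negatives) if negatives else math.nan
    return {"precision": precision, "recall": recall, "fpr": fpr}
```

      Running the same function on interaction calls made from the 50%, 25%, and 10% downsampled data would give the depth-matched comparison described in the response.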

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript proposes that 5mC modifications to DNA, despite being ancient and widespread throughout life, represent a vulnerability, making cells more susceptible to both chemical alkylation and, of more general importance, reactive oxygen species. Sarkies et al take the innovative approach of introducing enzymatic genome-wide cytosine methylation system (DNA methyltransferases, DNMTs) into E. coli, which normally lacks such a system. They provide compelling evidence that the introduction of DNMTs increases the sensitivity of E. coli to chemical alkylation damage. Surprisingly they also show DNMTs increase the sensitivity to reactive oxygen species and propose that the DNMT generated 5mC presents a target for the reactive oxygen species that is especially damaging to cells. Evidence is presented that DNMT activity directly or indirectly produces reactive oxygen species in vivo, which is an important discovery if correct, though the mechanism for this remains obscure.

      Strengths:

      This work is based on an interesting initial premise, it is well-motivated in the introduction and the manuscript is clearly written. The results themselves are compelling.

      We thank the reviewer for their positive response to our study.  We also really appreciate the thoughtful comments raised.  Adding the considerations raised below to the manuscript will considerably strengthen our findings.

      Weaknesses:

      I am not currently convinced by the principal interpretations and think that other explanations based on known phenomena could account for key results. Specific points below.

      (1) As noted in the manuscript, AlkB repairs alkylation damage by direct reversal (DNA strands are not cut). In the absence of AlkB, repair of alkylation damage/modification is likely through BER or other processes involving strand excision and resulting in single stranded DNA. It has previously been shown that 3mC modification from MMS exposure is highly specific to single stranded DNA (PMID:20663718) occurring at ~20,000 times the rate as double stranded DNA. Consequently, the introduction of DNMTs is expected to introduce many methylation adducts genome-wide that will generate single stranded DNA tracts when repaired in an AlkB deficient background (but not in an AlkB WT background), which are then hyper-susceptible to attack by MMS. Such ssDNA tracts are also vulnerable to generating double strand breaks, especially when they contain DNA polymerase stalling adducts such as 3mC. The generation of ssDNA during repair is similarly expected to follow the H2O2 or TET based conversion of 5mC to 5hmC or 5fC, neither of which can be directly repaired and both of which depend on single strand excision for their removal. The potential importance of ssDNA generation in the experiments has not been considered.

      We thank the reviewer for this interesting and insightful suggestion.  Our interpretation of our findings is that a subset of MMS-induced DNA damage, specifically 3mC, overlaps with the damage introduced by DNMTs and this accounts for increased sensitivity to MMS when DNMTs are expressed.  However, the idea that the introduction of 3mC by DNMT actually makes the DNA more liable to damage by MMS, potentially through increasing the level of ssDNA, is also a potential explanation, which could operate in addition to the mechanism that we propose.

      (2) The authors emphasise the non-additivity of the MMS + DNMT + alkB experiment but the interpretation of the result is essentially an additive one: that both MMS and DNMT are introducing similar/same damage and AlkB acts to remove it. The non-additivity noted would seem to be more consistent with the ssDNA model proposed in #1. More generally non-additivity would also be seen if the survival to DNA methylation rate is non-linear over the range of the experiment, for example if there is a threshold effect where some repair process is overwhelmed. The linearity of MMS (and H2O2) exposure to survival could be directly tested with a dilution series of MMS (H2O2).

      We thank the reviewer for this point.  As in the response to point #1, the reviewer’s hypothesis of increased potency of MMS, potentially through increased ssDNA, downstream of 3mC induction by DNMT, is a good one.  The reviewer’s suggestion would produce a highly non-linear response to MMS treatment in the AlkB mutant in the DNMT background.  We therefore agree that investigating non-linearity over a wider range, rather than inferring it from the non-additivity of a single point, would be useful in evaluating the results.  We will add a dose-response curve for DNMT-expressing cells to MMS to the revised version of the manuscript.
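
      One simple way to look for the non-linearity discussed here is to fit a straight line to log survival against dose and inspect the residuals. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, survival is assumed to be expressed as a surviving fraction greater than zero, and a log-linear (single-hit) model is assumed as the null; it is not the analysis used in the manuscript.

```python
import math


def log_linear_fit(doses, surviving_fractions):
    """Least-squares fit of log10(survival) = intercept + slope * dose.

    Returns (slope, intercept, residuals). Large, systematically signed
    residuals suggest a threshold or other non-linear dose response.
    """
    ys = [math.log10(s) for s in surviving_fractions]
    n = len(doses)
    mean_x = sum(doses) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(doses, ys)]
    return slope, intercept, residuals
```

      For a dilution series in which survival drops tenfold per unit dose, the residuals are close to zero; a series in which survival stays high and then collapses at the top dose leaves large residuals, which is the signature of a threshold effect.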

      (3) The substantial transcriptional changes induced by DNMT expression (Supplemental Figure 4) are a cause for concern and highlight that the ectopic introduction of methylation into a complex system is potentially more confounded than it may at first seem. Though the expression analysis shows bulk transcription properties, my concern is that the disruptive influence of methylation in a system not evolved with it adds not just consistent transcriptional changes but transcriptional heterogeneity between cells which could influence net survival in a stressed environment. In practice I don't think this can be controlled for, possibly quantified by single-cell RNA-seq but that is beyond the reasonable scope of this paper.

      We fully agree with the reviewer and, indeed, we are very interested in what is driving the transcriptional changes that we observed.  Work is currently underway in the lab to investigate this further but, as the reviewer suggests, is beyond the scope of this paper.  However, we will include a more extensive comment about the transcriptional changes in the discussion of the revised manuscript.

      (4) Figure 4 represents a striking result. From its current presentation it could be inferred that DNMTs are actively promoting ROS generation from H2O2 and also to a lesser extent in the absence of exogenous H2O2. That would be very surprising and a major finding with far-reaching implications. It would need to be further validated, for example by in vitro reconstitution of the reaction and monitoring ROS production. Rather, I think the authors are proposing that some currently undefined, indirect consequence of DNMT activity promotes ROS generation, especially when exogenous H2O2 is available. It would help if this were clarified.

      We thank the reviewer for picking this up.  In the current version’s discussion, we raised two possible explanations for why DNMT (even without H2O2) increases the ROS levels.  One idea is direct activity of DNMT, and one is through the product of DNMT activity acting as a platform to generate more ROS from endogenous or exogenous sources.  We argued that direct activity is less likely, exactly as the reviewer points out.  It is, however, not impossible and we agree with the reviewer that, if it were to be the case, it would be a striking result.  In the revised version of the manuscript we will include an experiment to test whether DNMTs can generate ROS in vitro, which may provide preliminary evidence to distinguish between the two hypotheses we raised, and we will also edit the text of the discussion to clarify our reasoning. 

      Reviewer #2 (Public review):

      5-methylcytosine (5mC) is a key epigenetic mark in DNA and plays a crucial role in regulating gene expression in many eukaryotes including humans. The DNA methyltransferases (DNMTs) that establish and maintain 5mC, are conserved in many species across eukaryotes, including animals, plants, and fungi, mainly in a CpG context. Interestingly, 5mC levels and distributions are quite variable across phylogenies with some species even appearing to have no such DNA methylation.

      This interesting and well-written paper discusses the continuation of some of the authors' work published several years ago. In that previous paper, the laboratory demonstrated that DNA methylation pathways coevolved with DNA repair mechanisms, specifically with the alkylation repair system. Specifically, they discovered that DNMTs can introduce alkylation damage into DNA, specifically in the form of 3-methylcytosine (3mC). (This appears to be an error in the DNMT enzymatic mechanism where the generation 3mC as opposed to its preferred product 5-methylcytosine (5mC), is caused by the flipped target cytosine binding to the active site pocket of the DNMT in an inverted orientation.) The presence of 3mC is potentially toxic and can cause replication stress, which this paper suggests may explain the loss of DNA methylation in different species. They further showed that the ALKB2 enzyme plays a crucial role in repairing this alkylation damage, further emphasizing the link between DNA methylation and DNA repair.

      The co-evolution of DNMTs with DNA repair mechanisms suggests there can be distinct advantages and disadvantages of DNA methylation to different species which might depend on their environmental niche. In environments that expose species to high levels of DNA damage, high levels of 5mC in their genome may be disadvantageous. This present paper sets out to examine the sensitivity of an organism to genotoxic stresses such as alkylation and oxidation agents as the consequence of DNMT activity. Since such a study in eukaryotes would be complicated by DNA methylation controlling gene regulation, these authors cleverly utilize Escherichia coli (E. coli) and incorporate into it the DNMTs from other bacteria that methylate the cytosines of DNA in a CpG context like that observed in eukaryotes; the active sites of these enzymes are very similar to eukaryotic DNMTs and basically utilize the same catalytic mechanism (also this strain of E. coli does not specifically degrade this methylated DNA).

      The experiments in this paper more than adequately show that E. coli expressing these DNMTs (compared to the same strain without the DNMTs) do indeed show increased sensitivity to alkylating agents and this sensitivity was even greater than expected when a DNA repair mechanism was inactivated. Moreover, they show that this E. coli expressing this DNMT is more sensitive to oxidizing agents such as H2O2 and has exacerbated sensitivity when a DNA repair glycosylase is inactivated. Both propensities suggest that DNMT activity itself may generate additional genotoxic stress. Intrigued that DNMT expression itself might induce sensitivity to oxidative stress, the experimenters used a fluorescent sensor to show that H2O2 induced reactive oxygen species (ROS) are markedly enhanced with DNMT expression. Importantly, they show that DNMT expression alone gave rise to increased ROS amounts and that H2O2 addition and DNMT expression together have a greater effect than the linear combination of the two separately. They also carefully checked that the increased sensitivity to H2O2 was not potentially caused by some effect on gene expression of detoxification genes by DNMT expression and activity. Finally, by using mass spectrometry, they show that DNMT expression led to production of the 5mC oxidation derivatives 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC) in DNA. 5fC is a substrate for base excision repair while 5hmC is not; more 5fC was observed. Introduction of non-bacterial enzymes that produce 5hmC and 5fC into the DNMT expressing bacteria again showed a greater sensitivity than expected. Remarkably, in their assay with addition of H2O2, bacteria showed no growth with this dual expression of DNMT and these enzymes.

      Overall, the authors conduct well thought-out and simple experiments to show that a disadvantageous consequence of DNMT expression leading to 5mC in DNA is increased sensitivity to oxidative stress as well as alkylating agents.

      Again, the paper is well-written and organized. The hypotheses are well-examined by simple experiments. The results are interesting and can impact many scientific areas such as our understanding of evolutionary pressures on an organism by environment to impacting our understanding about how environment of a malignant cell in the human body may lead to cancer.

      We thank the reviewer for their response to our study, and value the time taken to produce a public review that will aid readers in understanding the key results of our study. 

      Reviewer #3 (Public review):

      Summary:

      Krwawicz et al., present evidence that expression of DNMTs in E. coli results in (1) introduction of alkylation damage that is repaired by AlkB; (2) confers hypersensitivity to alkylating agents such as MMS (and exacerbated by loss of AlkB); (3) confers hypersensitivity to oxidative stress (H2O2 exposure); (4) results in a modest increase in ROS in the absence of exogenous H2O2 exposure; and (5) results in the production of oxidation products of 5mC, namely 5hmC and 5fC, leading to cellular toxicity. The findings reported here have interesting implications for the concept that such genotoxic and potentially mutagenic consequences of DNMT expression (resulting in 5mC) could be selectively disadvantageous for certain organisms. The other aspect of this work which is important for understanding the biological endpoints of genotoxic stress is the notion that DNA damage per se somehow induces elevated levels of ROS.

      Strengths:

      The manuscript is well-written, and the experiments have been carefully executed providing data that support the authors' proposed model presented in Fig. 7 (Discussion, sources of DNA damage due to DNMT expression).

      Weaknesses:

      (1) The authors have established an informative system relying on expression of DNMTs to gauge the effects of such expression and subsequent induction of 3mC and 5mC on cell survival and sensitivity to an alkylating agent (MMS) and exogenous oxidative stress (H2O2 exposure). The authors state (p4) that Fig. 2 shows that "Cells expressing either M.SssI or M.MpeI showed increased sensitivity to MMS treatment compared to WT C2523, supporting the conclusion that the expression of DNMTs increased the levels of alkylation damage." This is a confusing statement and requires revision, as ALL cells shown in Fig. 2 are expressing DNMTs and have been treated with MMS. It is the absence of AlkB and the expression of DNMTs that cause the MMS sensitivity.

      We thank the reviewer for this and agree that this needs to be clarified with regards to the figure presented and will do so in the revised manuscript. 

      (2) It would be important to know whether the increased sensitivity (toxicity) to DNMT expression and MMS is also accompanied by substantial increases in mutagenicity. The authors should explain in the text why mutation frequencies were not also measured in these experiments.

      This is an important point because it is not immediately obvious that increased sensitivity would be associated with increased mutagenicity (if, for example, 3mC was never a cause of inaccurate DNA repair even in the absence of AlkB).  We will carry out this experiment and include these data in the revised version of the manuscript.  Detailed consideration of the types and sources of mutations is beyond the scope of this manuscript, but we are also working on this and hope to produce data on this in the future.

      (3) Materials and Methods. ROS production monitoring. The "Total Reactive Oxygen Species (ROS) Assay Kit" has not been adequately described. Who is the Vendor? What is the nature of the ROS probes employed in this assay? Which specific ROS correspond to "total ROS"?

      The ROS measurement was with a kit from ThermoFisher: https://www.thermofisher.com/order/catalog/product/88-5930-74.  The probe is DCFH-DA.  This is a general ROS sensor that is oxidised by a large number of cellular reactive oxygen species hence we cannot attribute the signal to a single species.  Use of a technique with the potential to more precisely identify the species involved is something we plan to do in future, but is beyond what we can do as part of this study.  We will include a comment to this effect in the revised version of the manuscript.

      (4) The demonstration (Fig. 4) that DNMT expression results in elevated ROS and its further synergistic increase when cells are also exposed to H2O2 is the basis for the authors' discussion of DNA damage-induced increases in cellular ROS. S. cerevisiae does not possess DNMTs/5mC, yet exposure to MMS also results in substantial increases in intracellular ROS (Rowe et al, (2008) Free Rad. Biol. Med. 45:1167-1177. PMC2643028). The authors should be aware of previous studies that have linked DNA damage to intracellular increases in ROS in other organisms and should comment on this in the text.

      We thank the reviewer for this point.  We note that the increased ROS that we observed occur in the presence of DNMTs alone and in the presence of H2O2, not in the presence of MMS; however, the point that DNA damage in general can promote increased ROS in some circumstances is well taken and we will include a comment on this in the discussion of the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      It is evident that studying leukocyte extravasation in vitro is a challenge. One needs to include physiological flow, culture cells and isolate primary immune cells. Timing is of utmost importance and a reproducible setup is essential. Extra challenges are met when extravasation kinetics in different vascular beds is required, e.g., across the blood-brain barrier. In this study, the authors describe a reliable and reproducible method to analyze leukocyte TEM under physiological flow conditions, including this analysis. That the software can also detect reverse TEM is a plus.

      Strengths:

      It is quite a challenge to get this assay reproducible and stable, in particular as there is flow included. Also for the analysis, there is currently no clear software analysis program, and many labs have their own methods. This paper gives the opportunity to unify the data and results obtained with this assay under label-free conditions. This should eventually lead to more solid and reproducible results.

      Also, the comparison between manual and software analysis is appreciated.

      We thank the Reviewer for their positive evaluation of our manuscript and for highlighting the value of obtaining more reproducible and unbiased results, as well as detection of forward and reverse transmigration with UFMTrack.

      Weaknesses:

      The authors stress that it can be done in BBB models, but I would argue that it is much more broadly applicable. This is not necessarily a weakness of the study but more an opportunity to strengthen the method. So I would encourage the authors to rewrite some parts and make it more broadly applicable.

      We thank the Reviewer for this suggestion. In the revised version of our manuscript, we have now emphasized the broader applicability of UFMTrack to analyze the interaction of immune cells with 2-dimensional endothelial monolayers in various contexts in the abstract, introduction, and discussion sections.

      Reviewer #2 (Public Review):

      Summary:

      This paper develops an under-flow migration tracker to evaluate all the steps of the extravasation cascade of immune cells across the BBB. The algorithm is useful and has important applications.

      Strengths:

      Algorithm is almost as accurate as manual tracking and importantly saves time for researchers.

      We thank the Reviewer for this positive evaluation of our work.

      Weaknesses:

      Applicability can be questioned because the device used is 2D and physiological biology is in 3D. Comparisons to other automated tools were not performed by the authors.

      We thank the Reviewer for pointing our attention to these weaknesses in our manuscript.

      We have clarified in the revised manuscript that using 2D endothelial monolayer models in parallel laminar flow chambers is still a state-of-the-art methodology for studying the multi-step extravasation process of immune cells across endothelial monolayers under physiological flow by in vitro live cell imaging. These models provide excellent optical quality that is not yet achieved in 3D models. We have extended the introduction to emphasize the limitations of existing tools that motivated us to establish UFMTrack. We have furthermore extended the discussion section to highlight the features unique to our UFMTrack framework.

      Reviewer #3 (Public Review):

      Summary:

      The authors aimed to establish a faster and more efficient method of tracking steps of T-cell extravasation across the blood brain barrier. The authors developed a framework to visualize, recognize and track the movement of different immune cells across primary human and mouse brain microvascular endothelial cells without the need for fluorescence-based imaging. The authors succinctly describe the basic requirements for tracking in the introduction followed by an in-depth account of the execution.

      We thank the Reviewer for their positive evaluation of our manuscript and highlighting the value of label-free analysis of the multistep immune cell extravasation cascade with UFMTrack.

      Weaknesses and Strengths:

      Materials & methods and results:

      (1) The methods section also lacks details of the microfluidic device that the authors talk about in the paper. Under physiological shear stress, the T-cells detach from the pMBMEC monolayer, and are hence unable to be detected; however, this observation requires an explanation pertaining to the reason of occurrence and potential solutions to circumvent it to ensure physiologically relevant experimental parameters.

      We thank the Reviewer for pointing out this oversight. We have used a custom-made microfluidic device that has been published and described in detail before. This information has now been included in the Methods Section under Point 7, and the two references describing the flow chamber in depth are mentioned below and have been included in the manuscript.  

      Caroline Coisne, Ruth Lyck and Britta Engelhardt. 2013. Live cell imaging techniques to study T cell trafficking across the blood-brain barrier in vitro and in vivo. Fluids and Barriers of the CNS 10:7 doi: 10.1186/2045-8118-10-7; 21 January 2013

      Ruth Lyck, Hideaki Nishihara, Sidar Aydin, Sasha Soldati and Britta Engelhardt. 2022. Modeling brain vasculature immune interactions in vitro. Angiogenesis, 2nd edition. Editors Patricia D’Amore and Diane Bielenberg. Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a041185

      T cell detachment is a physiologically relevant parameter besides T cell arrest, polarization, crawling, probing, and transmigration during the interaction with an endothelial monolayer. T cell detachment means that post-arrest, the T cell cannot engage adhesion molecules required for subsequent polarization and, eventually, transmigration. 

      (2) The author describes a method for debris exclusion using UFMTrack that eliminates objects of <30 pixels in size from analysis based on a mean pixel size of 400 for T lymphocytes. However, this mean pixel size appears to stem from in-vitro activated CD8 T cells, which rapidly grow and proliferate upon stimulation. In line with this, activated lymphocytes exhibit increased cytoplasmic area, making them appear less dense or “brighter” by phase microscopy compared to naïve lymphocytes, which are relatively compact and subsequently appear dimmer. Given this, it is not clear whether UFMTrack is sufficiently trained to identify naïve human lymphocytes in circulating blood, nor smaller, murine lymphocytes. Analysis of each lymphocyte subtype in terms of pixel size and intensity would be beneficial to strengthen the claim that UFMTrack can identify each of these populations. Additionally, demonstrating that UFMTrack can correctly characterize the behavior of naïve versus activated lymphocytes isolated from murine and human sources would strengthen the claim that UFMTrack can be broadly applied to study lymphocyte dynamics in diverse models without additional training

      We thank the Reviewer for the suggestion to more precisely evaluate the range of cell sizes that can be analyzed by our framework. We have included a visualization of the crawling cell sizes successfully analyzed by UFMTrack in Supplementary Figure 7. It demonstrates that human peripheral blood mononuclear cells, which are almost half the size of the activated mouse CD4 T cells used in these assays, can be successfully segmented, tracked, and analyzed with the UFMTrack framework. Thus, our UFMTrack framework is suitable for broad application to differentially sized immune cells during their interaction with the endothelial cell monolayer under flow.
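
      The size-based debris exclusion raised in this comment can be sketched as a simple area filter. This is a minimal illustration only: the function name, the list-of-areas input, and the example values are assumptions made here and are not taken from the UFMTrack code; only the 30-pixel cut-off comes from the review comment.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not the UFMTrack implementation.
MIN_AREA_PX = 30  # debris cut-off quoted in the review comment


def drop_debris(areas_px, min_area_px=MIN_AREA_PX):
    """Keep segmented objects whose pixel area is at least min_area_px."""
    return [area for area in areas_px if area >= min_area_px]


# Hypothetical object areas in pixels (not measured data).
example_areas = [12, 29, 30, 210, 400, 650]
kept = drop_debris(example_areas)
print(kept)  # [30, 210, 400, 650]
```

      With such a filter, objects well below the typical cell area are discarded, while cells about half the size of a 400-pixel cell are still retained.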

      (3) Average precision was compared to the analysis of UFMTrack but it is unclear how average precision was calculated. This information should have been included in the methods section

      We thank the Reviewer for pointing our attention to the missing information. We have added a subsection, “Performance Analysis”, to the Materials and Methods section, where we describe the statistical methods and the performance metrics used to evaluate the UFMTrack framework.

      (4) CD4 and CD8 T cells exhibit distinct biology and interaction kinetics driven in part by their MHC molecule affinity and distinct receptor expression profiles. Thus, it is unclear why two distinct mechanisms of endothelial cell activation are needed to see differences between the populations.

      We thank the Reviewer for pointing out that different cytokine stimulations of endothelial cells were used in the assays used here to test our UFMTrack to analyze CD4 and CD8 T cell interactions with the endothelial monolayer. While the Reviewer is correct that CD4 and CD8 T cells use different mechanisms to cross the pMBMEC monolayer, as shown by us (doi: 10.1002/eji.201546251) and others, and that recognition of cognate antigen on MHC class I on pMBMECs will arrest CD8 T cells and lead to CD8 T-cell mediated apoptosis (doi: 10.1038/s41467-023-38703-2), the focus of the present study was not on comparing CD4 and CD8 T cell interactions with the pMBMEC monolayer but rather on testing the suitability of UFMTrack to study the different multi-step transmigration of these T cell subsets across the endothelial monolayer.

      (5) The BMECs are barrier tissues but were cultured on µdishes in this study. To study the transmigration of T-cells across the endothelium, the model would have been more relevant on a semi-permeable membrane instead of a closed surface.

      We understand the critique of the Reviewer, but laminar flow chambers with endothelial monolayers still provide a state-of-the-art and established methodology to study immune cell migration across endothelial monolayers by in vitro live cell imaging including endothelial cells forming the blood-brain barrier.  

      (6) Methods are provided for the isolation and expansion of human effector and memory CD4+ T cells. However, there is no mention of specific CD4+ T cell populations used for analysis with UFMTrack, nor a clear breakdown of tracking efficiency for each subpopulation. Further, there is no similar method for the isolation of CD8+ T cell compartments. A clear breakdown of the performance efficiency of UFMTrack with each cell population investigated in this study would provide greater insight into the software’s performance with regard to tracking the behavior and movement of distinct immune populations.

      We thank the Reviewer for this comment. Since a fair performance evaluation requires collecting reliable and consistent manual annotations, in this work we have performed such analysis only for the mouse CD8 T-cell population migrating on the pMBMEC monolayer. We have chosen this as a reference since it is a different cell population than the one the segmentation model was trained on. This provides insight into the performance to be expected when immune cell types other than those used for model development are studied.

      (7) The results section is quite extensive and discusses details of establishment of the framework while highlighting both the pros and cons of the different aspects of the process, for example the limitation of the two models, 2D and 2D+T were highlighted well. However, the results section includes details which may be more fitting in the methods section.

      We thank the Reviewer for highlighting the extensive work carried out in the development of our UFMTrack framework. We decided to include in the results section only the description of key elements and design decisions taken when developing the framework, such as the need to include a time series of images for successful segmentation of the transmigrated cells. At the same time, the majority of implementational details can be found in the Supplementary Material.

      (8) A few statements in the results section lacked literary support, which was not provided in the discussion either, such as support for increased variance of T-cell instantaneous speed on stimulated vs non-stimulated pMBMECs. Another example is the enhancement of cytokine stimulation directed T-cell movement on the pMBMECs that the authors observed but failed to relay the physiological relevance of it. The authors don’t provide enough references for developments in the field prior to their work which form the basis and need for this technology.

      We thank the Reviewer for this comment and for asking for literature references. However, we cannot provide such references as these are original observations we made by employing the UFMTrack framework.  This shows that UFMTrack observes T-cell behaviors that have previously been overlooked. Their physiological relevance will have to be explored in separate studies. We have extended the introduction section to include the details on the existing methods developed in the field, as well as their weaknesses that motivated the development of the UFMTrack framework.

      (9) The rationale for use of OT-1 and 2D2-derived murine lymphocytes is unclear here. The OT-1 model has been generated to study antigen-specific CD8+ T cell responses, while the 2D2 model has been generated to recapitulate CD4 T cell-specific myelin oligodendrocyte glycoprotein (MOG) responses.

      To establish and test the UFMTrack framework, we have made use of the specific T-cell subsets and endothelial cell models we generally use within our research context. Especially for animal work, this is in accordance with the 3R principles, which call for reducing animal experimentation.

      Figures and text:

      (1) There are certain discrepancies and misarrangement of figures and text. For example, discussion of the effect of shear flow on T cell attachment as part of the introduction in figure 1 and then mentioning it in the text again in the results section as part of figure 4 is repetitive.

      We thank the Reviewer for pointing our attention to this misarrangement. We have adjusted the label of Figure 4 to emphasize that this effect is correctly captured by the UFMTrack.

      (2) Section IV, subsection 1 of the results section, refers to ‘data acquisition section above’ in line 279, however the said section is part of materials and methods which is provided towards the end of the manuscript.

      We thank the Reviewer for pointing our attention to this misarrangement. We have adjusted the text to reflect the correct chapter order.

      (3) There are figures in the manuscript that have not been referenced in the results section, for example, figure 3A and B. Figure 1 hasn’t been addressed until subsection 7 of materials and methods

      We thank the Reviewer for pointing our attention to this misarrangement. We have adjusted the text to refer to all figure panels and the clarification of the cell multiplicity estimation in the supplementary information section. References to Figure 1 were added in the introduction section to illustrate the in vitro under flow imaging setup as well as the typical T cell behaviors in such experiments.

      (4) A lack of significance but an observed trend of increased variance of T cell instantaneous speed is reported in line 296-298; however, the graph (figure 4G) shows a significant change in instantaneous speed between non-stimulated and TNFα-stimulated systems. This is misleading to the readers.

      We thank the Reviewer for pointing our attention to this discrepancy. We have expanded the text to indicate a low statistical significance for the TNF and no significance but just a trend for the IL1-beta conditions.

      (5) The authors talk about three beginner experimenters testing the manual T cell tracking process but figure 5 only showcases data from two experimenters without stating the reason for excluding experimenter 1.

      We thank the Reviewer for pointing our attention to this ambiguity. While both the migration analysis and the manual cell tracking were performed by all three beginner experimenters, the cell tracking data for the first one was unfortunately lost due to a hardware failure.

      Discussion:

      (1) While the discussion captures the major takeaways from the paper, it lacks relevant supporting references to relate the observation to physiological conditions and applicability.

      This study is not about the physiological relevance of the microfluidic devices and immune cells used but rather about advancing methodology to analyze dynamic immune cell behavior on endothelial monolayers under physiological flow. Therefore, the discussion does not extend to comparing the physiological relevance of the specific in vitro models employed in this study.   

      (2) The discussion lacks connection to the results since the figures were not referenced while discussing an observed trend

      We thank the Reviewer for pointing our attention to this misarrangement. We have included the references to the relevant figures as well as supporting references.

      (3) The authors briefly looked into mouse and human BMECs and their individual interaction with T cells, but don’t discuss the differences between the two, if any, that challenged their framework.

      We thank the Reviewer for pointing our attention to this weakness. We have added to the discussion section clarifications on the challenges of analyzing the T cell interactions with the HBMEC and the BMDM interactions with the pMBMEC monolayer.

      (4) Even though the imaging tool relies on difference in appearance for detection, the authors talk about lack of feasibility in detecting transmigration of BMDMs due to their significantly different appearance. The statement lacks a problem solving approach to discuss how and why this was the case.

      We thank the Reviewer for pointing our attention to this weakness and apologize for the misleading explanation of the problem of analyzing the BMDM sample. Since the transmigrated part of the macrophages differs in appearance from a transmigrated part of a T cell, its detection by a Deep Neural Network trained on the T cell data is worse than that for the T cells. At the same time, the detection performance before the transmigration is sufficient for the BMDM migration analysis. The potential approaches to alleviate this are added to the discussion section.

      Relevance to the field:

      Utilizing the framework provided by the authors, the application can be adapted and/or utilized for visualizing a range of different cell types, provided they are different in appearance. However, this would require extensive changes to the script and won’t be adaptable in its current form.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors should announce in the abstract that the software analysis Track is downloadable and free to use for all researchers. They may consider providing some sort of helpdesk, although I realize that that may run into too much time.

      As said above, they stress that it can be done in BBB models, but I would argue that it is much more broadly applicable.

      We thank the Reviewer for these suggestions. We have emphasized the broader applicability of UFMTrack in the abstract and pointed out the public availability of the code and data.

      Can they add an experiment that shows that it also works for neutrophils for example? I understand that on paper yes it should work, but the neutrophils are of course faster etc.

      This is an excellent suggestion, but we tested UFMTrack within the current framework of ongoing research, which does not include the investigation of neutrophil transmigration across endothelial monolayers.  

      Also, the combination of different leukocytes in one TEM assay would really be a step forward. If the software can detect different-sized leukocytes, then this should be possible.

      We thank the Reviewer for this suggestion. We have added Supplementary Figure 7, demonstrating the range of cell sizes that were successfully analyzed by the UFMTrack framework throughout our manuscript. We also added a statement to the discussion that according to this data, “simply by discriminating cells by size, it is possible to extend UFMTrack to study the interaction of several types of immune cells migrating on top of a cellular monolayer under flow.”
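
      The idea of discriminating co-migrating cell types by size could look roughly like the sketch below. It is a hedged illustration under stated assumptions: the per-track area lists, the use of the median area, the threshold value, and the labels "small" and "large" are all hypothetical choices made here, not part of the published framework.

```python
# Illustrative sketch only: data layout, threshold and labels are assumptions.
from statistics import median


def label_by_size(track_areas_px, threshold_px):
    """Label each track 'small' or 'large' from its median area over all frames."""
    labels = {}
    for track_id, areas in track_areas_px.items():
        labels[track_id] = "large" if median(areas) >= threshold_px else "small"
    return labels


# Hypothetical per-frame areas (pixels) for three tracked cells.
example_tracks = {
    "t1": [190, 205, 210],
    "t2": [390, 410, 400],
    "t3": [220, 180, 200],
}
size_labels = label_by_size(example_tracks, threshold_px=300)
print(size_labels)  # {'t1': 'small', 't2': 'large', 't3': 'small'}
```

      Using the median over a track rather than a single frame makes the label less sensitive to frame-to-frame changes in apparent cell area, for example while a cell spreads or probes.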

      Extra challenges: can the method also discriminate between paracellular and transcellular migration modes? In particular for T-cells this is known to happen.

      We thank the Reviewer for this suggestion. We have added this to the potential applications of UFMTrack in the discussion section. While this differentiation is not feasible relying solely on the phase-contrast imaging data, UFMTrack can simplify this analysis by automatically providing predictions of the transmigration locations, which can then be used to analyze the fluorescence data of the junctional labels.

      Reviewer #2 (Recommendations For The Authors):

      This paper develops an under-flow migration tracker to evaluate all the steps of the extravasation cascade of immune cells across the BBB. The algorithm is useful and has important applications. There are several points that need to be addressed, particularly about the claims made by the authors.

      Please see the comments below for more details:

      • Lines 88-92: Add a citation for the characteristics of the BBB as a barrier

      We have added two references accordingly.  

      • Lines 94-95: Can the authors indicate what models were used for these studies and how those compare to their in vitro model? In addition, can the authors say whether T cells were manually tracked in this study to translate results to the clinic and whether the results were successful when translated to the clinic? This may enhance the argument that automatic trackers are needed if the translation was not 100% successful

      This introductory paragraph summarizes in vivo and in vitro observations from several laboratories. Although these studies include manual tracking of T cells, they do not necessarily distinguish all sequential steps of the multi-step T cell transmigration cascade. Thus, automated tracking may provide additional insights, allowing for increased translation of findings to the clinic.  

      • Lines 96-98: Citing the work of Roger Kamm and Noo Li Jeon would be helpful here as they pioneered these BBB microfluidic models and have protocol papers on how to build them and how to use them for cancer cell extravasation studies. Roger Kamm has also worked on several extravasation studies with neutrophils, monocytes, and PBMCs from 3D vasculatures in microfluidic devices, under flow using pressurized fluid or recirculating pumps. Mentioning those would be helpful as they are directly related to what the authors are presenting in their paper.

      We thank the Reviewer for this comment, and we consider the work of Roger Kamm and Noo Li Jeon as very valuable for the field. However, these authors have focused on developing functional 3D microfluidic devices, including, e.g., all cells of the neurovascular unit which is not the focus of this present study that solely employed parallel flow chamber devices and endothelial monolayers.  

      • Lines 110-116: Can the authors comment on the use of ImageJ or similar automatic tracking tools and how these compare to the under-flow migration tracker developed in this paper? Several groups use ImageJ to track cellular migration successfully and in an automatic manner with short intervals between each frame. One paper that comes to mind is Chen et al: DOI: 10.1073/pnas.1715932115 where neutrophil migration in 3D was assessed with ImageJ in microfluidic devices of the vasculature. If the authors can highlight differences between their tool and what is currently available and used for automatic tracking (e.g. ImageJ), this would help in understanding the advantages of the migration tracker developed in this paper.

      • Lines 118-121: Add citations for the current state of the art for T cell extravasation tracking

      We thank the Reviewer for these suggestions. We have extended the introduction to add more details on the available tools for tracking migrating immune cells and their limitations, as well as the discussion section to emphasize the features unique to the developed UFMTrack framework.

      • Figure 1: The device used by the authors is considered to be a 2D microfluidic device with a monolayer of mouse brain endothelial cells. I would recommend the authors to carefully revise the claims made in the paper to mention that this is a 2D device as opposed to a 3D device, in order to not mislead readers who may be expecting these analyses to be performed in 3D vasculatures.

      We thank the Reviewer for this suggestion. We have included in the summary the mention of the 2-dimensional nature of the employed BBB model.

      • Figure 1: The T cells used in this study are not fluorescently-labeled but the authors mention that this is an issue from current state-of-the-art tools. I would recommend that the authors remove this point as being an issue because it is not addressed in their paper. The T cells are also not labeled in this study so this limitation of other systems is not addressed in this paper.

      We apologize to the Reviewer as we do not understand this question. Many experimental conditions do not allow the study of fluorescently tagged T cells. Therefore, UFMTrack is tailored to follow and analyze T cells and other immune cells during their interaction with endothelial monolayers independent of a fluorescence tag.

      • Figure 1: Was the shear stress controlled manually with a syringe? Or with the use of a pressure controller? I would clarify this aspect and discuss human errors that can be introduced from manually controlling the pressure applied to the monolayer.

      We thank the Reviewer for pointing our attention to this ambiguity. We have added a mention of the automated syringe pump used to control the shear stress in the text where the values of shear stress applied to the sample are first mentioned.

      • Figure 1: Does T cell attachment occur within the first 5 minutes? Can the authors comment on how they chose this timeline and the percentage of T cells that are washed off at the second step at 1.5 dynes/cm^2? Is 30 seconds enough to ensure all the non-adhered T cells are washed off with 1.5 dynes/cm^2?

      Superfusion of the T cells over the endothelial monolayer is performed under 0.5 dynes/cm^2 to allow the T cells to settle on the endothelial cell monolayer under flow. After the increase to physiological flow, non-adherent T cells detach within 30 seconds, as described by the Reviewer. We have included in the Methods Section Point 7 the references describing in depth the design of the flow chamber device and methods used here.

      • Line 154: How many images were used in the training vs. testing dataset for T cell migrations?

      We thank the Reviewer for pointing our attention to this missing information. We have added the sizes of the training and validation datasets. Specifically, the 226 MPix of available imaging data was split into 154 MPix training and 37 MPix validation sets. The gap in between was introduced to avoid a correlation between validation and training set that would compromise the performance evaluation.
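
      From the stated sizes, the unused gap works out to 226 - 154 - 37 = 35 MPix. The sketch below illustrates one way such a split with a buffer could be laid out. The contiguous ordering (training block first, validation block last) and the function name are assumptions made for illustration; the response does not state how the gap was arranged.

```python
# Illustrative sketch only: the contiguous layout is an assumption.
TOTAL_MPIX = 226  # available imaging data, as stated in the response
TRAIN_MPIX = 154  # training set, as stated
VAL_MPIX = 37     # validation set, as stated

gap_mpix = TOTAL_MPIX - TRAIN_MPIX - VAL_MPIX
print(gap_mpix)  # 35


def split_with_gap(n_items, n_train, n_val):
    """Return (train, val) index ranges with an unused buffer between them."""
    if n_train + n_val > n_items:
        raise ValueError("training and validation sizes exceed the data size")
    train = range(0, n_train)
    val = range(n_items - n_val, n_items)
    return train, val


train_idx, val_idx = split_with_gap(TOTAL_MPIX, TRAIN_MPIX, VAL_MPIX)
print(train_idx[-1], val_idx[0])  # 153 189
```

      Leaving the buffer unused keeps neighbouring, and therefore correlated, image regions from appearing on both sides of the split.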

      • Are the supplementary videos at real speed or accelerated?

      We thank the Reviewer for pointing our attention to this missing information. The videos are sped up by a factor of 96. We have added this information to the Supplementary video descriptions.  

      • Lines 208-216: Can the authors comment on how their initial adhesion timeframe of 30 sec before starting the recording at 5.5 min affects the number of T cells with rapid displacement? 30 seconds may not be enough to ensure T cells have adhered to the endothelium.

      Please see our comment above. The methodology used in the present assays has been set up and validated in numerous publications. We have included in the Methods Section under Point 7 the references describing in depth the design of the flow chamber device and the methods used here.  

      • Lines 275-277: Was the number of testing images 18? Can the authors comment on how this compares to training dataset size and whether these numbers are enough to achieve robust results?

      We apologize for this ambiguity in our manuscript. The framework was evaluated on 18 imaging datasets, each corresponding to 32 minutes of recording, not 18 images. We have added this clarification to the “CD4+ T cell analysis” subsection. The total size of these datasets is 18 datasets * 191 timeframes/dataset * 9.9 MPix/frame ≈ 34,000 MPix (about 34 GPix).
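
      As a quick check of this product, the three stated factors can be multiplied directly. Taking them at face value, the result is about 34,000 MPix, that is, roughly 34 GPix. The variable names below are chosen for illustration only.

```python
# Worked arithmetic using only the factors stated in the response.
N_DATASETS = 18
FRAMES_PER_DATASET = 191
MPIX_PER_FRAME = 9.9

total_mpix = N_DATASETS * FRAMES_PER_DATASET * MPIX_PER_FRAME
total_gpix = total_mpix / 1000

print(f"{total_mpix:.0f} MPix")  # 34036 MPix
print(f"{total_gpix:.1f} GPix")  # 34.0 GPix
```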

      • Figure 4B: Can the authors add statistics here? Individual datapoints on the error bars would be helpful too. 

      We thank the Reviewer for pointing our attention to this weakness. The data corresponds to the statistical errors as evaluated based on all cells in the 18 datasets. We have added the total number of cells in each of the endothelium stimulation conditions to the text.

      • Figure 4C-J: Can the authors put individual datapoints here as well and explain whether they considered each T cell to be one datapoint or each endothelium (averaging all T cells) to be one datapoint? 

      We thank the Reviewer for this suggestion. However, adding about one thousand points corresponding to each cell would be impractical. We thus present the distributions of the metrics evaluated from the data as histograms on violin plots instead of swarm plots.

      • Figure 4: Did the authors wash the monolayers before introducing T cells? Soluble unbound cytokines may still be present and there are two different questions that would be studied here: “Is the inflamed endothelium affecting T cell migration?” (if washing was performed) or “Is T cell and microenvironmental inflammation affecting T cell migration?” (if no washing was performed)

      The endothelial monolayers are “washed” by starting the flow in the flow chamber device before superfusing the T cells over the endothelial monolayer. We agree that our flow chamber device combined with UFMTrack will make it possible to address all these questions.

      • Figure 4I: Are all the T cells decelerating? (negative AM speed)

      We thank the Reviewer for this question. The cells are moving along the flow, which, in our experiments, is from left to right. The vector of speed is thus pointing against the x-axis, and thus the AM speed is negative.

      • Lines 302-306: Please explain how this compares to ImageJ or similar trackers that can achieve similar outputs. 

      We thank the Reviewer for this question. We have added a statement in the “T-cell tracking” section emphasizing that standard trackers are incapable of correctly capturing large displacements.

      • Lines 306-309: It is not lower for TNF stimulation though. How do the authors address this? TNF is also a pro-inflammatory cytokine.

      We have previously shown that stimulation of pMBMECs with IL-1 and TNF-a induces different cell surface levels of ICAM-1 and VCAM-1, which will influence T cell behavior on the pMBMEC monolayer.  

      • Lines 313-315: Could this be because the monolayer was not washed and soluble cytokines affected T cell response directly?

      Please see our answer to lines 306-309.  

      • Lines 319: Please cite Roger Kamm and Noo Li Jeon’s papers on BBB models with human BMECs, pericytes and astrocytes in 3D microfluidic devices.

      We thank the Reviewer again for pointing out these studies. As mentioned above, as our present study does not explore 3D models of the BBB, we think it does not fit into the framework of our study to elaborate on 3D models of the BBB. In addition, this would require the inclusion of a discussion of the work of others like, e.g., Peter Searson and others.  

      • Figure 5: Several statistics are missing from parts of the figure. Please add those.

      We apologize – but we do not understand which statistical analysis the Reviewer is missing from this Figure.  

      • Can the authors comment on the number of T cells perfused over the monolayer and if this ratio of T cells to endothelial cells makes physiological sense? Too many T cells may result in endothelium inflammation and increased diapedesis.

      The number of T cells used to superfuse over the endothelial monolayer is tested to avoid aggregation of T cells in suspension and thus artificial interactions with the endothelial monolayer. T cell behavior on the pMBMEC monolayer remains the same over a 10-fold dilution range.

      • Lines 381-383: How does this compare to analyses that look at the cross-section of the endothelium? It is difficult to assess transmigration looking at the top view of the endothelium. Perhaps, cross-section assessments will identify differences in manual vs. automatic tracking.

      There is, to the best of our knowledge, no microscopic device that would allow in vitro live cell imaging of a live endothelial monolayer (that is, in the presence of tissue culture medium) from the side at a resolution sufficient to define transmigration. Our current study rather shows that UFMTrack can distinguish cells moving above or below the endothelial monolayer.

      • Figure 5J: This is probably the most important argument of the paper. If the authors can show statistical differences in their graph, this would greatly help convince readers that this tool is necessary and actually computationally efficient compared to manual work by researchers.

      We thank the Reviewer for this suggestion. However, comparing a single data point for the automated measurement with four manual experimenter analysts is not a statistically sound comparison. We believe that Figure 5K clearly shows the factor-of-5 difference in analysis speed compared to manual analysis. More importantly, the automated analysis consumes only machine time, so the experimenter does not need to invest even 1/5th of the original analysis time.

      • Figure 6: Did the authors use autologous immune cells and endothelial cells? This is particularly relevant with the use of human-derived T cells (line 436) on the BMEC monolayer. Can the authors comment on non-self reactivity by the T cells encountering BMEC from another human subject?

      Autologous T cell interaction with BMECs would only be possible when using hiPSC-derived EECM-BMECs and the T cells from the same individual. All other experimental frameworks will not include autologous interactions. This is the experimental framework used by most authors studying immune cell interactions with commercially available donors. We have not studied alloreactive interactions in our assays and thus cannot further comment.  

      • Figure 6M,N,O: How does this compare to ImageJ for tracking of fluorescent cells? I recommend the authors to try that, at least for this section, as this may enhance their argument for their tool vs. standard tools like ImageJ if success rates are higher for their tool.

      We thank the Reviewer for this suggestion. We included a note on the analysis of the fluorescent datasets using the TrackMate plugin for ImageJ, performed previously in our lab, in the “Human T cells on immobilized recombinant BBB adhesion molecules” subsection.

      • Figure 6: Please put individual datapoints on the bar or violin plots where they are missing.

      We thank the Reviewer for this suggestion. However, adding about one thousand points, one for each cell, would be impractical. We therefore present the distributions of the metrics evaluated from the data as histograms on the violin plots instead of swarm plots.

      • Lines 467-471: This argument is important and should be mentioned earlier in the introduction.

      Another point that can be mentioned is the application of this platform to imaging modalities in vivo (mouse or human) given that there is no fluorescent staining in these cases. This review may be relevant: https://doi.org/10.1002/jcb.10454

      We thank the Reviewer for this suggestion. We have clarified in the introduction that UFMTrack does not require fluorescent labels of the imaged migrating cells and relies solely on the phase contrast imaging data.

      • Discussion: Please address a few more potential applications to this study. One can be cancer and immune infiltration.

      We thank the Reviewer for this suggestion. We have elaborated on additional potential applications in the Discussion section.

      Reviewer #3 (Recommendations For The Authors):

      (1) Line 327-328: The authors talk about ‘As we have previously shown…pMBMEC monolayers differs between CD4+ and CD8+ cells…’. Where was this shown? If it was in a previously published article, please provide a reference.

      We have added these missing references.  

      (2) Line 353: Please provide clear location on where to find the associated information instead of stating ‘see below’.

      We thank the Reviewer for drawing our attention to this ambiguity. We have corrected the phrase to “see next paragraph”.

      (3) Line 439: Please correct the acronym to BMECs

      We thank the Reviewer for drawing our attention to this typo. We have corrected it.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #2:

      No further questions, but please do add a sentence or two about the lack of these additional points in the discussion as a limitation to the study.

      We have included additional “limitations of the study” in the Discussion Section.

      Reviewer #3:

      The authors have added to the discussion some critical remarks about the limitations of the study, which will help in the assessment of the conclusions.

      In sum, the manuscript has significantly improved during the revision.

      Some minor points should be changed, though.

      Page 18 marked: "What causes an age-dependent decrease in mitochondrial OXPHOS genes across tissues, however, is largely unknown." I assume, the authors do not suggest that the abundance of genes is reduced, which means elimination of DNA? Be more precise about this.

      We thank the reviewer for pointing this out. We have clarified this to mean “OXPHOS gene expression” and made a couple of changes accordingly.

      Page 18 marked: a paragraph was added addressing the increase in mitochondrial respiration in the heart; this should be discussed in light of the literature, as was done for skeletal muscle in the following paragraph.

      We have included additional paragraphs in the Discussion Section to talk about increased mitochondrial respiration in the aging heart in the context of published literature.

      Figure 2: it was asked for error bars for the OCR measurements. Response: We have added the error bars and statistical significance to revised Figure 2; however, is it correct that there are no significant differences?

      Figure 2 ranks tissues based on the OCR values within a single group of mice (male or female, young or old) and is not a comparison between male vs female, or young vs old. For this reason, no statistics were included as they are not needed here. The goal of this figure is to highlight the OCR distribution across tissues within a single sex and age group.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This important study reveals that the malaria parasite protein PfHO, though lacking typical heme oxygenase activity, is vital for the survival of Plasmodium falciparum. Structural and localization analyses showed that PfHO is essential for apicoplast maintenance, particularly in gene expression and biogenesis, indicating a novel adaptive role for this protein in parasite biology. While the results supporting the claims of the authors are convincing, the lack of data defining a molecular understanding or mechanism of action of the protein in question limits the impact of the study. 

      We appreciate the positive assessment. We agree that further mechanistic understanding of PfHO function remains a key future challenge. Indeed, we made extensive efforts to unravel the molecular interactions and mechanisms that underpin the critical function of PfHO. We elucidated key interactions between PfHO and the apicoplast genome, reliance of these interactions on the electropositive N-terminus, association of PfHO with DNA-binding proteins, and a specific defect in apicoplast mRNA levels upon PfHO knockdown. The major limitation we faced in further defining PfHO function is the general lack of understanding of apicoplast transcription and broader gene expression in this organelle. That limitation and the challenges to overcome it go well beyond our study and will require concerted efforts across several manuscripts (likely by multiple groups) to define the mechanistic features of apicoplast gene expression. We look forward to contributing further molecular understanding of PfHO function as broader understanding of apicoplast transcription emerges.

      Public Reviews:

      Reviewer #1 (Public Review):

      Malaria parasites detoxify free heme molecules released from digested host hemoglobins by biomineralizing them into inert hemozoin. Thus, why malaria parasites retain PfHO, a dead enzyme that loses the capacity of catabolizing heme, is an outstanding question that has puzzled researchers for more than a decade. In the current manuscript, the authors addressed this question by first solving the crystal structure of PfHO and aligning it with structures of other heme oxygenase (HO) proteins. They found that the N-terminal 95 residues of PfHO, which failed to crystalize due to their disordered nature, may serve as signal and transit peptides for PfHO subcellular localization. This was confirmed by subsequent microscopic analysis with episomally expressed PfHO-GFP and a GFP reporter fused to the first 83 residues of PfHO (PfHO N-term-GFP). To investigate the functional importance of PfHO, the authors generated an anhydrotetracycline (aTC) controlled PfHO knockdown strain. Strikingly, the parasites lacking PfHO failed to grow and lost their apicoplast. Finally, by chromatin immunoprecipitation (ChIP), quantitative PCR/RT-PCR, and growth assays, the authors showed that both the cognate N-terminus and HO-like domain were required for PfHO function as an apicoplast DNA interacting protein.

      The authors systematically applied multidisciplinary approaches to address this difficult question: what is the function of this enzymatically dead PfHO? I enjoyed reading this manuscript and its thoughtful discussion. This study is not only of clinical importance for antimalarial treatments but also deepens our understanding of protein function evolution. While I understand these experiments are challenging to conduct in malaria parasites, the data quality of some of the experiments could be improved. For example, most of the Western blots and Southern blots are not of high quality.

      We thank the reviewer for the positive comments but are a bit puzzled by the final statement about western and Southern blot quality. We agree that the two anti-PfHO western blots probed with custom antibody (Fig. 3- source data 2 and 8) have substantial background signal in the higher molecular mass region >75 kDa. However, we note that the critical region <50 kDa is clear in both cases and readily enables target band visualization. All other western blots probing GFP or HA epitopes are of high quality with minimal off-target background. We present two Southern blot images. We agree that the signal is somewhat faint for the Southern blot demonstrating on-target integration of the aptamer/TetR-DOZI plasmid (Fig. 3- fig. supplement 4), although we note that the correct band pattern for integration is visible. We also note that the accompanying genomic PCR data is unambiguous. The Southern blot for GFPDHFRDD incorporation into the PfHO locus (Fig. 3- fig. supplement 1) has clear signal and strongly supports on-target integration. The minor background signal in the lower left region of the image does not extend into the critical lanes nor impact interpretation of correct clonal integration.

      As noted below, we have obtained a second western blot image to evaluate the decrease in PfHO protein expression in -aTC conditions. This revised image, which we now include in Fig. 3, shows clean detection of the PfHO signal in the critical molecular mass region below 40 kDa in +aTC conditions and substantial loss of this signal in -aTC conditions (relative to HSP60 loading control).

      Reviewer #2 (Public Review):

      Summary: 

      Blackwell et al. investigated the structure, localization, and physiological function of Plasmodium falciparum (Pf) heme oxygenase (HO). Pf and other malaria parasites scavenge and digest large amounts of hemoglobin from red cells for sustenance. To counter the potentially cytotoxic effects of heme, it is biomineralized into hemozoin and stored in the food vacuole. Another mechanism to counteract heme toxicity is through its enzymatic degradation via heme oxygenases. However, it was previously found by the authors that PfHO lacks the ability to catalyze heme degradation, raising the intriguing question of what the physiological function of PfHO is. In the current contribution, the authors determine that PfHO localizes to the apicoplast, determine its targeting sequence, establish the essentiality of PfHO for parasite viability, and determine that PfHO is required for proper maintenance of apicoplasts and apicoplast gene expression. In sum, the authors establish an essential physiological function for PfHO, thereby providing new insights into the role of PfHO in plasmodium metabolism. 

      Strengths: 

      The studies are rigorously conducted and the results of the experiments unambiguously support a role for PfHO as being an apicoplast-targeted protein required for parasite viability and maintenance of apicoplasts. 

      Weaknesses: 

      While the studies conducted are rigorous and support the primary conclusions, the lack of experiments probing the molecular function of PfHO limits the impact of the work. Nevertheless, the knowledge that PfHO is required for parasite viability and plays a role in the maintenance of apicoplasts is still an important advance.

      We appreciate the positive assessment. We agree that further mechanistic understanding of PfHO function remains a key future challenge. Indeed, we made extensive efforts to unravel the molecular interactions and mechanisms that underpin the critical function of PfHO. We elucidated key interactions between PfHO and the apicoplast genome, reliance of these interactions on the electropositive N-terminus, association of PfHO with DNA-binding proteins, and a specific defect in apicoplast mRNA levels upon PfHO knockdown. The major limitation we faced in further defining PfHO function is the general lack of understanding of apicoplast transcription and broader gene expression. That limitation and the challenges to overcome it go well beyond our study and will require concerted efforts across several manuscripts (likely by multiple groups) to define the mechanistic features of apicoplast gene expression. We look forward to contributing further molecular understanding of PfHO function as broader understanding of apicoplast transcription emerges.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations For The Authors): 

      Specifically, I would like to see the expression of PfHO in the 3D7 strain and PfHOaptamer/TetR-DOZI parasites detected by PfHO antibody on the same blot. The reason is that while most of the western blots show that PfHO appears as both pro- and processed-form, Figure 3-S5B shows only the processed-form of PfHO in all life stages of 3D7. It would be interesting to find out if the processing of PfHO1 is strain/stage-specific, and whether it is regulated by heme levels. It may also be interesting to find out if the pro-form of PfHO is also functional (i.e. mutate the cleavage site). 

      We agree with the reviewer that Fig. 3- figure supplement 5B shows predominant detection of a single band for PfHO in untagged 3D7 parasites. In our experience, the detection of the unprocessed, pro form of PfHO can vary idiosyncratically with different experiments and cultures. In support of this variable detection of unprocessed PfHO in 3D7, we note in Fig. 3A that we detected both the unprocessed and processed forms of PfHO in a western blot of endogenously tagged PfHO-GFP-DHFRDD in 3D7 parasites with an intact apicoplast. We agree with the reviewer that future studies of stage-dependent processing of PfHO may give insights into conditions that favor or disfavor detection of the unprocessed protein. 

      Given prior evidence for vestigial heme binding by PfHO (Sigala et al. JBC 2012), we considered whether such heme binding might modulate PfHO expression, stability, and/or function. It is unknown if heme is present inside the apicoplast, and we currently lack evidence for heme-dependent function or expression by PfHO. Future studies can test this possible dependence.

      Regarding processing and possible function of the cleaved peptide, we note that the N-terminal 18 amino acids are expected to constitute the signal peptide that is cleaved co-translationally with import into the ER. Our data indicate that PfHO undergoes further processing upon import into the apicoplast to remove a further 15 residues. We currently have no evidence nor expectation that these additional residues contribute to PfHO function beyond targeting to the apicoplast.

      I am also confused as to why the authors used rabbit anti-PfHO and rabbit anti-Ef1α on the same blot for Figure 3C, which makes it difficult to appreciate the expression changes of PfHO. Given the high non-specific background of PfHO antibody shown by other Western blots (Figure 3 - Source data 2), I would like to see a blot stained with only PfHO antibody to show that expression of PfHO has been efficiently reduced in the absence of aTC. 

      Bands for Ef1α (50 kDa) and untagged PfHO (~32 kDa) are readily distinguished by western blot analysis based on their distinct molecular masses and electrophoretic mobilities. We agree that staining with the anti-PfHO antibody resulted in background bands in other regions of the gel image, especially in the higher molecular mass region >75 kDa. We note that additional strong evidence for down-regulation of PfHO expression is provided in Fig. 3- figure supplement 6, which shows specific loss of PfHO mRNA transcript levels in -aTC conditions by RT-qPCR. 

      Nevertheless, we have followed the reviewer’s suggestion and provided a new WB image of PfHO expression ±aTC (probed only with rabbit anti-PfHO antibody) that shows strong down-regulation of PfHO protein levels in -aTC conditions, consistent with the strong growth phenotype observed. We have inserted this revised, cleaner western blot image into Fig. 3 (along with detection of HSP60 levels in replicate samples as loading control) and placed the prior image into Fig. 3- figure supplement 6. In both cases, densitometry analysis indicates an 80-85% reduction in PfHO levels in -aTC conditions.
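
      The percent reduction quoted above is the kind of figure usually obtained by normalizing each target band to its loading control before comparing conditions. The following is a minimal sketch of that arithmetic, not the authors' actual analysis pipeline: the function name and the band intensities are hypothetical values chosen only for illustration.

```python
def percent_reduction(target_plus, loading_plus, target_minus, loading_minus):
    """Percent loss of a target band in the -aTC lane relative to the +aTC lane.

    Each target intensity is first divided by the loading-control intensity
    from the same lane, so that unequal sample loading does not distort
    the comparison. All inputs are background-subtracted band intensities
    in arbitrary units.
    """
    norm_plus = target_plus / loading_plus      # normalized signal, +aTC
    norm_minus = target_minus / loading_minus   # normalized signal, -aTC
    return 100.0 * (1.0 - norm_minus / norm_plus)


# Hypothetical intensities (arbitrary units), NOT measured data.
reduction = percent_reduction(
    target_plus=1000.0, loading_plus=500.0,
    target_minus=180.0, loading_minus=450.0,
)
print(f"{reduction:.1f}% reduction")
```

      Normalizing within each lane before taking the ratio is a design choice that keeps the result independent of how much total protein was loaded in either condition.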

      The authors proposed that PfHO interacts with apicoplast genome DNA via the electropositive N-terminus. Interestingly, these positively charged residues are not conserved between Plasmodium, Theileria, and Babesia. I will be curious to follow the authors' future work to investigate the function of this electropositive N-terminus, possibly by comparative and mutagenesis analysis. 

      We agree that further molecular studies of DNA-binding determinants by PfHO and its N-terminus will be insightful.

      The Quantitative RT-PCR analysis revealed that loss of PfHO specifically resulted in decreased apicoplast RNA. I wonder if the authors plan to conduct RNAseq analysis on the PfHO knockdown strain across multiple life stages, to get a clearer picture of PfHO function in malaria parasites. 

      Our RT-qPCR data across multiple asexual stages prior to organelle loss indicate that abundance of all apicoplast-encoded transcripts drops precipitously and uniformly upon PfHO knockdown (Fig. 5- figure supplement 7). Given the small size of the apicoplast genome and the polycistronic nature of apicoplast transcription, we assume that RNA-Seq studies would result in a similar observation. We hypothesize that PfHO knockdown and subsequent dysfunctions may interfere with RNA polymerase assembly on DNA and/or processivity. We are currently testing these hypotheses.

      I noticed that the authors did not discuss the function of PfHO in apicoplast organelle biogenesis. Since ClpM (previously termed ClpC) is the only apicoplast-encoded Clp subunit that is essential for apicoplast biogenesis, do the authors think that PfHO knockdown parasites lost their apicoplast due to decreased ClpM expression? If that were the case, would episomal expression or nuclear knock-in of ClpM rescue PfHO deficiency in the absence of isopentenyl pyrophosphate (IPP)?

      We share the reviewer’s curiosity to understand how loss of apicoplast transcripts leads to organelle dysfunction and defective IPP synthesis. We agree that ClpM function may be critical to import of nuclear-encoded proteins necessary for apicoplast function. SufB encoded on the apicoplast genome is also expected to be essential for Fe-S cluster synthesis in the apicoplast and to be required for Fe-S-dependent IPP synthesis. We have expanded the first Discussion section to address these possible connections.

      Minor: 

      (1) None of the microscopy photos have scale bars. 

      We have added scale bars to all microscopy images.

      (2) Multiple microscopy pictures show strange patches around the fluorescent signals (a grey square distinguishes from the black background). This is especially evident in Figure 2 S2. Was it caused by the reduction of the original pictures? 

      We have reviewed all fluorescence microscopy images but are unable to identify the issue noted by the reviewer. We have uploaded new versions of all images to include scale bars (as requested above), and we hope that this update resolves the issue observed by the reviewer. We are happy to further troubleshoot and address if the reviewer continues to see these artifacts and can provide further information.

      (3) A description of how Southern blotting was performed is missing. 

      We thank the reviewer for bringing this omission to our attention. We have added a description of the Southern blot methods to the section on genome editing.

      (4) Figure 3B: should be "αGFP: 12nm", not "αPfHO1: 12nm". 

      We have modified this labeling to read “αGFP (PfHO): 12 nm”.

      (5) Figure 3C: which clone of PfHO knockdown was used in all the following figures? How many clones were tested in the following figures (did they show consistent phenotype)? 

      The polyclonal culture of PfHO-aptamer/TetR-DOZI knockdown parasites from transfection 11 was used for growth assay and western blot experiments, since there was no evidence by PCR or Southern blot for the wildtype PfHO locus. We have elaborated on these details in the Methods section.

      Reviewer #2 (Recommendations For The Authors): 

      In Figure 2 and Figure 3B, to address rigor and reproducibility, the authors should state the number of parasites analyzed and if there was any variation in localization. For instance, did all of the parasites analyzed have apicoplast localization of heme oxygenase or was there a distribution of apicoplast and non-apicoplast localization? 

      Localization by fluorescence microscopy of episomal and endogenous tagged PfHO is presented in Fig. 2, Fig. 2- fig. supplements 1 and 2, and Fig. 3- fig. supplement 2. Localization by immunogold EM is presented in Fig. 3B and Fig. 3- fig. supplement 3. In all cases 3-4 representative images are presented that support exclusive localization of PfHO to the apicoplast. We imaged ≥10-20 additional parasites in all cases (and across distinct transfections and biological samples) that also supported exclusive localization to the apicoplast. We have modified the figure legends and methods description to note these replicate values. Finally, we note that IPP rescue of parasite viability upon PfHO knockdown strongly supports the conclusion that the critical and essential function of PfHO impacts the apicoplast, consistent with its exclusive detection in that organelle by microscopy.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Comment 1. Mohseni and Elhaik's article offers a critical evaluation of Geometric Morphometrics (GM), a common tool in physical anthropology for studying morphological differences and making phylogenetic inferences. I read their article with great interest, although I am not a geneticist or an expert on PCA theory since the problem of morphology-based classification is at the core of paleoanthropology.

      The authors developed a Python package for processing superimposed landmark data with classifier and outlier detection methods, to evaluate the adequacy of the standard approach to shape analysis via modern GM. They call into question the accuracy, robustness, and reproducibility of GM, and demonstrate how PCA introduces statistical artefacts specific to the data, thus challenging its scientific rigor. The authors demonstrate the superiority of machine learning methods in classification and outlier detection tasks. The paper is well-written and provides strong evidence in support of the authors' argument. Thus, in my opinion, it constitutes a major contribution to the field of physical anthropology, as it provides a critical and necessary evaluation of what has become a basic tool for studying morphology, and of the assumptions allowing its application for phylogenetic inferences. Again, I am not an expert in these statistical methods, nor a geneticist, but the authors' contribution is of substantial relevance to our field (physical anthropology). The examples of NR fossils and HLD 6 are cases in point, in line with other notable examples of critical assessment of phylogenetic inferences made on the basis of PCA results of GM analysis. For example, see Lordkipanidze et al.'s (2014) GM analyses of the Dmanisi fossils, suggesting that the five crania represent a single regional variant of Homo erectus; and see Schwartz et al.'s (2014) comment on their findings, claiming that the dental, mandibular, and cranial morphology of these fossils suggest taxic diversity. Schwartz et al. (2014) ask, "Why did the GMA of 78 landmarks not capture the visually obvious differences between the Dmanisi crania and specimens commonly subsumed H. erectus? ... one wonders how phylogenetically reliable a method can be that does not reflect even easily visible gross morphological differences" (p. 360).

      As an alternative to the PCA step in GM, the authors tested eight leading supervised learning classifiers and outlier detection methods on three-dimensional datasets. The authors demonstrated inconsistency of PCA clustering with the taxonomy of the species investigated for the reconstruction of their phylogeny, by analyzing a database comprising landmarks of 6 known species that belong to the Old World monkeys tribe Papionini, using PCA for classification. The authors also demonstrated that high explained variance should not be used as an estimate of high accuracy (reliability). Then, the authors altered the dataset in several ways to simulate the characteristic nature of paleontological data.

      The authors excluded taxa from the database to study how PCA and alternative classifiers are affected by partial sampling, and the results presented in Figures 4 and 5, among others, are quite remarkable in showing the deviations from the benchmark data. These results expose the perils of applying PCA and GM for interpreting morphological data. Furthermore, they provide evidence showing that the alternative classifiers are superior to PCA, and that they are less susceptible to experimenter intervention. Similar results, i.e., inconsistencies in the PC plots, were obtained in examinations of the effect of removing specimens from the dataset and in the interesting test of removing landmarks to simulate partial morphological data, as is often the case with fossils. To test the combined effect of these data alterations, the authors combined removal of taxa, specific samples, and landmarks from the dataset. In this case, as well, the PCA results indicate deviation from the benchmark data. However, the ML classifiers could not remedy the situation. The authors discuss how these inconsistencies may lead to different interpretations of the data, and in turn, different phylogenetic conclusions. Lastly, the authors simulated the situation of a specimen of unknown taxonomy using outlier detection methods, demonstrating LOF's ability to identify a novelty in the morphospace.
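
      The outlier-detection idea mentioned above can be illustrated with a small Local Outlier Factor (LOF) computation in novelty mode. This is a minimal sketch in plain Python under stated assumptions: it uses toy 2-D points rather than superimposed 3-D landmark coordinates, the function name `lof_novelty_scores` is hypothetical, and it is not the authors' package or their implementation.

```python
import math


def lof_novelty_scores(train, queries, k=3):
    """Local Outlier Factor of each query point relative to a reference set.

    Scores near 1 mean the query sits in a region about as dense as its
    neighbours; scores well above 1 flag a candidate novelty.
    Assumes the reference points are distinct (no zero distances).
    """
    # k nearest neighbours of every reference point, excluding the point itself
    nbrs = []
    for i, p in enumerate(train):
        cand = sorted((math.dist(p, o), j) for j, o in enumerate(train) if j != i)
        nbrs.append(cand[:k])
    # k-distance: distance from each reference point to its k-th neighbour
    k_dist = [n[-1][0] for n in nbrs]

    def lrd(neigh):
        # local reachability density: inverse of the mean reachability distance,
        # where reach-dist(p, o) = max(k-distance(o), d(p, o))
        reach = [max(k_dist[j], d) for d, j in neigh]
        return len(reach) / sum(reach)

    train_lrd = [lrd(n) for n in nbrs]

    scores = []
    for q in queries:
        neigh = sorted((math.dist(q, o), j) for j, o in enumerate(train))[:k]
        mean_neighbour_lrd = sum(train_lrd[j] for _, j in neigh) / len(neigh)
        scores.append(mean_neighbour_lrd / lrd(neigh))
    return scores


# Toy reference "morphospace": a regular 5 x 5 grid of points (illustrative only).
reference = [(float(x), float(y)) for x in range(5) for y in range(5)]
# One query inside the cloud, one far away from it.
scores = lof_novelty_scores(reference, [(2.5, 2.5), (20.0, 20.0)], k=3)
print(scores)
```

      Because the score compares local densities rather than absolute distances, a specimen is flagged only when it is isolated relative to how tightly its nearest reference specimens cluster.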

      References

      Bookstein FL. 1991. Morphometric tools for landmark data: geometry and biology [Orange book]. Cambridge New York: Cambridge University Press.

      Cooke SB, and Terhune CE. 2015. Form, function, and geometric morphometrics. The Anatomical Record 298:5-28.

      Lordkipanidze D, et al. 2013. A complete skull from Dmanisi, Georgia, and the evolutionary biology of early Homo. Science 342: 326-331.

      Schwartz JH, Tattersall I, and Chi Z. 2014. Comment on "A complete skull from Dmanisi, Georgia, and the evolutionary biology of Early Homo". Science 344(6182): 360-a.

      The reviewer considered our work a contribution “of substantial relevance to our field (physical anthropology)”. We are grateful for this evaluation and for the thorough review and insightful comments on our manuscript, which helped us improve its quality further. Your remarks regarding the superiority of machine learning methods over traditional GM approaches, as well as the challenges and implications highlighted in our findings, resonate deeply with the core objectives of our research. The references to previous studies and their relevance to our work underscore the broader implications of our findings for the interpretation of morphological data in evolutionary studies. We are thankful for your remarks regarding the debate surrounding the Dmanisi fossils. We covered it in our introduction (lines 161-174):

      “Finally, PCA also played a part in the much-disputed case of the Dmanisi hominins (39, 40). These early Pleistocene hominins, whose fossils were recovered at Dmanisi (Georgia), have been a subject of intense study and debate within physical anthropology. Despite their small brain size and primitive skeletal architecture, the Dmanisi fossils represent Eurasia’s earliest well-dated hominin fossils, offering insights into early hominin migrations out of Africa. The Dmanisi hominins were initially classified as Homo erectus or as potentially representing a new species, Homo georgicus or else (40, 41). Lordkipanidze et al.’s (42) geometric morphometrics analyses suggested that the variation observed among the Dmanisi skulls may represent a single regional variant of Homo erectus. However, Schwartz et al. (2014) (43) raised concerns about the phylogenetic inferences based on PCA results of the geometric morphometrics analysis, noting the failure of the method to capture visually obvious differences between the Dmanisi crania and specimens commonly subsumed under Homo erectus.”

      Comment 2. I suggest moving all the interpretations from the Results section to the Discussion section. This will enhance the flow of the results and make it easier to follow.

      We tried that, but it made the manuscript less readable. Because our manuscript makes two strong statements, one about the unsuitability of PCA to the field and one about the many other problems in the field, as demonstrated through several test cases, it is better to keep them separate in the Results and Discussions, respectively.

      Comment 3. I recommend conducting an English language edit on the text to address minor inconsistencies.

      We thoroughly edited the text to enhance the language style and consistency. We thank the reviewer for the suggestion.

      Comment 4. Line 21, what do you mean by "ontogenists"?

      Individuals who are versed in or study ontogeny.

      Comment 5. When referring to the remains from Nesher Ramla (Israel), I recommend using "NR fossils". Thus, in line 34, I suggest replacing "Homo Nesher Ramla" by "Nesher Ramla fossils (NR fossils)", also in line 122.

      We replaced "Homo Nesher Ramla" with "Nesher Ramla fossils (NR fossils)" in all instances throughout the manuscript. We thank the reviewer for the suggestion.

      Comment 6. Line 34, I suggest replacing "human" by "hominin".

      (Line 35) We replaced "human" with "hominin".

      “…, such as the case of Homo Nesher Ramla, an archaic hominin with a questionable taxonomy.”

      We thank the reviewer for the suggestion.

      Comment 7. Line 67-68, I suggest clarifying the classification of landmarks using the definition of landmark types (Bookstein, 1991; also see summary by Cooke and Terhune (2015) - Table 1).

      We revised our summary of the classification of landmarks: (Lines 83-94). Our MS now reads:

      “Determining sufficient measurements and data points for a valid morphometric analysis is older than modern geometric morphometrics (19). In geometric morphometrics, landmarks are discrete points on biological structures used to capture shape variation. Bookstein (20) categorised landmarks into three types: Type one, representing the juxtaposition of tissues such as the intersection of two sutures; Type two, denoting maxima of curvature like the deepest point in a depression or the most projecting point on a process; and Type three, which includes extremal points defined by information from other locations on the object, such as the endpoint or centroid of a curve or feature. Originally, Type three landmarks encompassed semi-landmarks, but Weber and Bookstein (21) refined this classification, identifying Type three landmarks as those characterised by information from multiple curves and symmetry, including the intersection of two curves or the intersection of a curve and a suture, and further subdividing them into three subtypes (3a, 3b, 3c) (15). While landmarks provide crucial information about the structure’s overall shape, semi-landmarks capture fine-scale shape variation (e.g., curves or surfaces) that landmarks alone cannot adequately represent. Semi-landmarks are heavily relied upon as the source of shape information to break the continuity of regions in the specimen without clearly identifiable landmarks (22). Semi-landmarks are typically aligned based on their relative positions to landmarks, allowing for the comprehensive analysis of shape changes and deformations within complex structures (2). Unsurprisingly, the use of semi-landmarks is controversial. For instance, Bardua et al. (23) claim that high-density sliding semi-landmark approaches offer advantages compared to landmark-only studies, while Cardini (24) advises caution about potential biases and subsequent inaccuracies in high-density morphometric analyses.”

      We thank the reviewer for the suggestion.

      Comment 8. Line 84, "beneficial over" - I suggest revising.

      (Line 102) We revised the sentence and used “offer advantages” instead.

      “… claim that high-density sliding semi-landmark approaches offer advantages compared to landmark-only studies.”

      We thank the reviewer for the suggestion.

      Comment 9. Line 97, do you mean "therefore"?

      (Line 115) Yes, we replaced "thereby" with "therefore".

      Comment 10. Line 116, I suggest rephrasing as follows: "newly discovered hominin fossils with respect to...".

      (Lines 135, 136) We rephrased it as suggested:

      “is the classification of newly discovered hominin fossils within the human phylogenetic tree”

      We thank the reviewer for the suggestion.

      Comment 11. Line 119, please clarify or explain what you mean by subjective determination of clustering in PCA plots.

      We rephrased (Lines 137, 138) to read:

      "However, which specimens should be included in clusters and which ones should be considered outliers is determined subjectively…"

      We thank the reviewer for the suggestion.

      Comment 12. Lines 146-148: consider revising to clarify the sentence; "than" in line 147 should be "that".

      (Lines 196, 197) We modified the sentence and replaced "than" with "that":

      " … that even the criticism from its pioneers was dismissed"

      We thank the reviewer for the suggestion.

      Comment 13. Line 213: I recommend adding the phylogenetic tree of the Papionini tribe. This would be particularly relevant for the interpretation of the results, e.g., in lines 324-328.

      The reviewer suggested adding a phylogenetic tree of the Papionini tribe to increase the interpretability of our results. We added two trees (Figure 3) based on the molecular phylogeny of extant papionins and the most parsimonious tree generated from the initial Collard and Wood (1).

      We thank the reviewer for the suggestion.

      Comment 14. Lines 244-248: I recommend that the parallels drawn between the results presented in this section and other cases of PCA analysis interpretation (e.g., the NR fossils) are transferred to the Discussion section.

      This would allow a more fluent read of the results.

      Thank you, we considered that but found that it does not improve the readability of the discussion, because this is a very technical issue that would be best understood alongside the specific use case that tests it.

      Comment 15. Line 301: The word "are" should be placed before the word "all".

      (Line 319) We modified accordingly and placed "are" before "all":

      “Rarely are all related taxa represented;”

      We thank the reviewer for the suggestion.

      Comment 16. Line 426: I suggest "omissions" in place of "missingness".

      (Line 435) We replaced "missingness" with "omissions".

      We thank the reviewer for the suggestion.

      Comment 17. Line 440 is part of the caption for Figure 6. Please add a description of what the red arrow indicates in every figure in which it appears.

      Yes, we added a sentence to the captions of Figures 7 and 8:

      “The red arrow in subfigures A, B, and C marks a Lophocebus albigena (pink) sample whose position in PC scatterplots is of interest.”

      We thank the reviewer for the suggestion.

      Comment 18. Line 454: I recommend "partial morphological information" instead of "some form information".

      (Lines 446, 447) We made modifications and replaced "some form information" with "partial morphological information":

      “Newfound samples often comprise incomplete osteological remains or fossils (18, 22) and only present partial morphological information.”

      We thank the reviewer for the suggestion.

      Comment 19. Line 547: I suggest "portion" instead of "fracture".

      (Lines 470, 471) We replaced "fracture" with "portion":

      “Thereby, while the complete skull would cluster with its own taxon…”

      We thank the reviewer for the suggestion.

      Comment 20. Lines 664-665 should read "anatomy and physical anthropology".

      (Lines 600-602) We modified the text accordingly:

      “There are various approaches in morphometrics, but among them, geometric morphometrics has left an indelible mark on biology, especially in anatomy and physical anthropology.”

      We thank the reviewer for the suggestion.

      Comment 21. Lines 684-699: This paragraph seems to belong in the introduction section.

      (Lines 175-190) We modified it and moved it to the introduction.

      “Visual interpretations of the PC scatterplots are not the only role PCA plays in geometric morphometrics. Phylogenetic Principal Component Analysis (Phy-PCA) (44) and Phylogenetically Aligned Component Analysis (PACA) (45) are both used in geometric morphometrics to analyse shape variation while considering the supposed phylogenetic relationships among species. They differ in their approach to aligning landmark configurations and the role of PCA within them. Phy-PCA incorporates phylogenetic information by utilising a phylogenetic tree to model the evolutionary history of the species. This method aims to separate shape variation resulting from shared evolutionary history from other sources of variation. PCA plays a similar role in performing dimensionality reduction on the aligned landmark configurations in Phy-PCA (44). PACA takes a different approach to alignment. It uses a Procrustes superimposition method based on a phylogenetic distance matrix, aligning the landmark configurations according to the evolutionary relationships among species. PCA is then applied to the aligned configurations to extract the principal components of shape variation (45). Both analyses provide insights into the patterns and processes that shape biological form diversity while considering phylogenetic relationships, yet they are also subjected to the limitations and biases inherent in relying on PCA as part of the process.”

      We thank the reviewer for the suggestion.

      Comment 22. Line 717: I suggest "fossils" instead of "hominins".

      (Lines 636, 637) We modified it accordingly and replaced "hominins" with "fossils":

      “…which reflect the restraints faced in morphometric analysis of ancient samples (e.g., fossils).”

      We thank the reviewer for the suggestion.

      Comment 23. Line 728: the word "the" should be deleted; Skhul V should not be italicized, and so do the words "Mount Carmel"; "Neandertals"; "modern humans"; and "Late Paleolithic" in the following lines.

      (Lines 647-651) We made modifications accordingly:

      “For example, Harvati (27), who analysed the Skhul 5 (84), a 40,000-year-old human skull from Mount Carmel (Israel), proposed diverging hypotheses based on favourable PC outcomes (based on PC8 separating it from Neanderthals and modern humans and associating it with the Late Palaeolithic specimen and based on PC12 associating it with modern humans).”

      We thank the reviewer for the suggestion.

      Comment 24. Line 734: the first comma should be deleted.

      (Line 653) We deleted the first comma:

      “(Figures 5-12) show that compared to the benchmark (Figure 4), …”

      We thank the reviewer for the suggestion.

      Reviewer #2:

      Comment 1. I completely agree with the basic thrust of this study. Yes, of course, machine learning is FAR better than any variant of PCA for the paleosciences. I agree with the authors' critique early on that this point is not new per se - it is familiar to most of the founders of the field of GMM, including this reviewer. A crucial aspect is the dependence of ALL of GMM, PCA or otherwise, on the completely unexamined, unformalized praxis by which a landmark configuration is designed in the first place. I must admit that I am stunned by the authors' estimate of over 32K papers that have used PCA with GMM.

      We thank the reviewer for accepting the premise of our study.

      But beating a dead horse is not a good way of designing a motor vehicle. I think the manuscript needs to begin with a higher-level view of the pathology of its target disciplines, paleontology and paleoanthropology, along the lines that David demonstrated for numerical taxonomy some decades ago. That many thousands of bad methodologies require some sort of explanation all of their own in terms of (a) the fears of biologists about advanced mathematics, (b) the need for publications and tenure, (c) the desirability of covers of Nature and Science, and (d) the even greater glory of getting to name a new "species." This cumulative pathology of science results in paleoanthro turning into a branch of the humanities, where no single conclusion is treated as stable beyond the next dig, the next year or so of applied genomics, and the next chemical trace analysis. In short, the field is not cumulative.

      Given the wide popularity of PCA and the attempts to prevent data replication to show its limitations, we do not believe that we are beating a dead horse, but a very live beast that threatens the integrity of the entire field. We accept the second part of the analogy about developing a motor vehicle.

      We also accepted the reviewer’s suggestion and developed the suggested paragraph:

      " A major contribution to the field was made by Sokal and Sneath’s Principles of Numerical Taxonomy (9) book, which challenged traditional taxonomic theory as inherently circular and introduced quantitative methods to address questions of classification (see also review by Sneath (10)). Hull (11) claimed that evolutionary reasoning practiced in taxonomy is not inherently circular but rather unwarranted. He argued that such criticism was based on misunderstandings of the logic of hypothesising, which he attributed to an unrealistic desire for a mistake-proof science. He contended that scientific hypotheses should begin with insufficient evidence and be refined iteratively as new evidence emerges. However, some taxonomists preferred a more rigid, hierarchical approach to avoid the appearance of error. As a result of these and other criticisms, traditional taxonomy declined in favour of cladistics and molecular systematics, which provided more accurate and evolutionarily informed classifications.

      Today, palaeontology and palaeoanthropology grapple with methodological challenges that compromise the stability of their conclusions. These issues stem from various factors, including biologists’ apprehensions towards advanced mathematics, the pressure to publish for career advancement (12), the pursuit of high-profile journal covers, and the prestige associated with naming new species. As a result, these fields often resemble a branch of biology where the latest discoveries or new analytical techniques frequently overturn previous findings. This lack of cumulative knowledge necessitates a more rigorous approach to methodology and interpretation in morphometrics to ensure that conclusions are robust and enduring."

      It is not obvious that the authors' suggestion of supervised machine learning will remedy this situation, since (a) that field itself is undergoing massive changes month by month with the advent of applications AI, and even more relevant (b) the best ML algorithms, those based on deep neural nets, are (literally) unpublishable - we cannot see how their decisions have actually been computed. Instead, to stabilize, the field will need to figure out how to base its inferences on some syntheses of actual empirical theories.

      We appreciate the reviewer’s insightful comments and concerns regarding the use of supervised machine learning in our study. We acknowledge the rapid advancements in the field of machine learning and its significant impact on various domains, including geometric morphometrics. Although we are aware of the ongoing integration of machine learning techniques in geometric morphometrics, our objective was to thoroughly investigate some of the conventional and more frequently used models for comparative analysis.

      Our intention was also to develop a Python module that enables users to easily apply these models to their landmark data. We recognise that most users typically apply machine learning methods to the principal component analysis (PCA) of their landmark data (2), unless PCA fails to explain enough variance (3), as we discussed in the context of Linear Discriminant Analysis (LDA). Our study demonstrates that these machine learning methods can be directly applied after generalised Procrustes analysis (GPA), without necessitating PCA as an intermediary step. This highlights another significant point of our research: the often automatic and potentially unnecessary use of PCA in geometric morphometrics.

      Furthermore, we acknowledge that the availability of more extensive data might have allowed us to explore more complex methods, such as neural networks. However, neural networks require a substantial amount of data due to their numerous learning parameters, which we did not possess in this study. It is also evident that not every algorithm is suitable for every situation. Our findings revealed that simpler models, such as the nearest neighbours classifier, which do not even have a training phase, performed exceptionally well. Additionally, the nearest neighbours classifier offers the desired transparency and interpretability, addressing the reviewer’s concern regarding the opacity of more complex models.
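      The transparency of a nearest neighbours classifier can be illustrated with a minimal sketch. It assumes the landmark configurations have already been aligned by GPA, flattens each configuration into a coordinate vector, and assigns a query specimen the label of its closest training specimen, with no PCA step in between. The function names, the toy coordinates, and the labels "taxon_A" and "taxon_B" are hypothetical and are not MORPHIX's actual interface or data.

```python
import math


def flatten(config):
    """Turn a landmark configuration [(x, y), ...] into one flat feature vector."""
    return [coord for landmark in config for coord in landmark]


def nearest_neighbour_predict(train_configs, train_labels, query_config):
    """Return the label of the training specimen closest to the query (1-NN).

    There is no training phase: the prediction is a direct Euclidean
    comparison against every stored specimen, so the decision can be
    traced back to a single named neighbour.
    """
    query = flatten(query_config)
    best_label, best_dist = None, math.inf
    for config, label in zip(train_configs, train_labels):
        dist = math.dist(flatten(config), query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy, hypothetical configurations of three 2-D landmarks, assumed GPA-aligned.
train_configs = [
    [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)],
    [(0.0, 0.0), (1.0, 0.0), (0.5, 1.1)],
    [(0.0, 0.0), (1.0, 0.0), (0.5, 2.0)],
    [(0.0, 0.0), (1.0, 0.0), (0.5, 2.1)],
]
train_labels = ["taxon_A", "taxon_A", "taxon_B", "taxon_B"]

query = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.9)]
predicted = nearest_neighbour_predict(train_configs, train_labels, query)
print(predicted)
```

      In practice a library implementation with k > 1 neighbours and cross-validation would be preferable; the sketch only shows why the decision rule is easy to inspect.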

      We hope this clarifies our approach and objectives, and we sincerely thank the reviewer for their valuable feedback, which has helped us refine our study and its presentation.

      It's not that this reviewer is cynical, but it is fair to suggest a revision conveying a concern for the truly striking lack of organized skepticism in the literature that is being critiqued here. A revision along those lines would serve as a flagship example of exactly the deeper argument that reference (17) was trying to seed, that the applied literature obviously needs a hundred times more of. Such a review would do the most good if it appeared in one of the same journals - AJBA, Evolution, Journal of Human Evolution, Paleobiology - where the bulk of the most highly cited misuses of PCA themselves have appeared.

      First, we do not believe that this reviewer is cynical, and we hope they will not consider us cynical if we point out that the field has thus far largely ignored previous reports of PCA misuses published in those journals, like the excellent Bookstein 2019 (4) paper, so perhaps a different approach is needed with a different journal.

      Second, our MS is not a review. We agree with the reviewer that a review of PCA critical papers is of value. We changed the title of our study to make it easier to find, and we thank the reviewer for the comment. 

      Reviewer #3:

      Comment 1. Mohseni and Elhaik challenge the widespread use of PCA as an analytical and interpretive tool in the study of geometric morphometrics. The standard approach in geometric morphometrics analysis involves Generalised Procrustes Analysis (GPA) followed by Principal Component Analysis (PCA). Recent research challenges PCA outcomes' accuracy, robustness, and reproducibility in morphometrics analysis. In this paper, the authors demonstrate that PCA is unreliable for such studies. Additionally, they test and compare several Machine-Learning methods and present MORPHIX, a Python package of their making that incorporates the tools necessary to perform morphometrics analysis using ML methods.

      Mohseni and Elhaik conducted a set of thorough investigations to test PCA's accuracy, robustness, and reproducibility following renewed recent criticism and publications where this method was abused. Using a set of 2 and 3D morphometric benchmark data, the authors performed a traditional analysis using GPA and PCA, followed by a reanalysis of the data using alternative classifiers and rigorous testing of the different outcomes.

      In the current paper, the authors evaluated eight ML methods and compared their classification accuracy to traditional PCA. Additionally, common occurrences in the attempted morphological classification of specimens, such as non-representative partial sampling, missing specimens, and missing landmarks, were simulated, and the performance of PCA vs ML methods was evaluated.

      This is a correct description of our MS.

      The main problem with this manuscript is that it is three papers rolled into one, and the link doesn't work.

      We agree that the manuscript is comprehensive and can probably be broken down into more than one manuscript. However, we do not adhere to the philosophies of the least publishable unit (LPU), the smallest publishable unit (SPU), or the minimum publishable unit (MPU). Instead, we believe in producing high-quality and encompassing studies.

      We checked the link thoroughly and ensured it is functional. Thank you for your comment.

      The title promises a new Python package, but the actual text of the manuscript spends relatively little time on the Python package itself and barely gives any information about the package and what it includes or its usefulness. It is definitely not the focus of the manuscript. The main thrust of the manuscript, which takes up most of the text, is the analysis of the papionin dataset, which shows very convincingly that PCA underperforms in virtually all conditions tested.

      We agree. We revised the title to reflect the main issue of the paper. Thank you for your comment.

      In addition, the manuscript includes a rather vicious attack against two specific cases of misuse of PCA in paleoanthropological studies, which does not connect with the rest of the manuscript at all.

      We consider these case studies of the use of PCA, which resonate with our ultimate goal. First, the previous reviewer suggested that we are beating a “dead horse.” We provide very recent and high-profile test cases to support our position that PCA is a popular and widely used method. Second, we wish to show how researchers use data alterations to cherry-pick results. Third, we focus on one of the use cases (the Homo NS) to demonstrate the poor scientific practices prevalent in this field, such as refusing to share data and breaking Science’s policies to protect this act.

      If the manuscript is a criticism of PCA techniques, this should be reflected in the title. If it is a report of a new Python package, it should focus on the package. Otherwise, there should be two separate manuscripts here.

      It is a criticism of PCA, and it is now reflected in the title; thank you again.

      The criticism of PCA is valid and important. However, pointing out that it is problematic in specific cases and is sometimes misused does not justify labeling tens of thousands of papers as questionable and does not justify vilifying an entire discipline. The authors do not make a convincing enough case that their criticism of the use of PCA in analyzing primate or hominin skulls is relevant to all its myriad uses in morphometrics. The criticism is largely based on statistical power, but it is framed as though it is a criticism of geometric morphometrics in general.

      We appreciate the opportunity to address the concerns raised regarding our critique of PCA. The reviewer argues that because we analyzed only primate skulls, we cannot extrapolate that PCA will be biased in analyzing other data (other taxa or other usages). Using the same logic, we can also argue that PCA cannot be used to study NEW taxa and certainly not to detect NOVEL taxa because it was never shown to apply to these taxa. We can further argue that PCA cannot be used to study ANY taxa since it was never shown to yield correct results (PCA results are justified through circular reasoning and are adjusted when they do not show the desired results). However, that part of our answer is not a defense of our method but rather a further criticism of the field.

      To answer the question more directly, our criticism of PCA is rooted in empirical evidence and robust research, including studies by Elhaik (5) and others (6, 7), demonstrating that PCA lacks the power to produce accurate and reliable results. If the reviewer believes that using cats instead of primates will somehow boost the accuracy of PCA, they should, at the very least, explain what morphological properties of cats justify this presumption. Concerning the case of other usages, we clearly noted that “the scope of our study was limited to PCA usage in geometric morphology.” The reviewer did not explain why our analysis is not “convincing enough,” so we cannot address it.

      As you know, this issue extends beyond the specific case study of primate or hominin skulls in our research. PCA is widely used and heavily relied upon in the field, often without sufficient scrutiny of its limitations. Our intention is not to vilify an entire discipline but to highlight the pervasive and sometimes unquestioning reliance on PCA across many studies in geometric morphometrics. Calling for a reevaluation of studies based on a problematic method is not vilification; it is, by definition, science.

      While we understand the concern about the generalisability of our findings, our critique is based on the inherent limitations of PCA itself, not merely on statistical power. PCA lacks measurable power, a test of significance, and a null model. Its outcomes are highly sensitive to the input data, making them susceptible to manipulation and interpretation. Moreover, the ability to evaluate various dimensions allows for cherry-picking of results, where different outcomes can be equally acceptable, thus undermining the robustness of conclusions drawn from PCA.

      We invite the reviewer to examine the mathematical basis of PCA as demonstrated in Figure 1 of Elhaik (2022) (https://www.nature.com/articles/s41598-022-14395-4/figures/1). We ask the reviewer to explain what in this straightforward calculation (calculating the mean of the dimensions, subtracting the mean from the dimensions, calculating the covariance matrix, and identifying the eigenvalues) convinces them that PCA is suitable for predicting evolutionary relationships between samples. What evidence supports the notion that evolutionary relationships can be inferred by merely subtracting the mean of a matrix? There is none, just as there is no statistical power in this method. PCA does not know what the data mean. It can be applied equally to horse race data and a dataset that records how many times Homer Simpson says his catchphrases. PCA is not an evolutionary method; it’s just a linear transformation. If we ask anyone why they trust it, eventually, we will get the answer that with enough tweaking, PCA results produce what the scientist wants to show, and, most importantly, it will be mathematically accurate (and as mathematically accurate as the result of all possible tweaks). There is nothing specific to hominins about it. If your method produces conflicting results by tweaking the number of samples, species, or landmarks, as we showed, your method is worthless. This is what we demonstrated.
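      The four steps named above can be written out in a few lines. The following is a minimal sketch for two-dimensional data only, using the closed-form eigen-decomposition of a symmetric 2x2 covariance matrix; the function name `pca_2d` and the toy points are illustrative assumptions and are not taken from Elhaik (2022) or from MORPHIX. The function receives coordinates only: no group labels, taxonomy, or phylogeny enter the calculation.

```python
import math


def pca_2d(points):
    """PCA for 2-D points: mean, centring, covariance, eigen-decomposition."""
    n = len(points)
    # Step 1: mean of each dimension.
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Step 2: subtract the mean from each dimension.
    centred = [(p[0] - mean_x, p[1] - mean_y) for p in points]
    # Step 3: sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)
    # Step 4: eigenvalues of the symmetric 2x2 matrix (closed form).
    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + spread, half_trace - spread)
    # Eigenvectors: the major axis lies at angle theta to the x-axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))
    # PC scores are a linear transformation (rotation) of the centred data.
    scores = [
        (x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1]) for x, y in centred
    ]
    return eigenvalues, (pc1, pc2), scores


# Toy points lying on a straight line, so all variance falls on PC1.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
eigenvalues, axes, scores = pca_2d(points)
explained = eigenvalues[0] / sum(eigenvalues)
print(eigenvalues, explained)
```

      For real landmark data, which have many more than two dimensions, a numerical routine from a linear algebra library would replace the closed-form step; the sequence of operations is the same.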

      We would also like to note that if we had easier access to more data, we would have extended our analysis further and shown that the bias exists in other species. As explained in our manuscript, we reached out to several scientists who refused to share their data so that we would not show biases in their studies. As this reviewer is undoubtedly aware of the practices in the field, this criticism is extremely unfair.

      Finally, arguing that our MS dismisses the entire field of geometric morphometrics is also unfair and provocative. We made no such claim. On the contrary, we offer an unbiased method to replace PCA and improve the accuracy of studies in this field.

      We hope this clarifies our position and reinforces the validity of our critique. Thank you for your valuable feedback and for allowing us to address these important points.

      Comment 2a. The article's tone is very argumentative and provocative, and non-necessary superlatives and modifiers are used ("...colourful scatterplots", lines 101, 155, 672). While this is an excellent paper and should be studied by morphometrics experts and probably anyone using PCA, the overall tone does nothing to help. It reads somewhat like a Facebook rant rather than a scientific paper (there is still, we hope, a difference between the two). Please tone it down.

      Again, we thank the reviewer for considering our work excellent. We regret that the reviewer believes that describing colorful (#101) scatterplots as such is a provocation. We do not feel the same way. “Subsumed” (#155) has been suggested to us by an anonymous reviewer. We changed it to “classified” to satisfy the reviewer (“However, Schwartz et al. (2014) raised concerns about the phylogenetic inferences based on PCA results of the geometric morphometrics analysis, noting the failure of the method to capture visually obvious differences between the Dmanisi crania and specimens commonly classified under Homo erectus.”). We do not understand the problem with #672, but we revised it to read “However, a growing body of literature criticises the accuracy of various PCA applications, raising concerns about its use in geometric morphometrics.” We hope that this satisfies the reviewer. We made no special effort to be argumentative or provocative. There is no need for that; our results speak for themselves. We did, however, make an effort to communicate the gravity of our findings by citing K. Popper. We do not consider this a provocation.

      Comment 2b. The acronym ML is normally used to denote Maximum Likelihood in the context of phylogenetic studies. The authors use it to denote Machine Learning, which many readers may find confusing (this reviewer took a while to realize that it was not referring to Maximum Likelihood). Perhaps leave "machine learning" written in full.

      We understand that in some contexts, "ML" typically denotes Maximum Likelihood, which can indeed cause confusion. Unfortunately, “ML” is also a well-established acronym for machine learning, and since our paper doesn’t deal with Maximum Likelihood but rather machine learning, we have to choose the latter. Initially, we did spell out "Machine Learning" in full to avoid this confusion. However, upon review, we found that the manuscript's readability and flow were compromised, leading us to revert to the acronym.

      We appreciate your suggestion and understand the importance of clarity. To address this, we will ensure that the first mention of "ML" is accompanied by "Machine Learning" written in full (Line 244). This should help maintain both clarity and readability. Thank you for your valuable input.

      Comment 3. In lines 142, 157 Rohlf's should be Rohlf.

      (Lines 191, 205) We modified it accordingly and replaced "Rohlf's" with "Rohlf".

      Comment 4. The short paragraph in lines 165-167 feels out of place and does not connect to the paragraphs before and after it.

      (Lines 210-223) We modified the introduction and merged that paragraph with a relevant paragraph. The new paragraph reads:

      “PCA’s prominent role in morphometrics analyses and, more generally, physical anthropology is inconsistent with the recent criticisms, raising concerns regarding its validity and, consequently, the value of the results reported in the literature. To assess PCA’s accuracy, robustness, and reproducibility in geometric morphometric analysis, particularly its potential biases and inconsistencies in clustering with species taxonomy for phylogenetic reconstruction, we utilised a benchmark database containing landmarks from six known species within the Old World monkeys tribe Papionini. We altered this dataset to simulate typical characteristics of paleontological data. We found that PCA’s outcomes lack reliability, robustness, and reproducibility. We also evaluated the argument that a high explained variance could be counted as a measure of reliability (2) and found no association between high explained variance amounts and the subjectiveness of the results. If PCA of morphometric landmark data produces biased results, then landmark-based geometric morphometric studies employing PCA, conservatively estimated to range from 18,400 to 35,200 (as of July 2024) (see Methods), should be reevaluated.”

      We thank the reviewer for the suggestion.

      References

      (1) Gilbert CC, Rossie JB. Congruence of molecules and morphology using a narrow allometric approach. Proceedings of the National Academy of Sciences. 2007;104(29):11910-11914.

      (2) Courtenay LA, Yravedra J, Huguet R, Aramendi J, Maté-González MÁ, González-Aguilera D, et al. Combining machine learning algorithms and geometric morphometrics: a study of carnivore tooth marks. Palaeogeography, Palaeoclimatology, Palaeoecology. 2019;522:28-39.

      (3) Bellin N, Calzolari M, Callegari E, Bonilauri P, Grisendi A, Dottori M, et al. Geometric morphometrics and machine learning as tools for the identification of sibling mosquito species of the Maculipennis complex (Anopheles). Infection, Genetics and Evolution. 2021;95:105034.

      (4) Bookstein FL. Pathologies of between-groups principal components analysis in geometric morphometrics. Evolutionary Biology. 2019;46(4):271-302.

      (5) Elhaik E. Principal Component Analyses (PCA)-based findings in population genetic studies are highly biased and must be reevaluated. Scientific Reports. 2022;12(1):1-35.

      (6) Cardini A, Polly PD. Cross-validated between group PCA scatterplots: a solution to spurious group separation? Evolutionary Biology. 2020;47(1):85-95.

      (7) Berner D. Size correction in biology: how reliable are approaches based on (common) principal component analysis? Oecologia. 2011;166(4):961-971.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Kainov et al investigated the prevalence of mutations in 3'UTR that affect gene expression in cancer to identify noncoding cancer drivers.

      The authors used data from normal controls (1000 genome data) and compared it to cancer data (PCAWG). They found that in cancer 3'UTR mutations had a stronger effect on cleavage than the normal population. These mutations are negatively selected in the normal population and positively selected in cancers. The authors used PCAWG data set to identify such mutations and found that the mutations that lead to a reduction of gene expression are enriched in tumor suppressor genes and those that are increased in gene expression are enriched for oncogenes. 3'UTR mutations that reduce gene expression or occur in TSGs cooccur with non-synonymous mutations. The authors then validate the effect of 3'UTR mutations experimentally using a luciferase reporter assay. These data identify a novel class of noncoding driver genes with mutations in 3'UTR that impact polyadenylation and thus gene expression.

      This is an elegant study with fundamental insight into identifying cancer driver genes. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis need to be extended.

      We thank the reviewer for the positive assessment of our work and constructive comments.

      (1) It would be important for the authors to show if the findings of this study hold for metastatic cancers since most deaths occur due to metastasis and tumor heterogeneity changes when cancer progresses to metastasis. The authors should use the Hartwig data and show if metastatic cancers are enriched for 3'UTR mutations.

      This is a good suggestion, but we believe that the proposed analysis would have a significantly stronger impact in the context of a separate study focused specifically on longitudinal changes in the somatic mutation landscape as cancer progresses from primary tumours to metastases. Conducting such a study would require obtaining permissions to use relevant controlled datasets and, ideally, collaborating with oncologists to generate additional genome and transcriptome sequencing data. As such, this level of analysis would go beyond the current scope of our work.

      (2) Figure 2 should show the distribution of 3'UTR mutations by cancer type especially since authors go on to use colorectal cancer only for validations. It would be helpful to bring Figures S3A and S3C to this panel since these findings make the connections to cancer biology. Are any molecular functions enriched in addition to biological processes? Are kinases, phosphatases, etc more or less affected by 3'UTR mutations?

      As suggested, we have added a pie chart showing the distribution of 3’UTR mutations by cancer type (new Fig. 2E). Notably, nearly half of the mutations in our dataset were of colorectal adenocarcinoma origin, justifying the focus on this type of cancer in our subsequent validation analyses.

      To strengthen the connections to cancer biology, we moved Fig. S3A and S3C to the main text. It was more logical to integrate these panels into Fig. 3 rather than Fig. 2. We also analysed molecular function enrichment in Fig. 3E. Consistent with the biological process enrichment (now shown in Fig. 3D), this revealed an enrichment of proteins interacting with the ubiquitination pathway, including tumour suppressors SMAD2, APC and AXIN1.

      (3) Figure 3 looks at the co-occurrence of 3'UTR mutations with non-synonymous mutations but what about copy number change? You would expect the loss of the other allele to be enriched. Along the same line, are these data phased? Do you know that the nonsynonymous mutations are in the other allele or in the same allele that shows 3'UTR mutation?

      As suggested, we have analysed copy number variation data. As mentioned in the revised Results, this "showed that increased copy number was 4.1-times more common in the PCAWG data compared to allele loss. However, the incidence of copy number increase was substantially lower in the DOWN-paSNV group compared to the BG-paSNV control (Fig. S6). This points to a negative selection against duplications of genes affected by DOWN-paSNVs in cancer".

      Phasing somatic mutations in cancer samples is challenging due to high genetic heterogeneity of tumour cells. This situation will likely improve in the near future with the increased use of long-read sequencing. However, with currently available data, there is no straightforward method to determine whether mutations co-occur in the same cell. We have added a note on this in the Discussion section: "As long-read genomic sequencing data become increasingly available, it will be interesting to investigate whether these additional mutations occur in the same or in a different allele compared to the DOWN-paSNVs".

      Reviewer #2 (Public Review):

      Summary:

      To evaluate whether somatic mutations in cancer genomes are enriched with mutations in polyadenylation signal regions, the authors analyzed 1000 genomes data and PCAWG data as a control and experimental set, respectively. They observed increased enrichment of somatic mutations that may affect the function of polyA signals and confirmed that these mutations may influence the expression of the gene through a minigene expression experiment.

      Strengths:

      This study provides a systematic evaluation of polyA signal, which makes it valuable. Overall, the analytic approach and results are solid and supported by experimental validation.

      Thank you.

      Weaknesses:

      (1) This study uses APARENT2 as a tool to evaluate functional alteration in polyA signal sequences. Based on the original paper and the results shown in this paper, the algorithm appears to be of high quality. However, the whole study is dependent on the output of APARENT2. Therefore, it would be nice to

      (a) run and show a positive control run, which can show that the algorithm works well, and (b) describe the rationale for selecting this algorithm in the main text.

      As suggested, we have added control analyses to Fig. S1A-B, which show that APARENT2 performs well in our hands. We have described the rationale for using APARENT2 in the Results as follows: "For each paSNV, we calculated the change in cleavage/polyadenylation efficiency using the APARENT2 neural network model, which has been shown to infer this statistic more accurately than earlier approaches [Ref23]".

      (2) Are there recurrent somatic mutation calls (= exactly the same mutation across different tumor samples) in the poly(A) region of certain genes?

      We indeed see several cases where the same cleavage/polyadenylation signal is affected by the same or different DOWN mutations in different cancer samples. This finding is now summarized in the Results section and Table S1 as follows: "In several cases, including LRP1B and FOXO1, which are known to act as tumour suppressors in certain cancers, the same cleavage/polyadenylation signal was disrupted by the same or different mutations in more than one sample (see columns Mut_Recurrence and Signal_Recurrence in Table S1)".

      (3) The authors nicely showed that the minigene with A>G mutation altered gene expression. Maybe one can reach a similar conclusion by analyzing a cancer dataset that has mutation and gene expression data? That is, genes with or without polyA mutations show different expression levels.

      The data presented in Fig. 5A-B show that DOWN-paSNV mutations have a negative effect on the expression of endogenous tumour suppressor genes.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Figures should be numbered in order. For example, Figure S3C is referred to in the text before S3A-B, etc.

      We have proofread the text to fix this problem.

      Adding a supplementary file with lists of genes carrying 3'UTR mutations split by effect on gene expression and cancer type would be very useful for the community.

      We now show this in Table S1, with the caveat that we could not consistently investigate the effect of DOWN-paSNV on gene expression since the transcriptomics data are not available for all cancers.

      Spelling mistake in Figure 1A - genone should be genome.

      Fixed - thank you.

      Typo in Figure 1B x-axis label +50nt should be -50nt to the left of the dashed line.

      Fixed - thank you.

      All figures use E to denote x10 but it would make the figures more readable if authors used the standard notation (x10) for all numbers with exponents and base 10.

      Done.

    1. Author Response:

      Reviewer #1 (Public review):

      Summary:

      It is well known that autophagosomes/autolysosomes move along microtubules. However, because previous studies did not distinguish between autophagosomes and autolysosomes, it remains unknown whether autophagosomes begin to move after fusion with lysosomes or even before fusion. In this manuscript, the authors show, using fusion-deficient cells, that both pre-fusion autophagosomes and lysosomes can move along the MT toward the minus end. By screening motor proteins and Rabs, the authors found that autophagosomal traffic is primarily regulated by the dynein-dynactin system and can be counter-regulated by kinesins. They also show that Rab7-Epg5 and Rab39-ema interactions are important for autophagosome trafficking.

      Strengths:

      This study uses reliable Drosophila genetics and high-quality fluorescence microscopy. The data are properly quantified and statistically analyzed. It is a reasonable hypothesis that gathering pre-fusion autophagosomes and lysosomes in close proximity improves fusion efficiency.

      Thank you for your positive comments and for acknowledging the strengths of our work.

      Weaknesses:

      (1) To distinguish autophagosomes from autolysosomes, the authors used vps16 RNAi cells, which are supposed to be fusion deficient. However, the extent to which fusion is actually inhibited by knockdown of Vps16A is not shown. The co-localization rate of Atg8 and Lamp1 should be shown (as in Figure 8). Then, after identifying pre-fusion autophagosomes and lysosomes, the localization of each should be analyzed.

      Thank you for this comment. We plan to perform an immunohistochemistry experiment on Vps16A KD fat body cells for mCherry and Lamp1, as in the other panels of Figure 8. We will also analyse the distribution of each.

      It is also possible that autophagosomes and lysosomes are tethered by factors other than HOPS (even if they are not fused). If this is the case, autophagosomal trafficking would be affected by the movement of lysosomes.

      We cannot exclude the possibility that autophagosomes are transported indirectly by being tethered to lysosomes. However, we find this unlikely to be the case, as we believe that in fat cells lysosomes and autophagosomes will rapidly fuse with each other if they get close enough.

      (2) The authors analyze autolysosomes in Figures 6 and 7. This is based on the assumption that autophagosome-lysosome fusion takes place in cells without vps16A RNAi. However, even in the presence of Vps16A, both pre-fusion autophagosomes and autolysosomes should exist. This is also true in Figure 8H, where the fusion of autophagosomes and lysosomes is partially suppressed in knockdown cells of dynein, dynactin, Rab7, and Epg5. If the effect of fusion is to be examined, it is reasonable to distinguish between autophagosomes and autolysosomes and analyze only autolysosomes.

      Thank you for your careful insights. The mCherry-Atg8a reporter we use is highly stable in autolysosomes due to the resilience of the mCherry fluorophore within these acidic, post-fusion structures, making it useful for labelling both autophagosomes and autolysosomes. Notably, the high intensity of mCherry-Atg8a within autolysosomes allows us to distinguish them from pre-fusion autophagosomes, which appear fainter and smaller, especially when accumulated in fusion-defective backgrounds (as shown in Figure 4). We therefore regard larger, brighter structures as autolysosomes.

      To improve clarity, we included additional markers, endogenous Lamp1 staining (Figure 8) and Lamp1-GFP (Figure S9), to help differentiate between autophagic structures. Lamp1-negative, mCherry-Atg8a-positive vesicles indicate pre-fusion autophagosomes, while Lamp1/mCherry-Atg8a double-positive vesicles represent autolysosomes. Additionally, Lamp1-positive, mCherry-Atg8a-negative vesicles mark lysosomes of non-autophagic origin. We appreciate your suggestion.

      (3) In this study, only vps16a RNAi cells were used to inhibit autophagosome-lysosome fusion. However, since HOPS has many roles besides autophagosome-lysosome fusion, it would be better to confirm the conclusion by knockdown of other factors (e.g., Stx17 RNAi).

      Thank you for this suggestion. We will generate additional Drosophila lines similar to those used in our current study, substituting Syntaxin17, SNAP29 or Vamp7 RNAi for Vps16A RNAi. We will test key phenotypic hits with these new backgrounds to confirm our findings.

      (4) Figure 8: Rab7 and Epg5 are also known to be directly involved in autophagosome-lysosome tethering/fusion. Even if the fusion rate is reduced in the absence of Rab7 and Epg5, it may not be the result of defective autophagosome movement, but may simply indicate that these molecules are required for fusion itself. How do the authors distinguish between the two possibilities?

      Thank you for this comment. While we agree that Rab7 and Epg5 are involved in autophagosome-lysosome tethering and subsequent fusion, we believe they also play an additional role in autophagosome movement. Our hypothesis stems from the observation that the phenotypes of vps16 RNAi and rab7 or epg5 RNAi are not identical. In contrast, RNAi targeting SNARE proteins involved exclusively in fusion (Syx17, SNAP29, and Vamp7) all result in a consistent phenotype: autophagosomes accumulate around the nucleus, closely resembling the phenotype observed with vps16 depletion. This suggests that these SNAREs are specifically involved in fusion. Since Rab7 or Epg5 depletion scatters autophagosomes throughout the cytosol instead of letting them accumulate around the nucleus, we hypothesize that this is due to impaired movement of autophagosomes. This hypothesis is further supported by our co-IP data showing that Epg5 binds to dyneins.

      Reviewer #2 (Public review):

      Summary:

      This manuscript by Boda et al. describes the results of a targeted RNAi screen in the background of Vps16A-depleted Drosophila larval fat body cells. In this background, lysosomal fusion is inhibited, allowing the authors to analyze the motility and localization specifically of autophagosomes, prior to their fusion with lysosomes to become autolysosomes. In this Vps16A-depleted background, mCherry-Atg8a-labeled autophagosomes accumulate in the perinuclear area, through an unknown mechanism.

      The authors found that the depletion of multiple subunits of the dynein/dynactin complex caused an alteration of this mCherry-Atg8a localization, moving from the perinuclear region to the cell periphery. Interactions with kinesin overexpression suggest these motor proteins may compete for autophagosome binding and transport. The authors extended these findings by examining potential upstream regulators including Rab proteins and selected effectors, and they also examined effects on lysosomal movement and autolysosome size. Altogether, the results are consistent with a model in which specific Rab/effector complexes direct the movement of lysosomes and autophagosomes toward the MTOC, promoting their fusion and subsequent dispersal throughout the cell.

      Strengths:

      Although previous studies of the movement of autophagic vesicles have identified roles for microtubule-based transport, this study moves the field forward by distinguishing between effects on pre- and post-fusion autophagosomes, and by its characterization of the roles of specific Dynein, Dynactin, and Rab complexes in regulating movement of distinct vesicle types. Overall, the experiments are well-controlled, appropriately analyzed, and largely support the authors' conclusions.

      Thank you for your positive comments and for acknowledging the strengths of our work.

      Weaknesses:

      One limitation of the study is the genetic background that serves as the basis for the screening. In addition to preventing autophagosome-lysosome fusion, disruption of Vps16A has been shown to inhibit endosomal maturation and block the trafficking of components to the lysosome from both the endosome and Golgi apparatus. Additional effects previously reported by the authors include increased autophagosome production and reduced mTOR signaling. Thus Vps16A-depleted cells have a number of endosome, lysosome, and autophagosome-related defects, with unknown downstream consequences. Additionally, the cause and significance of the perinuclear localization of autophagosomes in this background is unclear. Thus, interpretations of the observed reversal of this phenotype are difficult, and have the caveat that they may apply only to this condition, rather than to normal autophagosomes. Additional experiments to observe autophagosome movement or positioning in a more normal environment would improve the manuscript.

      Thank you for highlighting this limitation. We plan to conduct time-lapse imaging of live fat body tissues expressing 3xmCherry-Atg8a and GFP-Lamp1 to visualize the movement and fusion events of pre-fusion autophagosomes (3xmCherry-Atg8a positive and GFP-Lamp1 negative) and lysosomes (GFP-Lamp1 positive). We expect these vesicles to exhibit movement toward the ncMTOC, providing insight into their behaviour under more typical conditions.

      Specific comments

      (1) Several genes have been described that when depleted lead to perinuclear accumulation of Atg8-labeled vesicles. There seems to be a correlation of this phenotype with genes required for autophagosome-lysosome fusion; however, some genes required for lysosomal fusion such as Rab2 and Arl8 apparently did not affect autophagosome positioning as reported here. Thus, it is unclear whether the perinuclear positioning of autophagosomes is truly a general response to disruption of autophagosome-lysosome fusion, or may reflect additional aspects of Vps16A/HOPS function. A few things here would help. One would be an analysis of Atg8a vesicle localization in response to the depletion of a larger set of fusion-related genes. Another would be to repeat some of the key findings of this study (effects of specific dynein, dynactin, rabs, effectors) on Atg8a localization when Syx17 is depleted, rather than Vps16A. This should generate a more autophagosome-specific fusion defect.

      Thank you for this suggestion. We will generate additional Drosophila lines similar to those used in our current study, substituting Syntaxin17, SNAP29, and Vamp7 RNAi for Vps16A RNAi. We will test key phenotypic hits with these new backgrounds to confirm our findings.

      Third, it would greatly strengthen the findings to monitor pre-fusion autophagosome localization without disrupting fusion. Such vesicles could be identified as Atg8a-positive Lamp-negative structures. The effects of dynein and rab depletion on the tracking of these structures in a post-induction time course would serve as an important validation of the authors' findings.

      Thank you for this helpful suggestion. We plan to conduct time-lapse experiments under various conditions (e.g., non-starved and starved at different durations) to monitor the motility of newly formed autophagosomes (3xmCherry-Atg8a positive, Lamp1 negative), allowing us to analyze their positioning dynamics without interference from fusion defects.

      (2) The authors nicely show that depletion of Shot leads to relocalization of Atg8a to ectopic foci in Vps16A-depleted cells; they should confirm that this is a mislocalized ncMTOC by co-labeling Atg8a with an MTOC component such as MSP300. The effect of Shot depletion on Atg8a localization should also be analyzed in the absence of Vps16A depletion.

      Thank you for this positive comment. To confirm the presence of ectopic MTOC foci in Shot KD cells, we plan to co-label with MTOC markers, including Khc-nod-LacZ, and additional reporters like Msps-mCherry, in both Vps16A-depleted and normal backgrounds.

      (3) The authors report that depletion of Dynein subunits, either alone (Figure 6) or co-depleted with Vps16A (Figure 2), leads to redistribution of mCherry-Atg8a punctae to the "cell periphery". However, only cell clones that contact an edge of the fat body tissue are shown in these figures. Furthermore, in these cells, mCherry-Atg8a punctae appear to localize only to contact-free regions of these cells, and not to internal regions of clones that share a border with adjacent cells. Thus, these vesicles would seem to be redistributed to the periphery of the fat body itself, not to the periphery of individual cells. Microtubules emanating from the perinuclear ncMTOC have been described as having a radial organization, and thus it is unclear that this redistribution of mCherry-Atg8a punctae to the fat body edge would reflect a kinesin-dependent process as suggested by the authors.

      Thank you for this detailed observation. Indeed, we frequently observe autophagosomes redistributing to contact-free peripheral regions upon dynein depletion, resulting in an asymmetric distribution. We believe this redistribution to be kinesin-dependent, as shown in Figure 3: kinesin overexpression scatters or shifts autophagosomes to the periphery, while kinesin/dynein double knockdown causes widespread autophagosome scattering. The simplest explanation is that, in dynein's absence, kinesins drive autophagosome movement.

      Additionally, while the radial organization of the microtubule (MT) network has been documented in two independent studies that we referenced, neither study specifically showed the MT plus-ends, towards which kinesins transport their cargo. It is plausible that, while the MT network appears radial and symmetrical, subtle asymmetry might influence kinesin-dependent transport in fat cells. To explore this further, we will express MT plus-end markers, such as EB1-RFP and EB1-GFP, as well as kinesin reporters like unc-104-GFP or HA-tagged kinesins.

      (4) To validate whether the mCherry-Atg8a structures in Vps16A-depleted cells were of autophagic origin, the authors depleted Atg8a and observed a loss of mCherry-Atg8a signal from the mosaic cells (Figure S1D, J). A more rigorous experiment would be to deplete other Atg genes (not Atg8a) and examine whether these structures persist.

      Thank you for the suggestion to further validate our reporter. We will knock down additional Atg genes, including Atg14, Atg1, Atg6, and Vps34, to confirm that the mCherry-Atg8a-positive structures in the Vps16A RNAi background are indeed of autophagic origin.

      (5) The authors found that only a subset of dynein, dynactin, rab, and rab effector depletions affected mCherry-Atg8a localization, leading to their suggestion that the most important factors involved in autophagosome motility have been identified here. However, this conclusion has the caveat that depletion efficiency was not examined in this study, and thus any conclusions about negative results should be more conservative.

      Thank you for this constructive feedback. We agree and will adjust our conclusions based on the negative results in the revised manuscript to account for the potential variability in depletion efficiency.

      Reviewer #3 (Public review):

      Summary:

      In multicellular organisms, autophagosomes are formed throughout the cytosol, while late endosomes/lysosomes are relatively confined in the perinuclear region. It is known that autophagosomes gain access to the lysosome-enriched region by microtubule-based trafficking. The mechanism by which autophagosomes move along microtubules remains incompletely understood. In this manuscript, Péter Lőrincz and colleagues investigated the mechanism driving the movement of nascent autophagosomes along the microtubule towards the non-centrosomal microtubule organizing center (ncMTOC) using the fly fat body as a model system. The authors took an approach whereby they examined autophagosome positioning in cells where autophagosome-lysosome fusion was inhibited by knocking down the HOPS subunit Vps16A. Despite being generated at random positions in the cytosol, autophagosomes accumulate around the nucleus when Vps16A is depleted. They then performed an RNA interference screen to identify the factors involved in autophagosome positioning. They found that the dynein-dynactin complex is required for the trafficking of autophagosomes toward ncMTOC. Dynein loss leads to the peripheral relocation of autophagosomes. They further revealed that a pair of small GTPases and their effectors, Rab7-Epg5 and Rab39-ema, are required for bidirectional autophagosome transport. Knockdown of these factors in Vps16a RNAi cells causes the scattering of autophagosomes throughout the cytosol.

      Strengths:

      The data presented in this study help us to understand the mechanism underlying the trafficking and positioning of autophagosomes.

      Thank you for your positive comment and for acknowledging the strengths of our work.

      Major concerns:

      (1) The localization of EPG5 should be determined. The authors showed that EPG5 colocalizes with endogenous Rab7. Rab7 labels late endosomes and lysosomes. Previous studies in mammalian cells have shown that EPG5 is targeted to late endosomes/lysosomes by interacting with Rab7. EPG5 promotes the fusion of autophagosomes with late endosomes/lysosomes by directly recognizing LC3 on autophagosomes and also by facilitating the assembly of the SNARE complex for fusion. In Figure 5I, the EPG5/Rab7-colocalized vesicles are large and they are likely to be lysosomes/autolysosomes.

      Thank you for suggesting an improvement to our Epg5 localization data. We plan to perform triple-staining experiments with autophagy and lysosome markers, such as Atg8a and Lamp1, together with Epg5-9xHA to provide a clearer context for Epg5 localization.

      (2) The experiments were performed in Vps16A RNAi KD cells. Vps16A knockdown blocks fusion of vesicles derived from the endolysosomal compartments such as fusion between lysosomes. The pleiotropic effect of Vps16A RNAi may complicate the interpretation. The authors need to verify their findings in Stx17 KO cells, as it has a relatively specific effect on the fusion of autophagosomes with late endosomes/lysosomes.

      Thank you for this valuable suggestion. We will create similar Drosophila lines as used in our study but will now employ Syntaxin17, SNAP29, or Vamp7 RNAi. We will cross our most significant hits with these new lines to confirm our findings.

      (3) Quantification should be performed in many places such as in Figure S4D for the number of FYVE-GFP labeled endosomes and in Figures S4H and S4I for the number and size of lysosomes.

      Thank you for pointing this out. We will perform the suggested quantifications and statistical analyses.

      (4) In this study, the transport of autophagosomes is investigated in fly fat cells. In fat cells, a large number of large lipid droplets accumulate and the endomembrane systems are distinct from that in other cell types. The knowledge gained from this study may not apply to other cell types. This needs to be discussed.

      Thank you for this insight. We will discuss the potential cell-type specificity of our findings in the revised manuscript. Additionally, we plan to examine the distribution of the mCherry-Atg8a reporter in the vps16A RNAi background in other cell types, such as salivary gland cells, to broaden our analysis.

      Minor concerns:

      (5) Data in some panels are of low quality. For example, the mCherry-Atg8a signal in Figure 5C is hard to see; the input bands of Dhc64c in Figure 5L are smeared.

      Thank you for noting this. We will repeat the experiment in Figure 5C to obtain clearer images. The smeared Dhc64C input bands in Figure 5L are due to the large size of this protein, which affects its migration characteristics. We will address this in the revised manuscript.

      (6) In this study, both 3xmCherry-Atg8a and mCherry-Atg8a were used. Different reporters make it difficult to compare the results presented in different figures.

      Thank you for this comment. Both reporters are well-established as autophagic markers and function similarly. However, to reduce confusion, we have used only one type per figure to ensure comparability of results.

      (7) The small autophagosomes presented in Figures such as in Figure 1D and 1E are not clear. Enlarged images should be presented.

      Thank you for your suggestion. We will repeat these experiments and provide higher-quality, enlarged images for clarity.

      (8) The authors showed that Epg5-9xHA coprecipitates with the endogenous dynein motor Dhc64C. Is Rab7 required for the interaction?

      Thank you for this question. We will investigate this by co-transfecting the cells with WT and GTP- or GDP-locked Rab7 mutants (which mimic constitutively active and dominant-negative forms, respectively) with Epg5-9xHA. This will allow us to assess whether Rab7 modulates the Epg5-Dhc interaction.

      (9) The perinuclear lysosome localization in Epg5 KD cells has no indication that Epg5 is an autophagosome-specific adaptor.

      Thank you for this comment. We will moderate our statement regarding Epg5's role as an autophagosome-specific adaptor in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife assessment:

      In this useful study, the authors analyze droplet size distributions of multiple protein condensates and their fit to a scaling ansatz, highlighting that they exhibit features of first- and second-order phase transitions. The experimental evidence is still incomplete as the measurements were apparently done only at one time point, neglecting the possibility that droplet size distribution can evolve with time. The text would benefit from a connection to and contextualization with the well-understood expectations from the coupling of percolation and phase separation in protein condensates - a phenomenon that is increasingly gaining consensus amongst the community and that emphasizes "liquid-gas" criticality. 

      We have now carried out new experiments at multiple time points to establish that the droplet size distributions are stationary below the critical concentration. We have also addressed the comments made by the reviewers about the nature of the phase transition.

      Our analysis does not depend on a specific hypothesis on the nature of the phase transition, whether it be percolation or a gas-liquid critical transition. The scaling that we observed is an emergent property that is independent from the possible theoretical models used to describe the phase transition. In fact, our scaling analysis indicates that any theoretical model proposed for protein phase separation should predict the critical exponents that we reported. 

      Reviewer #1

      The authors analyse droplet size distributions of multiple protein condensates and fit to a scaling ansatz to highlight that they exhibit features of first-order and second-order phase transitions. While the experimental evidence is solid, the text lacks connection and contextualization to the well-understood expectations from the coupling of percolation and phase separation in protein condensates - a phenomenon that is increasingly gaining consensus amongst the community. The evidence supports the percolation and phase separation model rather than being close to a true critical point in the liquid-gas phase space. Overall, the work is useful to the community.

      We are grateful to the reviewer for these positive comments. We would like to emphasise that our contribution is not to propose a theoretical model, but rather to report a scaling behaviour in the experimentally measured droplet size distributions. The main implication of our work is that any theoretical model should predict the scaling exponents that we derived from the experimental measurements.

      Strengths: 

      The experimental analysis of distinct protein condensates is very well done and the reported exponents/scaling framework provides a clear framework to help the community deconvolve signatures of percolation in condensates. 

      Weaknesses: 

      The principal concern this reviewer has is that the authors adopt a framing in this paper to present a discovery of second-order features and connections to criticality - however, they ignore/miss the connections to percolation (a well-understood second-order transition that is expected to play a major role in protein condensates). I believe this needs to be addressed and the paper suitably revised to help connect with these expectations.

      The scaling that we found is not characteristic of standard percolation, since the exponents that we obtained (a=0 and f=1) are different from those of percolation (a=1.19 and f=2.21). This difference indicates that protein phase separation is not in the same universality class as standard percolation. Further studies will be required to understand whether theoretical models based on percolation could predict the observed critical exponents.

      - Protein condensates have been increasingly understood to be described as fluids whose assembly is driven by a connection of density (phase separation, first-order) and connectivity (percolation, second-order) transitions. This has been long known in the polymer community (Flory, Stockmayer, Tanaka, Rubinstein, Semenov, and others) and recently repopularized in the condensate community (by Pappu and Mittag, in particular, amongst others). The authors make no connections to any of these frameworks - which actually seem to be the essence of what they are describing. 

      As mentioned above, our purpose was neither to support an existing theoretical model, nor to propose a new one. Rather, we have reported a scaling behaviour and scaling exponents not noted before. Further studies will be required to establish whether existing theoretical models could account for this scaling behaviour.

      - Percolation theory, which has been around for more than half a century, has clear-cut scaling laws that have essentially similar forms to the ansatz adopted by the authors, and the commonalities/differences are not discussed by the authors - this is essential since this provides a physical basis for their ansatz rather than an arbitrary mathematical formulation. In particular, percolation models connect size distribution exponents to factors like dimensionality, valence, etc. and if these connections can be made with this data, that would be very powerful. 

      The scaling ansatz that we are using is commonly adopted in studies of critical phenomena, and it is not specific to percolation. The scaling exponents depend on only a few attributes, such as dimensionality, symmetries, and whether interactions are short- or long-range. These attributes determine the universality class. As such, scaling does not link with molecular determinants, but can distinguish different classes.

      - The connections between spinodal decomposition and second-order phase transitions are very confusing. Spinodal decomposition happens when the barriers for first-order phase transitions are zero and systems can phase separate without crossing nucleation barriers. Further, the "criticality" discussed in the paper is confusing since it more likely refers to a percolation threshold and much less likely to a "critical temperature" (Tc, where spinodal and binodals become identical). I would recommend reframing this argument. 

      We cannot refer to percolation threshold as our model is not readily compatible with it. We elaborated and better explained the differences between these models.

      It's unlikely, in this reviewer's opinion, that the authors are actually discussing a "first-order" liquid-gas critical point - because saturation concentrations of these proteins can be much higher with temperature and the critical point would thus likely be at much higher concentrations (and ofc temperature). Further, the scaling exponents don't fall into that class naturally. However, if the authors disagree, I would appreciate clear quantitative reasons (including through the scaling exponents in that universality class) and be happy to be convinced to change my mind. As provided, the data does not support this model. 

      We have now clarified in the manuscript that we do not discuss the liquid-gas critical point.

      Reviewer #2

      This is a potentially interesting study addressing a possible scale-invariant log-normal characteristic of droplet size distribution in the phase separation behavior of biomolecular condensates. Some of the data presented are valuable and intriguing. However, as it stands, the validity and utility of this study are uncertain because there are serious deficiencies in the execution and presentation of the authors' results. Many of these shortcomings are fundamental, including a lack of clarity in the basic conceptual framework of the study, insufficient justification of the experimental setup, less-than-conclusive experimental evidence, and inadequate discussion of implications of the authors' findings to future experimental and theoretical studies of biomolecular condensates. Accordingly, this reviewer considers that the manuscript should undergo a major revision to address the following. In particular, the discussion should be significantly expanded by including references mentioned below as well as other references pertinent to the issues raised. 

      We thank the reviewer for the helpful comments. In the revised version of the manuscript we clarified that we aimed to use a well-established tool – the scaling analysis – to study phase transitions, and applied it to the protein condensation process. This approach offers insight into a universal aspect of protein phase separation, and also provides a practical approach to determine the phase boundary. The observed fat-tailed distribution of protein droplet sizes is not what is normally observed in more standard phase separation systems in the subsaturated phase. Our contribution is not to propose a theoretical model, but rather to report the observation of a scaling behaviour. 

      (1) The theoretical analysis in this study is based on experimental data on condensed droplet size distributions for FUS and α-synuclein. The size data for FUS droplet is indirect as it relies on the assumption that FUS droplet diameter is proportional to fluorescence intensity of labeled FUS (page 10 of manuscript), with fluorescence data adopted from a previously published work by another group (Kar et al. & Pappu, ref.27). Because fluorescence of a droplet is expected to be dependent upon the condensed-phase concentration of FUS, this proportional relationship, even if it holds, must also be modulated by FUS concentration in the droplet. Moreover, why should fluorescence be proportional to diameter but not the cross-sectional area or volume of the FUS droplet, which would be more intuitive? These issues should be clarified. A new measure by microscopy is used to determine the size distribution of condensed α-synuclein; but no microscopy image is shown. It is of critical importance that such raw data (for example microscopy images) be presented for the completeness and reproducibility of the experiment because the entire study relies on the soundness of these experimental measurements. 

      As we mentioned in the article, for the scaling analysis, the droplet dimensions could be assessed in 1D (length), 2D (area) or 3D (volume). For the FUS experiments, we used the data as the authors provided in the original publication (PNAS 2022). For alpha-synuclein, we provided the data in the article. 

      (2) Despite the authors' claim of a universal scaling relationship, the log-log scatter plots in Figure 1 (page 15 of the manuscript) exhibit significant deviations from linearity at low protein concentrations (ρ→0). Given this fact, is universal scaling really valid? Discussion of this behavior is conspicuously absent (except the statement that these data points are excluded in the fit). In any case, the possible origins of these deviations should be thoroughly discussed so that the regime of universal scaling can be properly delineated. 

      In general, one would expect the scaling ansatz to be valid close to the phase boundary. It is a feature of the ansatz that, further away from the boundary, deviations are expected because of the decreasing relevance of critical phenomena.

      (3) Droplet size distribution most likely depends on the time duration after the preparation of the sample. For α-synuclein, "liquid droplet size characterisation images were captured 10 minutes post-liquid droplet formation" (page 9 of the manuscript). Why 10 minutes? Have the authors tried imaging at different time points and, if so, do the distributions at different time points remain essentially the same? If they are different, what is the criterion for focusing only on a particular time point? Information related to these questions should be provided. 

      We have now determined the droplet size distribution of alpha-synuclein at different time points, finding that they are not dependent on time within experimental uncertainties (Figure 6 in the revised manuscript).

      (4) At least two well-known mechanisms can lead to the time-dependent distribution of liquid droplet sizes: (i) coalescence of droplets in spatial proximity to form a larger droplet, and (ii) Ostwald ripening, i.e., formation of larger droplets concomitant with the dissolution of smaller droplets without fusion of droplets. The implications of these mechanisms on the authors' droplet size distributions should be addressed. Indeed, maintaining a size distribution against these mechanisms in vivo often requires active suppression [Bressloff, Phys Rev E 101, 042804 (2020)] with possible involvement of chemical reactions [Kirschbaum & Zwicker, J R Soc Interface 18, 20210255 (2021)]. These considerations are central to the basic rationale of this study and therefore should be carefully tackled. 

      These two mechanisms of growth are relevant above the critical concentration. Below the critical concentration, which is the regime that we investigated in our work, there is no need for active suppression.

      (5) If coalescence and/or Ostwald ripening do occur, given sufficient time after sample preparation, the condensed phase may become a single large "droplet" or a single liquid layer. Does this occur in the authors' experiments? 

      As we are below the critical concentration, this is unlikely to occur, as indeed supported by the experiments mentioned in point (3). 

      (6) It is unclear whether the authors aim to address the kinetic phenomenon of liquid droplet formation and evolution or equilibrium properties. The two types of phenomena appear to be conflated in the authors' narrative. Clarification is needed. If this work aims to address time-independent (or infinite-time) equilibrium properties, how are they expected to be related to droplet size distribution, which most likely is time-dependent? 

      Our analysis focuses on the equilibrium properties of the droplet size distribution below the critical concentration, and it should guide the proposal of a theoretical model that explains the emergence of scaling. In the introductory part of our manuscript, we proposed a possible scenario that tries to extend the Flory-Huggins theory to predict a scaling behaviour appropriate to a critical transition. Other scenarios are possible, and our results, along with further experiments, are needed to arrive at a deeper understanding of protein aggregation.

      (7) The relationship between the potentially time-dependent droplet size distribution and equilibrium properties of ρt and ρc (transition and critical concentrations, respectively) should be better spelled out. An added illustrative figure will be helpful. 

      We are addressing equilibrium properties, not kinetic ones. See also the answer to point (6).

      (8) The authors comment that their findings appear to be inconsistent with Flory-Huggins theory because Flory-Huggins "characterizes droplet formation as a consequence of nucleation ..." (page 8 of the manuscript). Here, three issues need detailed clarification: (i) In what way does Flory-Huggins mandate nucleation? (ii) Why are the findings of apparent scale invariance inconsistent with nucleation? (iii) If liquid droplet formations do not arise from nucleation, what physical mechanism(s) is (are) envisioned by the authors to be underpinning the formation of condensed liquid droplets in protein phase separation? 

      We do agree that the Flory-Huggins theory does not mandate nucleation above the spinodal line. However, we are addressing the equilibrium properties below the critical concentration, so the stable phase is the dilute phase, and there is no nucleation.

      (9) Are any of the authors' findings related to finite-system effects of phase separation [see, e.g., Nilsson & Irbäck, Phys Rev E 101, 022413 (2020)]?  

      Our experimental system is macroscopic, so we would not expect finite size effects.

      (10) Since the authors are using their observation of an apparent scale-invariant droplet size distribution to evaluate phase separation theory, it is important to clarify whether their findings provide any constraint on the shape of coexistence curves (phase diagrams). 

      We are only reporting the phenomenological observation of a scaling behaviour, so we may not speculate at this stage on the constraints of the coexistence curves. This is indeed an exciting opportunity for future studies.

      (11) More specifically, do the authors' findings suggest that the phase diagrams predicted by Flory-Huggins are invalid? Or, are they suggesting that even if the phase diagrams predicted by Flory-Huggins are empirically correct (if verified by experimental testing), they are underpinned by a free energy function different from that of Flory-Huggins? It is important to answer this question to clarify the implications of the authors' findings on equilibrium phase behaviors and the falsifiability of the implications. 

      As mentioned above, our main conclusion is that the droplet size distribution follows a scaling behaviour. Our contribution is not to propose a theoretical model, but rather to report a scaling behaviour that should be accounted for by existing or future theoretical models.

      (12) How about the implications of the authors' findings on other theories of protein phase separation that are based on interactions that are different from the short spatial range interactions treated by Flory-Huggins? For instance, it has been observed that whereas the Flory-Huggins-predicted phase diagrams are always convex upward, phase diagrams for charged intrinsically disordered proteins with long spatial range Coulomb interactions exhibit a region that is concave upward [Das et al., Phys Chem Chem Phys 20, 28558-28574 (2018)]. Can information be provided by the authors' findings regarding apparent scale-invariant droplet size distribution on the underlying interaction driving the protein molecules toward phase separation? 

      This is an interesting point for future studies about the type of interactions that give rise to the observed scaling behaviour.

      (13) Table S1 (page 4) and Table S2 (page 7) are mentioned in the text but these tables are not in the submitted files. 

      We have added the Supplementary Tables as well as the source files for the figures.

      (14) The two systems studied (FUS and α-synuclein) have a single intrinsically disordered protein (IDP) component. It is not clear if the authors expect their claimed scaling relation to be applicable to systems with multiple IDP components and if so, why.

      From the data that we have currently analysed, we feel that we may not speculate on this interesting point, leaving it to future studies.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      A limitation of the study is that it does not directly compare the effect of inhibiting the PERK-ATF4 pathway with inhibiting JUN and/or JUN-CHOP double deficient animals. It would also be useful, for the cell survival experiments shown in Figure 1, to examine a longer time point than 14 days to understand the long-term consequence of manipulating the PERK-ATF4 pathway.

      We appreciate that both suggestions are fantastic ideas for future studies but consider them to be beyond the scope of this investigation. 

      Reviewer #2 (Public Review):

      However, the main concern is the overall data quality, which appears to be suboptimal. The transfection efficiency of AAV2-hSyn1-mTagBFP2-ires-Cre used in this study does not seem highly effective, as evidenced by the data presented in Supplementary Figure 1.

      We appreciate the importance of the transfection efficiency of AAV2-hSyn1-mTagBFP2-ires-Cre to the interpretations of our results and acknowledge that the imaging and color schemes used required improvement. We have now validated widespread knockout in RGCs using AAV2-hSyn1-mTagBFP2-ires-Cre, improving the staining and imaging of LSL.tdTomato Cre reporter mice (Figure S1A-B) and using RNAScope to validate the disruption of ATF4 and CHOP, respectively, in the RGCs of ATF4 cKO and CHOP cKO mice (Figure S1C-D). Additional validation of functional knockout of these transcription factors is provided by reduction of RGC-autonomous expression of transcripts that we identified in this study to be injury-regulated in an ATF4-dependent (Chac1, Atf3, Figure 4C-E) or ATF4- and CHOP-dependent manner (Ecel1, Avil, Figure 4C-E and Figure S2D).

      The manuscript also contains several inconsistencies and a mix of methods in data collection, analysis, and interpretation, such as the labeling and quantification of RGCs and the combination of bulk and single-cell sequencing results.

      Regarding the use and comparison of bulk-seq and scRNA-seq data, it is our sense that these innovative approaches will be among the impactful aspects of this study. Numerous transcriptomic studies of the optic nerve crush model exist, though it has been unclear whether major and minor technical differences would preclude deriving insights across studies without the expense and time of exact reproduction. One goal of this study was to evaluate the hypothesis that, despite the obvious limitation that RGCs represent fewer than 1% of cells in a whole retina bulk transcriptomics approach, the signals amongst top differentially expressed genes (DEGs) would be dominated by injury-induced changes within RGCs and that the most robust of these changes would be readily detected across techniques and labs, serving as a cornerstone for interpreting similarities and differences in findings. We believe that the results validate this approach. Important insights gained in this study from these cross-study and cross-platform analyses include:

      (1) Genes that we identify in this study as neuronal ATF4-dependent by whole retina transcriptomics include many of the most robust gene expression changes observed across multiple studies that enrich for RGCs and those that only report RGC-autonomous expression changes by scRNA-seq. This observation predicts that many of the ATF4-dependent expression changes that we report are RGC autonomous, which we further validate in this revision by RNAScope.

      (2) Similarly designed whole transcriptomics studies across labs can be remarkably robust for top DEGs, showing striking similarity that allows for meaningful insights and testable hypotheses across different knockout and conditional knockout mice.

      (3) scRNA-seq of RGCs and bulk sequencing of FACS-enriched RGCs unsurprisingly result in higher sensitivity for injury-induced expression changes, but the high degree of similarity that we demonstrate between the top DEGs from those studies and whole retina transcriptomics studies allows for confident inferences regarding the expected cell autonomy of reported expression changes in this model, using available resources such as the Single Cell Portal, without the expense and technical optimization required for extensive spatial transcriptomics across numerous mouse models.

      Other revisions

      In addition to these updates to address the public reviews, we are grateful for the reviewers’ additional recommendations and provide these further revisions:

      (1) We appreciate the request to clarify with a schematic the differences between our study and a previous report (Tian et al., 2022). A second Correction to that study was published in July 2024, resulting in changes to the logFC values used in our original cross-study comparison and adjustments to multiple figures and tables related to the proposed transcriptional programs of ATF4, CHOP, and the other purported core transcription factors. We have therefore updated our Figure S3A-C in accordance with that Correction to better reflect the underlying data of that study. These changes do not alter our original conclusions that: (a) both the whole retina transcriptomics approach of our study and the FACS-enriched RGC approach of that study readily detect the strong upregulation of many known ATF4 target genes after optic nerve crush (Figure S3A); and (b) there are striking differences in the ATF4- and CHOP-dependent transcripts suggested by our cKO data and those suggested by the reported gRNA data. Though we had hoped that the Correction would allow us in this revision to diagram those findings and model for comparison to these cKO findings, documenting those changes and their impacts on the proposed model is beyond the scope of this study.

      (2) We agree that the discordance between the gene and protein names for Ddit3/CHOP and Eif2ak3/PERK represents a challenge for clarity, even when gene names are carefully selected when referring to genes or transcripts and protein names when referring to proteins. We have therefore attempted to streamline the naming throughout, using where possible both names.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) I was surprised to see that the Authors have failed to address my major concerns about the paper, which was in the Main text of the Review.

      Previously I wrote: The major weakness of the manuscript is that it is written for a very specialized reader who has a strong background in cerebellar development, making it hard to read for eLife's general audience. It's challenging to follow the logic of some of the experiments as well as to contextualize these findings in the field of cerebellar development.

      This has not been addressed. The manuscript has not been substantively changed and it is still written for a very specialized reader rather than a general reader.

      We appreciate the respected reviewer’s concern and have made substantial revisions throughout the manuscript to address the points. We have simplified the technical language throughout the manuscript and included additional background information, particularly in the introduction and discussion sections, to better orient general readers. Additionally, we have clarified the logical flow of the experiments by incorporating transitional statements and summaries that explain the purpose and outcomes of each experiment (revisions are highlighted in yellow). 

      (2) These two have been addressed, although to be honest, I don't think that the cartoon is particularly helpful for a general audience.

      Thank you for your feedback. We have replaced the cartoon with a revised version that provides more detailed information to clarify and simplify the origins of cerebellar nuclei from the caudal and rostral ends in both Atoh1+/+ and Atoh1-/- mice. We believe this will make the content clearer and more informative for the general audience.

      (3) My third recommendation, that they include a section in the Discussion to speculate about what these cells may become in the adult and the existence of multiple cell types with different molecular markers and projection patterns in the nuclei, has also not been addressed.

      We apologize for the oversight in the previous revision. We have now added a detailed discussion in the manuscript that speculates on the potential fate of these newly identified cells in the adult cerebellum, suggesting that they may differentiate into excitatory neurons (highlighted on page 9). In addition, as noted in our previous resubmission, further direct evidence is needed from the early population of SNCA+ cells during E9 to E13. This is an ongoing focus of investigation in our lab, where we are currently using SNCA-GFP mice, part of a project for a PhD student in our lab.

      Reviewer #2 (Recommendations For The Authors):

      One small remaining issue: The methods text re cell counts remains confusing: n=3 EMBRYOS???

      "To assess the number of OTX2-positive cells, we conducted immunohistochemistry (IHC) labeling on slides containing serial sections from embryonic days 12, 13, 14, and 15 (n=3 EMBRYOS??? at each timepoint)."

      Thank you for this point. We have revised the text in the methods section for clarity. As highlighted on page 11, “The sample size was equal to 9 embryos” and on page 16, “3 embryos were used at each time point”.

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      The paper by Chen et al describes the role of neuronal thermo-TRPV3 channels in the firing of cortical neurons at a fever temperature range. The authors began by demonstrating that exposure to infrared light increasing ambient temperature causes body temperature to rise to a fever level above 38°C. Subsequently, they showed that at the fever temperature of 39°C, the spike threshold (ST) increased in both populations (P12-14 and P7-8) of cortical excitatory pyramidal neurons (PNs). However, the spike number only decreased in P7-8 PNs, while it remained stable in P12-14 PNs at 39°C. In addition, the fever temperature also reduced the late peak postsynaptic potential (PSP) in P12-14 PNs. The authors further characterized the firing properties of cortical P12-14 PNs, identifying two types: STAY PNs that retained spiking at 30°C, 36°C, and 39°C, and STOP PNs that stopped spiking upon temperature change. They further extended their analysis and characterization to striatal medium spiny neurons (MSNs) and found that STAY MSNs and PNs shared the same ST temperature sensitivity. Using small molecule tools, they further identified that thermo-TRPV3 currents in cortical PNs increased in response to temperature elevation, but not TRPV4 currents. The authors concluded that during fever, neuronal firing stability is largely maintained by sensory STAY PNs and MSNs that express functional TRPV3 channels. Overall, this study is well designed and executed with substantial controls, some interesting findings, and quality of data. Here are some specific comments: 

      (1) Could the authors discuss, or is there any evidence of, changes in TRPV3 expression levels in the brain during the postnatal 1-4 week age range in mice? 

      To our knowledge, no published studies have documented changes in TRPV3 expression levels in the brain during the 1st to 4th postnatal weeks in mice. Research on TRPV3 expression in the mouse brain has primarily involved RT-PCR analysis of RNA from dissociated tissue in adult mice (Jang et al., 2012; Kumar et al., 2018), largely due to the scarcity of effective antibodies for brain tissue sections at the time of publication. Furthermore, the Allen Brain Atlas lacks data on TRPV3 expression in the developing or postnatal brain. To address this gap, we plan to examine TRPV3 expression at P7-8, P12-13, and P20-23 as part of our manuscript revision.

      (2) Are there any differences in TRPV3 expression patterns that could explain the different firing properties in response to fever temperature between the STAY and STOP neurons? 

      This is an excellent question and one we plan to explore in the future by developing reporter mice or viral tools to monitor the activity of cells with endogenous TRPV3 expression. To our knowledge, these tools do not currently exist. Creating them will be challenging, as it requires identifying promoters that accurately reflect endogenous TRPV3 expression. We have not yet quantified TRPV3 expression in STOP and STAY neurons; however, our analysis of evoked spiking activity at 30, 36, and 39°C suggests that TRPV3 expression may mark a population of pyramidal neurons that tend to STAY spiking as temperatures increase. To investigate this further, we are considering patch-seq for TRPV3 expression on recorded neurons. This is a complex experiment, as it requires recording activity at three different temperatures and subsequently collecting the cell contents. While success is not guaranteed, we are committed to attempting these experiments as part of our revisions.

      (3) TRPV3 and TRPV4 can co-assemble to form heterotetrameric channels with distinct functional properties. Do STOP neurons exhibit any firing behaviors that could be attributed to the variable TRPV3/4 assembly ratio? 

      There is some evidence that TRPV3 and TRPV4 proteins can physically associate in HEK293 cells and native skin tissues (Hu et al., 2022). TRPV3 and TRPV4 are both expressed in the cortex (Kumar et al., 2018), but it remains unclear whether they are co-expressed and co-assembled to form heteromeric channels in cortical excitatory pyramidal neurons. Examination of the I-V curve from HEK cells co-expressing TRPV3/4 heteromeric channels shows enhanced current at negative membrane potentials (Hu et al., 2022).

      Currently, we cannot characterize cells as STOP or STAY and measure TRPV3 or TRPV4 currents simultaneously, as this would require different experimental setups and internal solutions. Additionally, the protocol involves a sequence of recordings at 30, 36, and 39°C, followed by cooling back to 30°C and re-heating to each temperature. Cells undergoing such a protocol are unlikely to survive until the end.

      In our recordings of TRPV3 currents—which likely include both STOP and STAY cells—we do not observe a significant current at negative voltages, suggesting that TRPV3/4 heteromeric channels may either be absent or underrepresented, at least at a 1:1 ratio. However, the possibility that TRPV3/4 heteromeric channels could define the STOP cell population is intriguing and plausible.

      (4) In Figure 7, have the authors observed an increase of TRPV3 currents in MSNs in response to temperature elevation? 

      We have not recorded TRPV3 currents in MSNs in response to elevated temperatures.

      (5) Is there any evidence of a relationship between TRPV3 expression levels in D2+ MSNs and degeneration of dopamine-producing neurons? 

      This is an interesting question, though it falls outside our current research focus in the lab. A PubMed search yields no results connecting the terms TRPV3, MSNs, and degeneration. However, gain-of-function mutations in TRPV4 channel activity have been implicated in motor neuron degeneration (Sullivan et al., 2024) and axon degeneration (Woolums et al., 2020). Similarly, TRPV1 activation has been linked to developmental axon degeneration (Johnstone et al., 2019), while TRPV3 blockade has shown neuroprotective effects in models of cerebral ischemia/reperfusion injury in mice (Chen et al., 2022).

      The link between TRPV activation and cell degeneration, however, may not be straightforward. For instance, TRPV1 loss has been shown to accelerate stress-induced degradation of axonal transport from retinal ganglion cells to the superior colliculus and to cause degeneration of axons in the optic nerve (Ward et al., 2014). Meanwhile, TRPV1 activation by capsaicin preserves the survival and function of nigrostriatal dopamine neurons in the MPTP mouse model of Parkinson's disease (Chung et al., 2017).

      (6) Does fever range temperature alter the expressions of other neuronal Kv channels known to regulate the firing threshold? 

      This is an active line of investigation in our lab. The results of ongoing experiments will provide further insight into this question.

      Reviewer #2 (Public review): 

      Summary: 

      The authors study the excitability of layer 2/3 pyramidal neurons in response to layer four stimulation at temperatures ranging from 30 to 39 Celsius in P7-8, P12-P14, and P22-P24 animals. They also measure brain temperature and spiking in vivo in response to externally applied heat. Some pyramidal neurons continue to fire action potentials in response to stimulation at 39 C and are called stay neurons. Stay neurons have unique properties aided by TRPV3 channel expression. 

      Strengths: 

      The authors use various techniques and assemble large amounts of data. 

      Weaknesses: 

      (1) No hyperthermia-induced seizures were recorded in the study. 

      The goal of this manuscript is to uncover the age-related physiological changes that enable the brain to retain function at fever temperatures. These changes may potentially explain why most children do not experience febrile seizures or why, in the rare cases when they do occur, the most prominent window of susceptibility is between 2-5 years of age (Shinnar and O’Dell, 2004), as this may coincide with the window during which these developmental changes are normally occurring. While it is possible that impairments in these mechanisms could result in febrile seizures, another possibility is that neural activity may fall below the level required to maintain normal function.

      (2) Febrile seizures in humans are age-specific, extending from 6 months to 6 years. While translating to rodents is challenging, according to published literature (see Baram), rodents aged P11-16 experience seizures upon exposure to hyperthermia. The rationale for publishing data on P7-8 and P22-24 animals, which are outside this age window, must be clearly explained to address a potential weakness in the study. 

      This manuscript focuses on identifying the age-related physiological changes that enable the brain to retain function at fever temperatures. To this end, we examine two age periods flanking the putative window of susceptibility (P12-14), specifically an earlier timepoint (P7-8) and a later timepoint (P20-23). The inclusion of these time points also serves as a negative control, allowing us to determine whether the changes we observe in the proposed window of susceptibility are unique to this period. We believe that including these windows ensures a thorough and objective scientific approach.

      (3) Authors evoked responses from layer 4 and recorded postsynaptic potentials, which then caused action potentials in layer 2/3 neurons in the current clamp. The post-synaptic potentials are exquisitely temperature-sensitive, as the authors demonstrate in Figures 3B and 7D. Note markedly altered decay of synaptic potentials with rising temperature in these traces. The altered decays will likely change the activation and inactivation of voltage-gated ion channels, adjusting the action potential threshold. 

      In Figure 4B, we surmised that the temperature-induced reductions in inhibition and the subsequent loss of the late PSP primarily contribute to the altered decay of the synaptic potentials.

      (4) The data weakly supports the claim that the E-I balance is unchanged at higher temperatures. Synaptic transmission is exquisitely temperature-sensitive due to the many proteins and enzymes involved. A comprehensive analysis of spontaneous synaptic current amplitude, decay, and frequency is crucial to fully understand the effects of temperature on synaptic transmission. 

      Thank you for the opportunity to provide clarification. It was not stated, nor did we intend to imply, that in general, E-I balance is unchanged at higher temperatures. Please see the excerpt from the manuscript below. The statements specifically referred to observations made for experiments conducted during the P20-26 age range for cortical pyramidal neurons. We have a parallel line of investigation exploring the differential susceptibility of E-I balance based on age and temperature. Additionally, our measurements focus on evoked activity, rather than spontaneous activity, as these events are more likely linked to the physiological changes underlying behavior in the sensory cortex.

      “As both excitatory and inhibitory PNs that stay spiking increase their firing rates (Figure 5B) and considering that some neurons within the network are inactive throughout or stop spiking, it is plausible that these events are calibrated such that despite temperature increases, the excitatory to inhibitory (E-I) balance within the circuit may remain relatively unchanged. Indeed, recordings of L4-evoked excitatory and inhibitory postsynaptic currents (respectively EPSCs and IPSCs) in wildtype L2/3 excitatory PNs in S1 cortex, where inhibition is largely mediated by the parvalbumin positive (PV) interneurons, showed that E-I balance (defined as E/(E+I), the ratio of the excitatory current to the total current) remained unchanged as temperature increased from 36 to 39°C (Figure 5E).”

      (5) It is unclear how the temperature sensitivity of medium spiny neurons is relevant to febrile seizures. Furthermore, the most relevant neurons are hippocampal neurons since the best evidence from human and rodent studies is that febrile seizures involve the hippocampus. 

      Thank you for the opportunity to clarify. Our goal was not to establish a link between medium spiny neuron (MSN) function and febrile seizures. The manuscript's focus is on identifying age-related physiological changes that enable supragranular cortical cells in the brain to retain function at fever temperatures. MSNs were selected for mechanistic comparison in this study because they represent a non-pyramidal, non-excitatory neuronal subtype, allowing us to assess whether the physiological changes observed in L2/3 excitatory pyramidal neurons are unique to these cells.

      (6) TRPV3 data would be convincing if the knockout animals did not have febrile seizures. 

      Could you kindly provide the reference indicating that TRPV3 KO mice have seizures? Unfortunately, we were unable to locate this reference. It is important to distinguish febrile seizures, which occur within the range of physiological body temperatures (~38 to 40°C), from seizures resulting from heat stroke, a severe form of hyperthermia occurring when body temperature exceeds 40°C. Mechanistically, these may represent different phenomena, as the latter is typically associated with widespread protein denaturation and cell death, whereas febrile seizures are usually non-lethal. Additionally, TRPV3 is located on chromosome 17p13.2, a region not currently associated with seizure susceptibility.

      Reviewer #3 (Public review): 

      Summary: 

      This important study combines in vitro and in vivo recording to determine how the firing of cortical and striatal neurons changes during a fever range temperature rise (37-40 °C). The authors found that certain neurons will start, stop, or maintain firing during these body temperature changes. The authors further suggested that the TRPV3 channel plays a role in maintaining cortical activity during fever. 

      Strengths: 

      The topic of how the firing pattern of neurons changes during fever is unique and interesting. The authors carefully used in vitro electrophysiology assays to study this interesting topic. 

      Weaknesses: 

      (1) In vivo recording is a strength of this study. However, data from in vivo recording is only shown in Figures 5A,B. This reviewer suggests the authors further expand on the analysis of the in vivo Neuropixels recording. For example, to show single spike waveforms and raster plots to provide more information on the recording. The authors can also separate the recording based on brain regions (cortex vs striatum) using the depth of the probe as a landmark to study the specific firing of cortical neurons and striatal neurons. It is also possible to use published parameters to separate the recording based on spike waveform to identify regular principal neurons vs fast-spiking interneurons. Since the authors studied E/I balance in brain slices, it would be very interesting to see whether the "E/I balance" based on the firing of excitatory neurons vs fast-spiking interneurons might be changed or not in the in vivo condition. 

      As requested, in the revised manuscript, we will include examples of single spike waveforms and raster plots for the in vivo recordings. Please note that all recordings were conducted in the cortex, not the striatum. To clarify, we used published parameters to separate the recordings based on spike waveform, which allowed us to identify regular principal neurons and fast-spiking interneurons. The paragraph below from the methods section describes this procedure.

      “Following manual curation, based on their spike waveform duration, the selected single units (n = 633) were separated into putative inhibitory interneurons and excitatory principal cells (Barthó et al., 2004). The spike duration was calculated as the time difference between the trough and the subsequent waveform peak of the mean filtered (300 – 6000 Hz bandpassed) spike waveform. Durations of extracellularly recorded spikes showed a bimodal distribution (Hartigan’s dip test; p < 0.001) characteristic of the neocortex with shorter durations corresponding to putative interneurons (narrow spikes) and longer durations to putative principal cells (wide spikes). Next, k-means clustering was used to separate the single units into these two groups, which resulted in 140 interneurons (spike duration < 0.6 ms) and 493 principal cells (spike duration > 0.6 ms), corresponding to a typical 22% - 78% (interneuron – principal) cell ratio.”
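      The procedure quoted above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 30 kHz sampling rate used in the usage note, and the plain two-cluster k-means are assumptions made for illustration, and the dip test and manual curation steps are omitted. It measures the trough-to-peak duration of a mean waveform and then splits a list of durations into a narrow and a wide group.

```python
def trough_to_peak_ms(waveform, fs_hz):
    """Time in ms from the waveform trough to the subsequent peak.

    `waveform` is a mean, already band-pass-filtered spike waveform
    (a list of samples); `fs_hz` is the sampling rate (assumed known).
    """
    trough = min(range(len(waveform)), key=waveform.__getitem__)
    peak = max(range(trough, len(waveform)), key=waveform.__getitem__)
    return (peak - trough) / fs_hz * 1000.0


def two_means_1d(values, iters=100):
    """Plain 1-D k-means with k = 2 (hypothetical helper).

    Returns (labels, (narrow_center, wide_center)); label 0 marks the
    cluster with the shorter durations, label 1 the longer ones.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("need at least two distinct durations")
    labels = []
    for _ in range(iters):
        # Assign each duration to the nearer of the two centers.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        new_lo, new_hi = sum(g0) / len(g0), sum(g1) / len(g1)
        if new_lo == lo and new_hi == hi:
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels, (lo, hi)
```

      For example, `trough_to_peak_ms(mean_waveform, 30000)` gives one duration per unit, and `two_means_1d(durations)` then labels each unit as putative interneuron (0) or principal cell (1). In this sketch the boundary between groups emerges from the clustering rather than being fixed at 0.6 ms.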

      In vivo patching to record extracellular and inhibitory responses at 36°C and then waiting 10 minutes to record again at 39°C would be an extremely challenging experiment. Due to the high difficulty and expected very low yield, these experiments will not be pursued for the revision studies.

      (2) The author should propose a potential mechanism for how TRPV3 helps to maintain cortical activity during fever. Would calcium influx-mediated change of membrane potential be the possible reason? Making a summary figure to put all the findings into perspective and propose a possible mechanism would also be appreciated. 

      Thank you for your helpful suggestions. In response to your recommendation, we will include a summary figure detailing the hypothesis currently described in the discussion section of the manuscript. The excerpt from the discussion is included below.

      “Although TRPV3 channels are cation-nonselective, they exhibit high permeability to Ca²⁺ (Ca²⁺ > Na⁺ ≈ K⁺ ≈ Cs⁺) with permeability ratios (relative to Na⁺) of 12.1, 0.9, 0.9, 0.9 (Xu et al., 2002). Opening of TRPV3 channels activates a nonselective cationic conductance and elevates membrane depolarization, which can increase the likelihood of generating action potentials. Indeed, our observations of a loss of the temperature-induced increases in the PSP with TRPV3 blockade are consistent with a reduction in membrane depolarization. In S1 cortical circuits at P12-14, STAY PNs appear to rely on a temperature-dependent activity mechanism, where depolarization levels (mediated by higher excitatory input and lower inhibitory input) are scaled to match the cell’s ST. Thus, an inability to increase PSPs with temperature elevations prevents PNs from reaching ST, so they cease spiking.”

      (3) The author studied P7-8, P12-14, and P20-26 mice. How do these ages correspond to the human ages? It would be nice to provide a comparison to help the reader understand the context better.

      Ideally, the mouse-human age comparison would depend on the specific process being studied. Please note that these periods are described in the introduction of the manuscript. The relevant excerpt is included below. Let us know if you need any additional modifications to this description.

      “Using wildtype mice across three postnatal developmental periods—postnatal day (P)7-8 (neonatal/early), P12-14 (infancy/mid), and P20-26 (juvenile/late)—we investigated the electrophysiological properties, ex vivo and in vivo, that enable excitatory pyramidal neurons (PNs) in mouse primary somatosensory (S1) cortex to remain active during temperature increases from 30°C (standard in electrophysiology studies) to 36°C (physiological temperature), and then to 39°C (fever-range).”

    1. Author response:

      eLife Assessment

      This important study describes a computational tool termed FLiSimBA (Fluorescence Lifetime Simulation for Biological Applications), which uses simulations to rigorously assess experimental limitations in fluorescence lifetime imaging microscopy (FLIM), including diverse noise factors, hardware effects, and sensor expression levels. The evidence from simulation and experimental measurements supporting the usefulness of FLiSimBA is solid. The authors may improve the application of the tool to a wide range of biological samples by providing the simulation package, currently in MATLAB, in other common languages such as Python, and having better descriptions of the fitting algorithm and model assumptions. The work will interest scientists who wish to perform quantitative FLIM imaging for cells and tissues.

      We thank the editors and reviewers for the constructive feedback. We plan to provide the FLiSimBA simulation package in Python in addition to Matlab. We will also describe our fitting method in more detail in the Results section. Furthermore, we will explain more clearly in the text that our simulation package makes almost no model assumptions, and features flexibility and adaptability so that it can be used for any fluorescence lifetime measurements. We will clearly outline the specific examples we use for our case studies, and how users can input their own values based on the specific sensors, autofluorescence, and hardware they use.

      Public Reviews:

      Reviewer #1 (Public review):

      In this study, Ma et al. aimed to determine previously uncharacterized contributions of tissue autofluorescence, detector afterpulse, and background noise on fluorescence lifetime measurement interpretations. They introduce a computational framework they named "Fluorescence Lifetime Simulation for Biological Applications (FLiSimBA)" to model experimental limitations in Fluorescence Lifetime Imaging Microscopy (FLIM) and determine parameters for achieving multiplexed imaging of dynamic biosensors using lifetime and intensity. By quantitatively defining sensor photon effects on signal-to-noise in either fitting or averaging methods of determining lifetime, the authors contradict any claims of FLIM sensor expression insensitivity to fluorescence lifetime and highlight how these artifacts occur differently depending on the analysis method. Finally, the authors quantify how statistically meaningful experiments using multiplexed imaging could be achieved.

      A major strength of the study is the effort to present results in a clear and understandable way given that most researchers do not think about these factors on a day-to-day basis. The model code is available and written in Matlab, which should make it readily accessible, although a version in other common languages such as Python might help with dissemination in the community. One potential weakness is that the model uses parameters that are determined in a specific way by the authors, and it is not clear how vastly other biological tissue and microscope setups may differ from the values used by the authors.

      Overall, the authors achieved their aims of demonstrating how common factors (autofluorescence, background, and sensor expression) will affect lifetime measurements and they present a clear strategy for understanding how sensor expression may confound results if not properly considered. This work should bring to awareness an issue that new users of lifetime biosensors may not be aware of and that experts, while aware, have not quantitatively determined the conditions where these issues arise. This work will also point to future directions for improving experiments using fluorescence lifetime biosensors and the development of new sensors with more favorable properties.

      We appreciate the comments and helpful suggestions. We plan to present FLiSimBA simulation code in Python in addition to Matlab to make it more accessible to the community.

      One of the advantages of FLiSimBA is that the simulation package is flexible and adaptable, allowing users to input parameters based on the specific sensors, hardware, and autofluorescence measurements for their biological and optical systems. We used parameters based on one FRET-based sensor, measured autofluorescence from mouse tissue, and measured dark count/afterpulse of our specific GaAsP PMT in this manuscript as examples. We will emphasize this advantage and further clarify in our revision how individual users can adapt these parameters to diverse tissues, imaging systems, and sensors.

      Reviewer #2 (Public review):

      Summary:

      By using simulations of common signal artefacts introduced by acquisition hardware and the sample itself, the authors are able to demonstrate methods to estimate their influence on the estimated lifetime, and lifetime proportions, when using signal fitting for fluorescence lifetime imaging.

      Strengths:

      They consider a range of effects such as after-pulsing and background signal, and present a range of situations that are relevant to many experimental situations.

      Weaknesses:

      A weakness is that they do not present enough detail on the fitting method that they used to estimate lifetimes and proportions. The method used will influence the results significantly. They seem to only use the "empirical lifetime", which is not a state-of-the-art algorithm. The method used to deconvolve two multiplexed exponential signals is not given.

      We appreciate the comments and constructive feedback and will more clearly describe the fitting methods in our revision.

      Two metrics are currently used to estimate lifetime in our paper, which are currently described in the Methods section ‘Experimental data collection, parameter determination, and simulation’ and ‘FLIM analysis’: (1) fitted P1: we described how lifetime histograms were fitted to Equation 2 with the Gauss-Newton nonlinear least-square fitting algorithm and the fitted P1 was used as lifetime estimation; (2) empirical lifetime, defined by Equation 5. These two metrics were used for the following reasons: (1) when the exponential decay equation of a sensor is known (for example, the FRET-based PKA activity sensor FLIM-AKAR can be described as a double exponential equation), fitted coefficients for each exponential component provide a robust way for lifetime estimate that is less sensitive to noise and background signals; (2) when the biophysical properties of sensors are unknown, or when the sensors cannot be easily described with single or double exponential equations, empirical lifetime (i.e. average lifetime values) provides an unbiased way to quantify fluorescence lifetime without assumptions of underlying models to describe sensor lifetime.
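      The two metrics described above can be illustrated with a short Python sketch. It is a simplified stand-in rather than the FLiSimBA code: Equations 2 and 5 are not reproduced in this response, so the sketch assumes a bare double exponential with known time constants (no instrument response, afterpulse, or background terms) and takes the empirical lifetime to be the photon-weighted mean arrival time. All function names and the numerical values in the usage note are hypothetical.

```python
import math


def decay_shape(t, p1, tau1, tau2):
    """Unit-amplitude double exponential: p1 is the fraction in the tau1 state."""
    return p1 * math.exp(-t / tau1) + (1.0 - p1) * math.exp(-t / tau2)


def fit_p1_gauss_newton(times, counts, tau1, tau2, iters=50, tol=1e-12):
    """Gauss-Newton least-squares fit of amplitude f0 and fraction p1.

    tau1 and tau2 are held fixed (assumed known for the sensor).
    Returns (f0, p1).
    """
    f0, p1 = max(counts), 0.5  # simple starting guesses
    for _ in range(iters):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for t, y in zip(times, counts):
            e1, e2 = math.exp(-t / tau1), math.exp(-t / tau2)
            shape = p1 * e1 + (1.0 - p1) * e2
            resid = y - f0 * shape
            j_f0 = shape             # d(model)/d(f0)
            j_p1 = f0 * (e1 - e2)    # d(model)/d(p1)
            a11 += j_f0 * j_f0
            a12 += j_f0 * j_p1
            a22 += j_p1 * j_p1
            b1 += j_f0 * resid
            b2 += j_p1 * resid
        # Solve the 2x2 normal equations (J^T J) step = J^T resid.
        det = a11 * a22 - a12 * a12
        step_f0 = (a22 * b1 - a12 * b2) / det
        step_p1 = (a11 * b2 - a12 * b1) / det
        f0 += step_f0
        p1 += step_p1
        if abs(step_p1) < tol and abs(step_f0) < tol * max(1.0, abs(f0)):
            break
    return f0, p1


def empirical_lifetime(times, counts):
    """Photon-weighted mean arrival time (assumed form of 'empirical lifetime')."""
    return sum(t * c for t, c in zip(times, counts)) / sum(counts)
```

      As a usage example, a histogram built with `counts = [1000 * decay_shape(t, 0.6, 2.0, 0.7) for t in times]` can be passed to `fit_p1_gauss_newton(times, counts, 2.0, 0.7)` to recover P1, while `empirical_lifetime(times, counts)` summarizes the same histogram without assuming any decay model. The time constants 2.0 and 0.7 ns are placeholders, not measured sensor values.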

      To deconvolve two multiplexed exponential signals (Fig. 8), histograms were fitted to Equation 2 with the Gauss-Newton nonlinear least-square fitting algorithm, as described in Methods section ‘Simulation and analysis of multiplexed imaging with fluorescence intensity and lifetime data’.

      Considering the importance of these methodological details for evaluating the conclusions of this study, and the importance of appreciating the advantages and limitations of different methods of lifetime estimates (e.g. Figure 7), we will move the description of the fitting method to estimate P1 and the method of calculating empirical lifetime from Methods to Results, and will further clarify the rationale of using these different methods of lifetime estimates.

      Reviewer #3 (Public review):

      Summary:

      This study presents a useful computational tool, termed FLiSimBA. The MATLAB-based FLiSimBA simulations allow users to examine the effects of various noise factors (such as autofluorescence, afterpulse of the photomultiplier tube detector, and other background signals) and varying sensor expression levels. Under the conditions explored, the simulations unveiled how these factors affect the observed lifetime measurements, thereby providing useful guidelines for experimental designs. Further simulations with two distinct fluorophores uncovered conditions in which two different lifetime signals could be distinguished, indicating multiplexed dynamic imaging may be possible.

      Strengths:

      The simulations and their analyses were done systematically and rigorously. FLiSimBA can be useful for guiding and validating fluorescence lifetime imaging studies. The simulations could define useful parameters such as the minimum number of photons required to detect a specific lifetime, how sensor protein expression level may affect the lifetime data, the conditions under which the lifetime would be insensitive to the sensor expression levels, and whether certain multiplexing could be feasible.

      Weaknesses:

      The analyses have relied on a key premise that the fluorescence lifetime in the system can be described as two-component discrete exponential decay. This means that the experimenter should ensure that this is the right model for their fluorophores a priori and should keep in mind that the fluorescence lifetime of the fluorophores may not be perfectly described by a two-component discrete exponential (for which alternative algorithms have been implemented: e.g., Steinbach, P. J. Anal. Biochem. 427, 102-105, (2012)). In this regard, I also couldn't find how good the fits were for each simulation and experimental data to the given fitting equation (Equation 2, for example, for Figure 2C data).

      We thank the reviewer for the constructive feedback. We agree that the FLiSimBA users should ensure that the right decay equations are used to describe the fluorescent sensors. In this study, we used a FRET-based PKA sensor FLIM-AKAR to provide a proof-of-principle demonstration of FLiSimBA usage. The donor fluorophore of FLIM-AKAR, truncated monomeric enhanced GFP, follows a single exponential decay. FLIM-AKAR, a FRET-based sensor, follows a double exponential decay. The time constants of the two exponential components were determined previously (Chen et al., Frontiers in Pharmacology, 2014). Thus, a double exponential decay equation with known τ1 and τ2 (Equation 1) was used for both simulation and fitting. In our revision, we will refer to our prior study characterizing the double exponential decay model of FLIM-AKAR. We will also emphasize the importance of using the right decay equations, strategies to estimate sensor decays, and how the flexibility of FLiSimBA allows users to input different forms of models to describe their specific sensor histograms. We will additionally provide data showing the goodness of fit for both simulated data and experimental data.
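      The response does not specify which goodness-of-fit measure will be reported. As one common choice for photon-counting histograms, a reduced chi-square with Poisson-like variance (variance approximated by the model count in each bin) can be sketched in Python as follows. The function name and the choice of metric are assumptions for illustration, not the authors' stated method.

```python
def reduced_chi_square(counts, model, n_params):
    """Pearson chi-square per degree of freedom for a fitted histogram.

    `counts` are observed photon counts per time bin, `model` the fitted
    counts for the same bins, and `n_params` the number of free fit
    parameters. Bins where the model predicts zero photons are skipped.
    """
    chi2 = 0.0
    used = 0
    for y, m in zip(counts, model):
        if m > 0:
            chi2 += (y - m) ** 2 / m  # variance taken as the model count
            used += 1
    dof = used - n_params
    if dof <= 0:
        raise ValueError("not enough bins for the number of fit parameters")
    return chi2 / dof
```

      Values near 1 would suggest that the residuals are consistent with counting noise alone, while values well above 1 would suggest that the chosen decay equation does not describe the histogram.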

      Also, in Figure 2C, the 'sensor only' simulation without accounting for autofluorescence (as seen in Sensor + autoF) or afterpulse and background fluorescence (as seen in Final simulated data) seems to recapitulate the experimental data reasonably well. So, at least in this particular case where experimental data is limited by its broad spread with limited data points, being able to incorporate the additional noise factors into the simulation tool didn't seem to matter too much.

      We agree that in Figure 2C the contributions from autofluorescence, afterpulse, and background signals are small, because sensor photon count is high here. As seen in Figure 2B, when sensor photon counts are higher, the contributions from these other factors become less pronounced. The simulated data in Figure 2C were based on high photon counts because the simulated P1 value was determined by fitting experimental data. To achieve reasonable fitting with minimal interference from autofluorescence, afterpulse, and background signals, we used experimental data with high sensor expression. We will clarify these details in our revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The main goal of the paper was to identify signals that activate FLP-1 release from AIY neurons in response to H2O2, previously shown by the authors to be an important oxidative stress response in the worm. 

      Strengths: 

      This study builds upon the authors' previous work (Jia and Sieburth 2021) by further elucidating the gut-derived signaling mechanisms that coordinate the organism-wide antioxidant stress response in C. elegans. 

      By detailing how environmental cues like oxidative stress are transduced into gut-derived peptidergic signals, this study represents a valuable advancement in understanding the integrated physiological responses governed by the gut-brain axis. 

      This work provides valuable mechanistic insights into the gut-specific regulation of the FLP-2 peptide signal. 

      Weaknesses: 

      Although the authors identify intestinal FLP-2 as the endocrine signal important for regulating the secretion of the neuronal antioxidant neuropeptide, FLP-1, there is no effort made to identify how FLP-2 levels regulate FLP-1 secretion or identify whether this regulation is occurring directly through the AIY neuron or indirectly. This is brought up in the discussion, but identifying a target for FLP-2 in this pathway seems like a crucial missing piece of information in characterizing this pathway. 

      We agree that this is an important question. Specifically, identifying the FLP-2 receptor and its site of action is a major priority. Since there are at least four different receptors that have been functionally or physically linked to FLP-2 and there are at least three FLP-2 peptides, unraveling the components acting directly downstream of FLP-2 will require further investigation that we feel is beyond the scope of this current study. We have added a new panel (Fig 1E) addressing the requirements for flp-2 signaling on peroxide production in AIY. These results provide new mechanistic insight into how flp-2 impacts signaling in AIY and a new interpretation of these results has been added to the discussion.

      Reviewer #2 (Public Review): 

      Summary: 

      The core findings demonstrate that the neuropeptide-like protein FLP-2, released from the intestine of C. elegans, is essential for activating the intestinal oxidative stress response. This process is mediated by endogenous hydrogen peroxide (H2O2), which is produced in the mitochondrial matrix by superoxide dismutases SOD-1 and SOD-3. H2O2 facilitates FLP-2 secretion through the activation of protein kinase C family member pkc-2 and the SNAP25 family member aex-4. The study further elucidates that FLP-2 signaling potentiates the release of the antioxidant FLP-1 neuropeptide from neurons, highlighting a bidirectional signaling mechanism between the intestine and the nervous system. 

      Strengths: 

      This study makes a significant contribution to our understanding of the gut-brain axis and the intricate mechanisms underlying its role in the oxidative stress response. By elucidating the role of FLP-2 and its regulation by H2O2, the study provides insights into the molecular basis of inter-tissue communication and antioxidant defense in C. elegans. These findings could have broader implications for understanding similar pathways in more complex organisms, potentially offering new targets for therapeutic intervention in diseases related to oxidative stress and aging. 

      Weaknesses: 

      (1) The experimental techniques employed in the study were somewhat simple and could benefit from the incorporation of more advanced methodologies. 

      Thank you for your comment.

      (2) The weak identification of the key receptors mediating the interaction between FLP-2 and AIY neurons, as well as the receptors in the gut that respond to FLP-1. 

      We agree that this is an important question. Specifically, identifying the FLP-2 receptor and its site of action is a major priority. Since there are at least four different receptors that have been functionally or physically linked to FLP-2 and there are at least three FLP-2 peptides, unraveling the components acting directly downstream of FLP-2 will require further investigation that we feel is beyond the scope of this current study.

      (3) The study could be improved by incorporating a sensor for the direct measurement of hydrogen peroxide levels. 

      We have added a new panel (Fig 1E) addressing the requirements for flp-2 signaling on peroxide production in AIY using the genetically encoded peroxide sensor HyPer7. These results provide new mechanistic insight into how flp-2 impacts signaling in AIY and a new interpretation of these results has been added to the discussion. In addition, we have used HyPer7 to measure peroxide levels in the intestinal mitochondrial matrix and outer membrane (Figs 3, 4, 5, 6).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      The major missing link in the study is how FLP-2 affects FLP-1 release from AIY: is the effect direct and does it require the previously described FLP-2 receptor FRPR-18? This possibility is discussed extensively (L511-528), so it is odd that the effect of an frpr-18 mutation was not tested (or, if it was tested, that the results were not reported). If the authors haven't done this experiment (despite doing many less critical experiments) it would be good to know why. 

      We agree that this is an important question. Specifically, identifying the FLP-2 receptor and its site of action is a major priority. Since there are at least four different receptors that have been functionally or physically linked to FLP-2 and there are at least three FLP-2 peptides, unraveling the components acting directly downstream of FLP-2 will require further investigation that we feel is beyond the scope of this current study. We have added a new panel (Fig 1E) addressing the requirements for flp-2 signaling on peroxide production in AIY. These results provide new mechanistic insight into how flp-2 impacts signaling in AIY and a new interpretation of these results has been added to the discussion.

      Results:

      “To address how flp-2 signaling regulates FLP-1 secretion from AIY, we examined H2O2 levels in AIY using a mitochondrially targeted pH-stable H2O2 sensor HyPer7 (mito-HyPer7, Pak et al. 2020). Mito-HyPer7 adopted a punctate pattern of fluorescence in AIY axons, and the average fluorescence intensity of axonal mito-HyPer7 puncta increased about two-fold following 10-minute juglone treatment (Fig 1E), in agreement with our previous studies using HyPer (Jia and Sieburth 2021), confirming that juglone rapidly increases mitochondrial AIY H2O2 levels. flp-2 mutations had no significant effects on the localization or the average intensity of mito-HyPer7 puncta in AIY axons either in the absence of juglone, or in the presence of juglone (Fig 1E), suggesting that flp-2 signaling promotes FLP-1 secretion by a mechanism that does not increase H2O2 levels in AIY. Consistent with this, intestinal overexpression of flp-2 had no effect on FLP-1::Venus secretion in the absence of juglone, but significantly enhanced the ability of juglone to increase FLP-1 secretion (Fig. 1D). We conclude that both elevated mitochondrial H2O2 levels and intact flp-2 signaling from the intestine are necessary to increase FLP-1 secretion from AIY.”

      More minor comments/suggestions: 

      Line 172: No justification is given as to why the authors chose to focus on flp-2 over the other potential candidates identified in their RNAi screen. 

      We are currently examining the other neuropeptide hits from the screen, but we have no additional phenotypes to report.

      Line 189: An explanation for the use of gDNA as opposed to cDNA should be given. 

      We have changed the text in the Results section as follows:

      “Expressing a flp-2 genomic DNA (gDNA), fragment (containing both the flp-2a and flp-2b isoforms that arise by alternative splicing), specifically in the nervous system failed to rescue the FLP-1::Venus defects of flp-2 mutants, whereas expressing flp-2 selectively in the intestine fully restored juglone-induced FLP1::Venus secretion to flp-2 mutants (Fig. 1D).”

      Line 249-253: nlp-40 and nlp-27 were not implicated in contributing to juglone toxicity in the RNAi screen performed previously by the authors, so it is unclear why both of these peptides are investigated beyond simply being released from the intestine. Confusingly, while Figure S2D shows no overlap between NLP-40 and FLP2, NLP-27 is omitted from the analysis. 

      We have clarified that these peptides are not implicated in stress responses, providing a clearer rationale for why they serve as controls for specificity.

      “Third, nlp-40 and nlp-27 encode neuropeptide-like proteins that are released from the intestine, but are not implicated in stress responses (Liu et al. 2023; Taylor et al. 2021; Wang et al. 2013), and juglone treatment had no detectable effects on coelomocyte fluorescence in animals expressing intestinal NLP-40::Venus or NLP-27::Venus fusion proteins (Fig. S2B and C), and NLP-40::mTur2 puncta did not overlap with FLP-2::Venus puncta in the intestine (Fig. S2D).”

      Line 262: A more detailed description of juglone's mechanism of action would be welcome here. Is juglone expected to act only in intestinal cells, or is its function more pervasive? 

      We have added more detail:

      “Juglone generates superoxide anion radicals (Ahmad and Suzuki 2019; Paulsen and Ljungman 2005) and juglone treatment of C. elegans increases ROS levels (de Castro, Hegi de Castro, and Johnson 2004) likely by promoting the global production of mitochondrial superoxide. Superoxide can then be rapidly converted into H2O2 by superoxide dismutase.”

      Line 414: Justification for why expulsion frequency is used here to quantify NLP-40 secretion is required, particularly because NLP-40::Venus was already used to quantify NLP-40 secretion via the coelomocyte fluorescence method in the experiments contributing to Figure S2. 

      We used expulsion frequency here because (1) it is an easier assay than the coelomocyte assay and (2) it is a functional assay. Defective NLP-40 exocytosis manifests as reduced expulsion frequency; therefore, if NLP-40 secretion is defective in pkc-2 mutants, these mutants should exhibit defects in expulsion frequency.

      We have clarified this point:

      “To determine whether pkc-2 can regulate the intestinal secretion of other peptides that are not associated with oxidative stress, we examined expulsion frequency, which is a measure of NLP-40 secretion (Mahoney et al. 2008; Wang et al. 2013).”

      Line 478: The discussion of neuronally-secreted kisspeptin in this context does not seem relevant as this paper has focused on intestinal peptide secretion. 

      We have removed this sentence:

      In mammals, release of the RF-amide neuropeptide kisspeptin from the anteroventral periventricular nucleus (AVPV) regulates reproduction by inducing the release of gonadotropins via its stimulatory action on GnRH neurons (Han et al. 2005).

      Line 526: DMSR-18 seems to be a typo. Possibly meant FRPR-8, as this is another FLP-2-activated GPCR identified in the screen (though notably, FRPR-8 is only activated by one of the two FLP-2 peptide products). On that note, DMSR-1 has two isoforms, and only one of them is activated by FLP-2 (and only one of the two FLP-2 peptides). This seems relevant to discuss. 

      We have corrected the text and we have added to the discussion the number of FLP-2 peptides:

      “In addition, certain FLP-2-derived peptides (of which there are at least three) can bind to the GPCRs DMSR-1, or FRPR-8 in transfected cells (Beets et al. 2023). Identifying the relevant FLP-2 peptide(s), the FLP-2 receptor and its site of action will help to define the circuit used by intestinal flp-2 to promote FLP-1 release from AIY.” 

      Line 534: An explanation or speculation into why this integration might be necessary would be welcome here. 

      We have edited this paragraph:

      “FLP-1 release from AIY is positively regulated by H2O2 generated from mitochondria (Jia and Sieburth 2021). Here we showed that H2O2-induced FLP-1 release requires intestinal flp-2 signaling. However, flp-2 does not appear to promote FLP-1 secretion by increasing H2O2 levels in AIY (Fig 1E), and flp-2 signaling is not sufficient to promote FLP-1 secretion in the absence of H2O2 (Fig. 1D). These results point to a model whereby at least two conditions must be met in order for AIY to increase FLP-1 secretion: an increase in H2O2 levels in AIY itself, and an increase in flp-2 signaling from the intestine. Thus AIY integrates stress signals from both the nervous system and the intestine to activate the intestinal antioxidant response through FLP-1 secretion. The requirement of signals from multiple tissues for FLP-1 secretion may function to limit the activation of SKN-1, since unregulated SKN-1 activation can be detrimental to organismal health (Turner, Ramos, and Curran 2024).”

      Line 569: Should specify what these candidates are. 

      There are 11 proteins with thioredoxin fold domains. We modified the sentence to list one of them.

      “There are several thioredoxin-domain containing proteins in addition to trx-3 in the C. elegans genome that could be candidates for this role (e.g. trx-5 and others).”

      Line 660: Details about whether the M9 control had an equivalent amount of DMSO as the juglone+M9 condition are required. 

      We have performed toxicity assays and neuropeptide release assays comparing M9, DMSO, and juglone treatment, and we have included these new data in Fig. S1C, D and S2E. Methods: 

      “A stock solution of 50mM juglone in DMSO was freshly made on the same day of liquid toxicity assay. 120μM  working solution of juglone in M9 buffer was prepared using stock solution before treatment. Around 60-80 synchronized adult animals were transferred into a 1.5mL Eppendorf tube with fresh M9 buffer and washed three times, and a final wash was done with either the working solution of juglone with or M9  DMSO at the concentrations present in juglone-treated animals does not contribute to toxicity since DMSO treatment alone caused no significant change in survival compared to M9-treated controls (Fig. S1C).

      For coelomocyte imaging, L4 stage animals were transferred in fresh M9 buffer on a cover slide, washed six times with M9 before being exposed to 300μM juglone in M9 buffer (diluted from freshly made 50mM stock solution), 1mM H2O2 in M9 buffer, or M9 buffer. DMSO at the concentrations present in juglone-treated animals does not alter neuropeptide secretion since DMSO treatment alone caused no significant change in FLP-1::Venus or FLP-2::Venus coelomocyte fluorescence compared to M9-treated controls (Fig. S1D and S2E).”

      Line 1191: Should be FLP-1::Venus in AIY, not the intestine.

      Corrected.

      In general, the significance of reporting in the figures is very unclear. "a, b, c" to report statistical analysis is confusing in the figure legends, and also unnecessary when they denote non-significance. There are some cases where it is reported that a symbol (eg. ***) denotes statistical significance, but there is no indication of what level of statistical significance the symbol represents (for example, in Figures 2C and 2D) 

      Levels of significance are summarized at the end of each figure legend unless indicated for specific symbols (for example, Fig. 1C). We have edited this figure legend: 

      “E Representative images and quantification of fluorescence of matrix-targeted HyPer7 in the axon of AIY following M9 or juglone treatment for 10min. Arrowheads denote puncta marked by MLS::HyPer7 fusion proteins (Excitation: 500 and 400nm; emission: 520nm). Ratio of images taken with 500nm (GFP) and 400nm (CFP) for excitation was used to measure H2O2 levels. Unlined *** and ns denote statistical analysis compared to “wild type”. n = 25, 25, 25, 25 independent animals. Scale bar: 10μm.

      F Representative images and quantification of average fluorescence in the posterior region of transgenic animals expressing Pgst-4::gfp after 4h vehicle M9 or juglone exposure. Asterisks mark the intestinal region used for quantification. Pgst-4::gfp expression in the body wall muscles, which appears as fluorescence on the edge of animals in some images, was not quantified. Unlined *** and ns denote statistical analysis compared to “wild type”; unlined ## and ### denote statistical analysis compared to “wild type+juglone”. n = 25, 26, 25, 25, 25, 25, 25, 25 independent animals. Scale bar: 10μm.”
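
      As an illustration of the ratiometric readout described in the legend above, the following minimal Python sketch computes a 500 nm/400 nm excitation ratio per punctum and the fold change between treated and control animals. The background-subtraction step, the function names, and the example intensities are assumptions made for illustration only; they are not taken from the manuscript's analysis pipeline.

```python
from statistics import mean


def excitation_ratio(f500, f400, background500=0.0, background400=0.0):
    """Return the 500 nm / 400 nm excitation ratio for one punctum.

    Background subtraction is an assumption of this sketch, not a step
    stated in the figure legend.
    """
    signal500 = f500 - background500
    signal400 = f400 - background400
    if signal400 <= 0:
        raise ValueError("400 nm signal must be positive after background subtraction")
    return signal500 / signal400


def fold_change(treated_ratios, control_ratios):
    """Mean ratio of treated puncta divided by mean ratio of control puncta."""
    return mean(treated_ratios) / mean(control_ratios)


# Hypothetical intensities (arbitrary units) given as (F500, F400) pairs.
control = [excitation_ratio(a, b) for a, b in [(100, 100), (120, 100), (80, 100)]]
treated = [excitation_ratio(a, b) for a, b in [(200, 100), (220, 100), (180, 100)]]
juglone_fold = fold_change(treated, control)
```

      With these made-up values the fold change is about two, which mirrors the order of magnitude reported for juglone treatment but is not a reproduction of the measured data.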

      Figure 2C: It is unclear which conditions have H2O2 treatment (as described in the legend). There is also no mention of what ### indicates. 

      The level of significance for ### is summarized at the end of the legend. No H2O2 treatment was performed in this assay. We have edited this figure legend: 

      “C. Representative images and quantification of average coelomocyte fluorescence of the indicated mutants expressing FLP-2::Venus fusion proteins in the intestine following M9 or juglone treatment for 10min. Unlined *** and ns denote statistical analysis compared to “wild type”. n = 29, 25, 24, 30, 23, 30, 25, 25, 25 independent animals. Scale bar: 5μm.”

      Figure 2D: It is not previously mentioned that M9 condition contains DMSO, as implied by the legend. 

      We have edited this figure legend:

      “D. Quantification of average coelomocyte fluorescence of transgenic animals expressing FLP-2::Venus fusion proteins in the intestine following treatment of fresh M9 buffer or the indicated stressors for 10min. Unlined *** denotes statistical analysis compared to “M9”. n = 23, 25, 25 independent animals.”  

      Figure 3J: The y-axis label should more clearly describe the ratio being measured. 

      We have updated the panel and this figure legend: 

      “J. Schematic, representative images and quantification of fluorescence in the posterior region of the indicated transgenic animals co-expressing mitochondrial matrix targeted HyPer7 (matrix-HyPer7) or mitochondrial outer membrane targeted HyPer7 (OMM-HyPer7) with TOMM-20::mCherry following M9, juglone, or H2O2 treatment. Ratio of images taken with 500nm (GFP) and 400nm (CFP) for excitation and 520nm for emission was used to measure H2O2 levels. Unlined *** and ns denote statistical analysis compared to “wild type”; unlined ## denotes statistical analysis compared to “wild type+juglone”. (top) n = 20, 20, 18, 20, 19, 19, 20, 20 independent animals.

      (bottom) n = 20, 20, 19, 20, 20, 20, 20, 20 independent animals. Scale bar: 5μm.” 

      Figure S3A: *** is mislabelled. It should be a comparison to wildtype. 

      We have edited this figure legend: 

      “A. Quantification of average coelomocyte fluorescence of the indicated mutants expressing FLP-2::Venus fusion proteins in the intestine following M9 or juglone treatment for 10min. Unlined *** denotes statistical analysis compared to “wild type”; ### and ns denote statistical analysis compared to “wild type+juglone”. n = 29, 27, 29, 27, 25, 26, 24 independent animals.”  

      Reviewer #2 (Recommendations For The Authors): 

      (1) The localization experiments could benefit from the application of ultra-high-resolution fluorescence microscopy. This would allow for a more detailed analysis of the spatial distribution of SOD-1/3::GFP in relation to mitochondria-targeted TOMM-20::mCherry fusion proteins in the posterior intestinal region of transgenic animals. 

      We agree that high-resolution microscopy would be a great way to more precisely localize SOD proteins relative to the mitochondria, and this would enhance understanding of the source of peroxide in this system. We do not conduct this type of microscopy in the lab, so this approach would require a collaboration with a lab that is set up for this. Thus we feel that this is beyond the scope of the current study.  

      (2) The paper may note the challenge of directly measuring mitochondrial H2O2 concentrations. However, advancements in chemical or fluorescent sensors for H2O2 detection within mitochondria could provide more direct evidence of its role in FLP-2 secretion. 

      We have considered using chemical sensors, but many are either not efficiently taken up by worms (the skin is largely impermeable to all but the most hydrophobic molecules), or they would label peroxide indiscriminately in all tissues making detection specifically in the intestine challenging. We have had good luck with genetically encoded peroxide sensors since they provide tissue specificity and good spatial resolution depending on where we target them. We have added imaging results for HyPer7 in the AIY neuron to Figure 1E. 

      Results:

      “To address how flp-2 signaling regulates FLP-1 secretion from AIY, we examined H2O2 levels in AIY using a mitochondrially targeted pH-stable H2O2 sensor HyPer7 (mito-HyPer7, Pak et al. 2020). Mito-HyPer7 adopted a punctate pattern of fluorescence in AIY axons, and the average fluorescence intensity of axonal mito-HyPer7 puncta increased about two-fold following 10 minute juglone treatment (Fig 1E), in agreement with our previous studies using HyPer (Jia and Sieburth 2021), confirming that juglone rapidly increases mitochondrial AIY H2O2 levels. flp-2 mutations had no significant effects on the localization or the average intensity of mito-HyPer7 puncta in AIY axons either in the absence of juglone, or in the presence of juglone (Fig 1E), suggesting that flp-2 signaling promotes FLP-1 secretion by a mechanism that does not increase H2O2 levels in AIY. Consistent with this, intestinal overexpression of flp-2 had no effect on FLP-1::Venus secretion in the absence of juglone, but significantly enhanced the ability of juglone to increase FLP-1 secretion (Fig. 1D). We conclude that both elevated mitochondrial H2O2 levels and intact flp-2 signaling from the intestine are necessary to increase FLP-1 secretion from AIY.” 

      (3) To confirm the activation of AIY neurons by FLP-2, measuring calcium activity in these neurons may be a robust approach. It would be beneficial to determine if synthetic FLP-2 can activate AIY neurons and subsequently induce an intestinal antioxidant response. 

      This is a great idea. We have begun to examine GCaMP fluorescence in AIY and we see responses to oxidative stressors. We think that this data is too preliminary at the moment to include here.  

      (4) The identification of the key receptors mediating the interaction between FLP-2 and AIY neurons, as well as the receptors in the gut that respond to FLP-1, would complete the signaling pathway and strengthen the study's conclusions. 

      We agree that this is an important question. Specifically, identifying the FLP-2 receptor and its site of action is a major priority. Since there are at least four different receptors that have been functionally or physically linked to FLP-2 and there are at least three FLP-2 peptides, unraveling the components acting directly downstream of FLP-2 will require further investigation that we feel is beyond the scope of this current study.  

      (5) Investigating whether direct manipulation of AIY neurons, through methods such as optogenetic activation or inhibition, can trigger the gut's antioxidant response would provide insight into the functional relevance of this neuronal activity. 

      Also an excellent idea. We previously published that Channelrhodopsin activation specifically in AIY indeed increases FLP-1 secretion, but we have not yet examined its effects on antioxidant responses in the intestine.  This may require a more sustained activation of AIY than Channelrhodopsin can provide.

      (6) For the analysis of intestinal Pges-1::GFP fluorescence, specifying the region of interest would enhance the precision of the data and the reproducibility of the results. 

      We analyze the fluorescence intensity of a 16-pixel diameter circle in the posterior intestine (as indicated by the asterisks), and we have added this to the Methods. We edited this paragraph:

      “For transcriptional reporter imaging, young adult animals with the indicated genotype were transferred into a 1.5mL Eppendorf tube with M9 buffer, washed three times and incubated in M9 buffer or 60μM working solution of juglone for 1h in the dark on a rotating mixer before recovering on fresh NGM plates with OP50 for 3h in the dark at 20°C. The posterior end of the intestine was imaged with the 60x objective, and the average fluorescence intensity of a 16-pixel diameter circle in the posterior intestine was calculated using Metamorph.”
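
      The region-of-interest measurement described above can be sketched in a few lines of Python. This is only an illustration of averaging pixel intensities inside a 16-pixel diameter circle; it is not the Metamorph procedure itself, and the function name, the nested-list image representation, and the synthetic image are assumptions made for the example.

```python
def mean_intensity_in_circle(image, center_row, center_col, diameter_px=16):
    """Average the pixels whose centers fall inside a circular ROI.

    `image` is a list of rows, each a list of pixel intensities.
    """
    radius_sq = (diameter_px / 2.0) ** 2
    values = [
        pixel
        for r, row in enumerate(image)
        for c, pixel in enumerate(row)
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_sq
    ]
    if not values:
        raise ValueError("circular ROI does not overlap the image")
    return sum(values) / len(values)


# Synthetic 64 x 64 image: dim background (10) with a bright 16 x 16 patch (50).
image = [[10.0] * 64 for _ in range(64)]
for r in range(24, 40):
    for c in range(24, 40):
        image[r][c] = 50.0

# ROI centered on the patch, so every pixel inside the circle is bright.
roi_mean = mean_intensity_in_circle(image, 31.5, 31.5)
# ROI placed in the background for comparison.
background_mean = mean_intensity_in_circle(image, 8.0, 8.0)
```

      Fixing the diameter in pixels, as in the Methods, keeps the sampled area constant across animals, so differences in the mean reflect reporter intensity rather than ROI size.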

      (7) Assessing the potential for pharmacological modulation of FLP-2 or H2O2 levels could provide valuable insights into therapeutic strategies aimed at enhancing the oxidative stress response. 

      Agreed.

      (8) For improved clarity, it is suggested that the schematic currently presented in Figure S1A be integrated into Figure 2C, as this would facilitate the reader's comprehension of the experimental design and findings. 

      Moved.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Choi and co-authors presents "P3 editing", which leverages dual-component guide RNAs (gRNA) to induce protein-protein proximity. They explore three strategies for leveraging prime-editing gRNA (pegRNA) as a dimerization module to create a molecular proximity sensor that drives genome editing: splitting a pegRNA into two parts (sgRNA and petRNA), inserting self-splicing ribozymes within pegRNA, and dividing pegRNA at the crRNA junction. Among these, splitting at the crRNA junction proved the most promising, achieving significant editing efficiency. They further demonstrated the ability to control genome editing via protein-protein interactions and small molecule inducers by designing RNA-based systems that form active gRNA complexes. This approach was also adaptable to other genome editing methods like base editing and ADAR-based RNA editing.

      Strengths:

      The study demonstrates significant advancements in leveraging guide RNA (gRNA) as a dimerization module for genome editing, showcasing its high specificity and versatility. By investigating three distinct strategies (splitting pegRNA into sgRNA and petRNA, inserting self-splicing ribozymes within the pegRNA, and dividing the pegRNA at the repeat junction), the researchers present a comprehensive approach to achieving molecular proximity and reconstituting function. Among these methods, splitting the pegRNA at the repeat junction emerged as the most promising, achieving editing efficiencies up to 76% of the control, highlighting its potential for further development in CRISPR-Cas9 systems. Additionally, the study extends genome editing control by linking protein-protein interactions to RNA-mediated editing, using specific protein-RNA interaction pairs to regulate editing through engineered protein proximity. This innovative approach expands the toolkit for precision genome editing, demonstrating the feasibility of controlling genome editing with enhanced specificity and efficiency.

      Weaknesses:

      The initial experiments with splitting the pegRNA into sgRNA and petRNA showed low editing efficiency, less than 2%. Similarly, inserting self-splicing ribozymes within pegRNA was inefficient, achieving under 2% editing efficiency in all constructs tested, possibly hindered by the prime editing enzyme. The editing efficiency of the crRNA and petracrRNA split at the repeat junction varied, with the most promising configurations only reaching 76% of the control efficiency. The RNA-RNA duplex formation's inefficiency might be due to the lack of additional protein binding, leading to potential degradation outside the Cas9-gRNA complex. Extending the approach to control genome editing via protein-protein interactions introduced complexity, with a significant trade-off between efficiency and specificity, necessitating further optimization. The strategy combining RADARS and P3 editing to control genome editing with specific RNA expression events exhibited high background levels of non-specific editing, indicating the need for improved specificity and reduced leaky expression. Moreover, P3 editing efficiencies are exclusively quantified after transfecting DNA into HEK cells, a strategy that has resulted in past reproducibility concerns for other technologies. Overall, the various methods and combinations require further optimization to enhance efficiency and specificity, especially when integrating multiple synthetic modules.

      Thank you for this accurate summary and assessment of the strengths and weaknesses of P3 editing as it stands. Looking ahead, we agree that further optimizations will be important, as will characterizing the performance of P3 editing in additional cellular contexts. The revised Discussion (see below) now makes these points more clearly.

      Reviewer #2 (Public Review):

      Choi et al. describe a new approach for enabling input-specific CRISPR-based genome editing in cultured cells. While CRISPR-Cas9 is a broadly applied system across all of biology, one limitation is the difficulty in inducing genome editing based on cellular events. A prior study, from the same group, developed ENGRAM - which relies on activity-dependent transcription of a prime editing guide RNA, which records a specific cellular event as a given edit in a target DNA "tape". However, this approach is limited to the detection of induced transcription and does not enable the detection of broader molecular events including protein-protein interactions or exposure to small molecules. As an alternative, this study envisioned engineering the reconstitution of a split prime editing guide RNA (pegRNA) in a protein-protein interaction (PPI)-dependent manner. This would enable location- and content-specific genome editing in a controlled setting.

      The authors explored three different design possibilities for engineering a PPI-dependent split pegRNA. First, they tried splitting pegRNA into a functional sgRNA and corresponding prime editing transRNA, incorporating reverse-complementary dimerization sequences on each guide half. This approach, however, resulted in low editing efficiency across 7 different designs with various complementary annealing template lengths (<2% efficiency). They also tried inserting a self-splicing ribozyme within the pegRNA, which produces a functional pegRNA post-transcriptionally. The incorporation of a split-ribozyme, dependent on a PPI, could have been used to reconstitute the split pegRNA in an event-controlled manner. However again, only modest levels of editing were observed with the self-splicing ribozyme design (<2%). Finally, they tried splitting the pegRNA at the repeat:anti-repeat junction that was used to join the original dual-guide system comprised of a crRNA and tracrRNA, into a single-guide RNA. They incorporated the prime editing features into the tracrRNA half, to create petracrRNA. Dimerization was initially induced by different complementary RNA annealing sequences. Using this design, they were able to induce an editing efficiency of ~28% (compared to 37% efficiency using a positive control epegRNA guide).

      Having identified a suitable split pegRNA system, they next sought to induce the reconstitution of the two halves in a PPI-dependent manner. They replaced the complementary RNA annealing sequences with two different RNA aptamers (MS2 and BoxB). MS2 detects the MCP protein, while BoxB detects the LambdaN protein. Close proximity between MCP and LambdaN would thus bring together the two split pegRNA halves, creating a functional pegRNA that would enable prime editing at a specific target site. They demonstrated that they could induce MCP-BoxB proximity by fusing them to different dimerizing protein partners: 1) constitutive epitope-nanobody/antibody pairs such as scFv/GCN4 or NbALFA/ALFA-Tag; 2) split-GFP; or 3) chemically-induced protein pairs such as FKBP/FRB or ABI/PYL. For all of these approaches, they could achieve between ~20-60% normalized editing efficiency (relative to positive control editing levels with epegRNA). Additional mutation of the linkers between the RNA and aptamers could increase editing efficiency but also increase non-specific background editing even in the absence of an induced PPI.

      Additional applications of this overall strategy included incorporating the design with different DNA base editors, with the most promising examples shown with the base editors CBE4max and ABE8. It should be noted that these specific examples used a non-physiological LambdaN-MCP direct fusion protein as the "bait" that induced reconstitution of the two halves of the guide RNA, rather than relying on a true induced PPI. They also demonstrated that the recently reported RADARS strategy could be incorporated into their system. In this example, they used an ADAR-guide-RNA to drive the expression of a LambdaN-PCP fusion protein in the presence of a specific target RNA molecule, IL6. This induced LambdaN-PCP protein could then reconstitute the split pegRNAs to drive prime editing. To enable this last application, they replaced the MS2 aptamer in their pegRNA with the PP7 aptamer that binds the PCP protein (this was to avoid crosstalk with RADARS, which also uses MS2/MCP interaction). Using this strategy, they observed a normalized editing efficiency of around 12% (but observed non-specific editing of around 8% in the absence of the target RNA).
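
      The normalization used in this summary can be made explicit with a short calculation. The Python sketch below divides an observed editing rate by the positive-control rate and compares induced editing with background editing. The function names are illustrative, and treating the 76% figure quoted by Reviewer #1 as the ratio of roughly 28% to 37% is an assumption based on the numbers cited in this review, not a statement of how the authors computed it.

```python
def normalized_efficiency(edit_pct, control_pct):
    """Editing efficiency expressed as a percentage of the positive control."""
    if control_pct <= 0:
        raise ValueError("control editing rate must be positive")
    return 100.0 * edit_pct / control_pct


def fold_over_background(induced_pct, background_pct):
    """Ratio of editing with the trigger present to editing without it."""
    if background_pct <= 0:
        raise ValueError("background editing rate must be positive")
    return induced_pct / background_pct


# Split crRNA/petracrRNA design: ~28% editing versus ~37% for the epegRNA control.
split_vs_control = normalized_efficiency(28.0, 37.0)

# RADARS-coupled design: ~12% normalized editing with target RNA, ~8% without it.
radars_fold = fold_over_background(12.0, 8.0)
```

      Reporting both quantities separates the two concerns raised in the reviews: the first number tracks efficiency relative to an unsplit guide, while the second tracks specificity, which stays low when background editing is high.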

      Strengths:

      The strengths of this paper include an interesting concept for engineering guide RNAs to enable activity-dependent genome editing in living cells in the future, based on discrete protein-protein interactions (either constitutively, spatially, or chemically induced). Important groundwork is laid down to engineer and improve these guide RNAs in the future (especially the work describing altering the linkers in Supplementary Figure 3 - which provides a path forward).

      Weaknesses:

      In its current state, the editing efficiency appears too low to be applied in physiological settings. Much of the latter work in the paper relies on a LambdaN-MCP direct fusion protein, rather than two interacting protein pairs. Further characterizations in the future, especially varying the transfection amounts/durations/etc of the various components of the system, would be beneficial to improve the system. It will also be important to demonstrate editing at additional sites; to characterize how long the PPI must be active to enable efficient prime editing; and how reversible the reconstitution of the split pegRNA is.

      Thank you for this assessment of the strengths and weaknesses of P3 editing as it stands. Looking ahead, we agree that further optimizations will be important, including along the lines suggested by the reviewer, as will further characterization of the system with respect to dependencies, reversibility, etc. The revised Discussion (see below) now makes these points more clearly.

      Recommendations for the authors:

      Reviewing Editor comments:

      It would be helpful to better describe the nature of improvements (on-targeting and/or off-targeting) that would be needed to effectively use this approach in in vitro and in vivo applications.

      We agree, and have accordingly revised the last paragraph of our discussion to better describe what improvements are needed for in vitro and in vivo applications:

      “In our view, there are four outstanding challenges for P3 editing to be broadly useful: evaluating additional cellular contexts, the method’s efficiency and specificity, understanding the limit of detectable protein-protein interactions, and the development of sensors compatible with multiplex P3 editing within the same cell. First, we have thus far only conducted P3 editing in HEK293T cells, and it obviously needs to be tested in additional cell types. Second, both the efficiency and specificity of P3 editing need to be improved before it can be used as a selective editing tool in model systems. We have explored how modifying the crRNA and petracrRNA pair sequences can tune the efficiency-vs-specificity tradeoff, but alternative avenues to improvement (e.g., better docking of RNA-aptamers such as MS2, BoxB, or PP7 by testing more linker sequences that place crRNA and petracrRNA for duplex formation) may be more fruitful in terms of achieving high efficiency and specificity at once (e.g., >50% editing in the setting of a specific protein-protein interaction, and <1% editing without it). Third, it is not clear whether weak and transient interactions among proteins can be used to trigger P3 editing. Assuming the genome editing complex formation is reversible, improving P3 editing efficiency may be able to capture different strengths of protein-protein interactions, although some interactions may be too transient to promote functional guide RNA formation. Finally, the current P3 editing design uses a pair of RNA aptamers and their corresponding protein binders, limiting the multiplex detection of protein-protein pairs. More orthogonal protein-RNA pairs need to be identified (e.g., using a massively parallel platform (Buenrostro et al., 2014) and/or computational prediction (Baek et al., 2023)) to allow for large numbers of P3 sensors for different protein-protein interactions to be deployed within the same cell. Overcoming these four challenges is necessary for P3 editing to be broadly useful for gating genome editing on physiological levels of specific protein-protein interactions in a multiplex fashion.”

      Reviewer #2 (Recommendations For The Authors):

      It does not appear that all plasmids necessary to reproduce the results of this paper have been deposited to addgene, but only a small subset. The authors might include that these plasmids are available upon request, if not uploaded to a public repository.

      We have added a statement that additional plasmids are available upon request. Our Data Availability Statement reads (with the added sentence underlined):

      “Raw sequencing data have been uploaded to the Sequence Read Archive (SRA) with the associated BioProject ID PRJNA1004865. The following plasmids have been deposited to Addgene: pU6-crRNA-MS2, pU6-BoxB-petracrRNA, pCMV-LambdaN-MCP, pCMV-LambdaN-NbALFA, and pCMV-ALFA-MCP (Addgene ID 207624 - 207628). The rest of the plasmids used in this study are available upon request.”

      It could be useful to include somewhere why, specifically, editing the guide RNAs as opposed to the Cas9 itself is advantageous. Light-inducible split Cas9s have been engineered, and I imagine other PPI-inducible split Cas9s have also been engineered. A specific mention of the advantages of using engineered split pegRNAs could put the significance of this work in a better context.

      Thanks for raising this, and we agree. We have revised the first paragraph of the Results section to highlight why we think splitting the guide RNAs as opposed to Cas9 might be advantageous:

      “In the split architecture, the “dimerization module” is a key sensor component. Although strategies that split the protein component of the genome editing complex have been described (e.g., split-Cas9 (Yu et al., 2020)), we reasoned that having the guide RNA serve as the dimerization module rather than the protein, i.e. by splitting it into two parts, and making the restoration of its function dependent on a molecular proximity event, would afford even more control. For example, if multiple split gRNAs were present within the same cell, they could be independently controlled, whereas a split Cas9 would only allow a single control point.  In our initial experiments, we focused on splitting the pegRNA used in prime editing.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors test the "OHC-fluid-pump" hypothesis by assaying the rates of kainic acid dispersal both in quiet and in cochleae stimulated by sounds of different levels and spectral content. The main result is that sound (and thus, presumably, OHC contractions and expansions) results in faster transport along the duct. OHC involvement is corroborated using salicylate, which yielded results similar to silence. Especially interesting is the fact that some stimuli (e.g. tones) seem to provide better/faster pumping than others (e.g. noise), ostensibly due to the phase profile of the resulting cochlear traveling-wave response.

      Strengths:

      The experiments appear well controlled and the results are novel and interesting. Some elegant cochlear modeling that includes coupling between the organ of Corti and the surrounding fluid as well as advective flow supports the proposed mechanism.

      Weaknesses:

      It's not clear whether the effect size (e.g., the speed of sound-induced pumping relative to silence) is large enough to have important practical applications (e.g., for drug delivery). The authors should comment on the practical requirements and limitations.

      With our current data, what we can conclude is that modest sound levels (e.g., 75 dB SPL noise or an 80 dB SPL tone) facilitate cochlear drug delivery. We added a paragraph to the Discussion stating some future considerations for application to drug delivery in the human cochlea.

      Although helpful so far as it goes, the modeling could be taken much further to help understand some of the more interesting aspects of the data and to obtain testable predictions. In particular, the authors should systematically explore the level effects they find experimentally and determine whether the model can replicate the finding that different sounds produce different results (e.g. noise vs tone).

      The model should also be used to relate the model's flow rates more quantitatively to the properties of the traveling wave (e.g., its phase profile).

      The present study is focused on explaining the principle of mass transport in the cochlea. The quantification of the relationship between flow rate and traveling wave is an important open question and will be the topic of future studies. Our previous modeling study (Shokrian et al. 2020) showed a clear relation between the traveling wave characteristics (e.g., amplitude and phase velocity) and the mass transport in the Corti fluid. As the reviewer correctly pointed out, the current paper is focused on designing controlled experiments to provide proof of concept along with computational simulations to support our major claim (that outer hair cells stir cochlear fluid). 

      Finally, the model should be used to investigate differences between active and passive OHCs (e.g., simulating the salicylate experiment by disabling the model's OHCs).

      What the reviewer asks for has been demonstrated in previous theoretical studies (Lighthill, 1992; Edom, Obrist, Kleiser, 2014; Sumner, Reichenbach, 2021). In some of these studies, the phenomenon was called steady streaming. These studies are excellent examples because they simulated the sensitive cochlea (similar level of basilar membrane vibrations) but did not incorporate the Corti fluid peristalsis. Even without the peristaltic motion of the Corti tube, the basilar membrane-scala fluid interaction generated steady streaming (creeping fluid flow). However, the streaming velocity of cochlear models without active peristalsis along the Corti tube is about three orders of magnitude smaller than that of the active cochlea at a comparable level of basilar membrane vibrations. For example, the peak streaming speed was < 0.1 µm/s at 80 dB SPL, and it took > 4 hours for particles to travel 1 mm. This speed is much slower than the particle transport speed due to pure diffusion (Sumner, Reichenbach, 2021).
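
      The comparison between passive streaming and pure diffusion over 1 mm can be checked with a back-of-envelope calculation. The sketch below is only an illustration, not the model used in the study. The diffusivity value is an assumption (a generic small-molecule value in water) and the function names are hypothetical.

```python
# Back-of-envelope comparison of advective and diffusive travel times.
# ASSUMPTION: D = 1e-9 m^2/s is a generic small-molecule diffusivity in water,
# chosen for illustration; it is not a value taken from the study.

def advection_time_s(distance_m: float, speed_m_per_s: float) -> float:
    """Time to cover a distance at a constant streaming speed."""
    return distance_m / speed_m_per_s


def diffusion_time_s(distance_m: float, diffusivity_m2_per_s: float) -> float:
    """Characteristic 1-D diffusion time, t = x^2 / (2 D)."""
    return distance_m ** 2 / (2.0 * diffusivity_m2_per_s)


DISTANCE_M = 1e-3                # 1 mm, as in the text
STREAMING_SPEED_M_S = 0.1e-6     # 0.1 um/s, the upper bound quoted in the text
ASSUMED_D_M2_S = 1e-9            # assumed diffusivity (see note above)

t_adv = advection_time_s(DISTANCE_M, STREAMING_SPEED_M_S)
t_diff = diffusion_time_s(DISTANCE_M, ASSUMED_D_M2_S)

print(f"advection at 0.1 um/s: {t_adv / 3600:.1f} h")
print(f"diffusion (assumed D): {t_diff / 60:.1f} min")
```

      Under these assumptions, streaming at exactly the 0.1 µm/s bound needs a few hours to cover 1 mm, and any slower speed needs longer, whereas the characteristic diffusion time over the same distance is a matter of minutes. The exact figures depend on the assumed diffusivity.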

      The manuscript would be stronger if the authors discussed ways to test their hypothesis that OHC motility serves a protective effect by pumping fluid. For example, do animals held in quiet after noise exposure (TTS) take longer to recover?

      We agree with the reviewer. The following statements were added to the Discussion section. “Our results have implications for cochlear fluid homeostasis. For example, future studies can test the hypothesis that an acoustically rich environment would be beneficial in maintaining healthy hearing as well as in recovering from transient hearing loss.”

      Reviewer #2 (Public review):

      Summary:

      Recent cochlear micromechanical measurements in living animals demonstrated outer hair cell-driven broadband vibration of the reticular lamina that contradicts frequency-selective cochlear amplification. The authors hypothesized that motile outer hair cells can drive cochlear fluid circulation. This hypothesis was tested by observing the effects of acoustic stimuli and salicylate, an outer hair cell motility blocker, on kainic acid-induced changes in the cochlear nucleus activities. It was found that acoustic stimuli can reduce the latency of the kainic acid effect, and a low-frequency tone is more effective than broadband noise. Salicylate reduced the effect of acoustic stimuli on kainic acid-induced changes. The authors also developed a computational model to provide the physical basis for interpreting experimental results. It was concluded that experimental data and simulations coherently indicate that broadband outer hair cell action is for cochlear fluid circulation.

      Strengths:

      The major strengths of this study include its high significance and the combination of electrophysiological recording of the cochlear nucleus responses with computational modeling. Cochlear outer hair cells have been believed to be responsible for the exceptional sensitivity, sharp tuning, and huge dynamic range of mammalian hearing. Recent observation of the broadband reticular lamina vibration contradicts frequency-specific cochlear amplification. Moreover, there is no effective noninvasive approach to deliver the drugs or genes to the cochlea for treating sensorineural hearing loss, one of the most common auditory disorders. These important questions were addressed in this study by observing outer hair cells' roles in the cochlear transport of kainic acid. The well-established electrophysiological method for recording cochlear nucleus responses produced valuable new data, and the purposely developed computational model significantly enhanced the interpretation of the data.

      The authors successfully tested their hypothesis, and both the experimental and modeling results support the conclusion that active outer hair cells can drive cochlear fluid circulation in the living cochlea.

      Findings from this study will help auditory scientists understand how the outer hair cells contribute to cochlear amplification and normal hearing.

      We thank the reviewer for acknowledging our effort.

      Weaknesses:

      While the statement "The present study provides new insights into the nonselective outer hair cell action (in the second paragraph of Discussion)" is well supported by the results, the authors should consider providing a prediction or speculation of how this hair cell action enhances cochlear sensitivity. Such discussion would help the readers better understand the significance of the current work.

      We added a potential implication to the Discussion, that an acoustically rich environment could be beneficial in maintaining healthy hearing as well as recovering from damaged hearing.

      Reviewer #3 (Public review):

      Summary:

      This study reveals that sound exposure enhances drug delivery to the cochlea through the nonselective action of outer hair cells. The efficiency of sound-facilitated drug delivery is reduced when outer hair cell motility is inhibited. Additionally, low-frequency tones were found to be more effective than broadband noise for targeting substances to the cochlear apex. Computational model simulations support these findings.

      Strengths:

      The study provides compelling evidence that the broad action of outer hair cells is crucial for cochlear fluid circulation, offering a novel perspective on their function beyond frequency-selective amplification. Furthermore, these results could offer potential strategies for targeting and optimizing drug delivery throughout the cochlear spiral.

      Weaknesses:

      The primary weakness of this paper lies in the surgical procedure used for drug administration through the round window. Opening the cochlea can alter intracochlear pressure and disrupt the traveling wave from sound, a key factor influencing outer hair cell activity. However, the authors do not provide sufficient details on how they managed this issue during surgery. Additionally, the introduction section needs further development to better explain the background and emphasize the significance of the work.

      Although we wrote that the inner ear was left intact, this might not have been sufficiently clear. Our surgical approach leaves the inner ear intact, including the round-window membrane. The round window in gerbil is concave like a bowl. We applied 4 µL of kainic acid solution in the round-window niche, without perforating the round-window membrane.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations for the authors):

      The authors' choice to frame their findings by hinting that they have discovered the "real" reason for the evolution of broadband OHC electromotility (e.g., the first and last sentences of the abstract and parts of the Discussion), although clearly intended to boost the perceived significance of the work, does them no favors and will probably lead to distracting criticisms they could easily have avoided. The manuscript would be significantly improved by removing or downplaying these rather speculative and unsupported claims; the work stands on its own without them.

      We agree that the first line of the Abstract might distract the readers. Meanwhile, in the Discussion, we believe the readers will appreciate our speculation of how this study is relevant to recent debates on hearing mechanics. Following the reviewer’s advice, we have revised the Abstract.

      Reviewer #3 (Recommendations for the authors):

      Please review the detailed comments below. I hope they contribute to enhancing the paper:

      We thank the reviewer for this detailed advice. All of these comments make good sense and were very helpful in improving this paper or in planning future studies. 

      Many of the comments were relevant to the computer model, and they have one common basis, which we have not yet achieved: simulating the level dependence.

      I. Introduction

      (1) Please clarify and improve this sentence. Effective and safe strategies for delivering treatments to the inner ear have been reported: 'Consequently, intervening in hearing health by delivering substances to the inner-ear fluid is challenging'.

      The preceding statement is regarding the blood-labyrinthine barrier (BLB), comparable to the blood-brain barrier (BBB). We revised the statement: “Consequently, intervening in hearing health by delivering substances to the inner-ear fluid through systemic circulation is challenging.”

      (2) Please expand on how the secretion and absorption of ions and molecules maintain the unique ionic compositions of the two intracochlear fluids. Include details on the role of the stria vascularis and the specific functions of the three types of strial cells in this process.

      In response to this request, we added a paragraph discussing cochlear fluid homeostasis. Our study differs from existing homeostasis studies in three respects. First, the site: Existing studies are centered on the stria vascularis, while this study concerns the Corti fluid. Second, the mechanism: Existing studies concern metabolic transport, while our scope is the transport due to fluid flow. Third, the range: Existing studies considered local electrochemical equilibrium within a radial section, while this study concerns global (longitudinal) mass transport. To address this comment, the following was added to the Discussion.

      “Our study complements existing studies regarding cochlear fluid homeostasis and differs from previous studies in several ways. The intrastrial fluids (extracellular fluids in the stria vascularis) have been more thoroughly investigated because the three layers in the stria vascularis (marginal, intermediate, and basal cells) maintain the endocochlear potential (Wangemann 2006).

      Equilibrium in the Corti fluid has been sparsely investigated because its electrochemical gradient is modest compared to that of the intrastrial fluids (Johnstone, Patuzzi et al. 1989; Zidanic and Brownell 1990). Local electrochemical balance in the cochlear fluids has been considered within a radial section (Quraishi and Raphael 2008; Patuzzi 2011; Nin, Hibino et al. 2012). Our study is focused on the longitudinal (global) equilibrium along the cochlear coil and did not consider the equilibrium across the stria vascularis cell layers. To examine whether the longitudinal fluid flow driven by outer hair cells is strong enough to affect cochlear fluid homeostasis, future studies should measure the K+ equilibrium and recycling along the length of the Corti fluid under sound and silence conditions.”

      (3) Please provide a more detailed explanation and definition of a longitudinal electrochemical gradient, including how it functions and its relevance in physiological processes.

      The most thoroughly studied electrochemical gradient of the cochlea is probably the endocochlear potential, which varies along the cochlear length. The endocochlear potential at any location is determined by the equilibrium between the source and the sink. From the viewpoint of the Corti fluid, the source is the potassium current out of the hair cells and the sink is the resorption of potassium by supporting cells. The effect of a longitudinal electrochemical gradient on hearing physiology is beyond the scope of this study; addressing it would require incorporating detailed K+ equilibrium dynamics. This certainly is one of our future directions.

      (4) Please include the necessary references to support these three sentences: "Diffusion is an effective mechanism for a substance to travel along submicrometer distances. For instance, it takes microseconds for neurotransmitters to diffuse across a 20-nm synaptic gap. In contrast, diffusion is inefficient for travel on the centimeter scale. It takes days for a drug applied at the round window to travel 30 mm to the apical end of the human cochlea. In practice, the substance would not reach the apex because it would be resorbed before traveling the distance".   

      A reference was added (Berg, 1993). Our description of diffusion is based on the fundamental physics of Fick’s laws.
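
      The scale dependence described in the quoted sentences follows from the quadratic relation between distance and diffusion time, t = x²/(2D) in one dimension. The sketch below is a rough illustration only. Both diffusivities are assumed order-of-magnitude values, not numbers taken from the manuscript, and the variable names are hypothetical.

```python
# Characteristic 1-D diffusion time from Fick's laws: t = x^2 / (2 D).
# ASSUMPTION: both diffusivities below are illustrative order-of-magnitude
# values, not numbers reported in the manuscript.

def diffusion_time_s(distance_m: float, diffusivity_m2_per_s: float) -> float:
    """Time for the RMS displacement of a diffusing particle to reach distance_m."""
    return distance_m ** 2 / (2.0 * diffusivity_m2_per_s)


SYNAPTIC_GAP_M = 20e-9        # 20 nm, from the text
COCHLEA_LENGTH_M = 30e-3      # 30 mm, from the text
ASSUMED_D_TRANSMITTER = 3e-10  # m^2/s, assumed for a neurotransmitter
ASSUMED_D_DRUG = 1e-9          # m^2/s, assumed for a small-molecule drug

t_gap = diffusion_time_s(SYNAPTIC_GAP_M, ASSUMED_D_TRANSMITTER)
t_cochlea = diffusion_time_s(COCHLEA_LENGTH_M, ASSUMED_D_DRUG)

print(f"synaptic gap:   {t_gap * 1e6:.2f} microseconds")
print(f"cochlear length: {t_cochlea / 86400:.1f} days")
```

      With these assumed diffusivities, the synaptic gap is crossed in about a microsecond or less, while the cochlear length takes several days. Because the time grows with the square of the distance, a millionfold increase in distance costs roughly a trillionfold increase in time.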

      (5) In paragraph 3, the author only discussed a portion of the previous approaches. There are numerous methods for inner ear delivery, including external, middle ear, and direct inner ear delivery via the round window or semicircular canal. Each method has its pros and cons, which the authors should carefully address. For example, the semicircular canal approach doesn't require two perforations in the inner ear and distributes the injection evenly throughout the cochlea.  

      A recent review paper regarding inner ear drug delivery was added as a reference (Szeto, Chiang et al. 2020). Drug delivery is a means to demonstrate the OHC’s role in longitudinal mass transport. We are concerned that comparing different drug delivery modalities in detail would distract the readers from the main point of this study. We mentioned ‘one remedy’ with two perforations, for which abundant case studies are found in the literature. Discussing existing approaches exhaustively can be better done by review papers.

      (6) The following sentence is inaccurate and should be carefully rephrased. Previous reports chose higher volumes than the actual fluid volume to maximize the drug (or gene) effect, but this was not a requirement of the delivery methods: 'Such an invasive approach requires the injection of a substantial fluid volume, larger than the entire perilymph in the inner ear'.

      We revised the statement to relax the wording ‘require’: ‘Such an invasive approach is often associated with the injection of a substantial fluid volume, larger than the entire perilymph in the inner ear (Szeto, Chiang et al. 2020)'. This statement might be acceptable because we found few invasive delivery papers that used < 1 µL. Moreover, the physics basis of the injection method is to replace the fluid in a labyrinth compartment with a new fluid (a good example where this fluid physics was tested with quantitative data is the Lichtenhan et al. 2016 paper).

      (7) Please provide the necessary references. Also, clarify what is meant by 'actuator cells'. Are you referring to hair cells?: 'The tube-shaped organ of Corti (OoC) is lined with actuator cells and the cells are activated systematically with a large phase velocity (> a few m/s) toward the apex'.

      Yes, we meant OHCs as the actuator cells. This point has been clarified. A reference for the phase velocity has been added (Olson, Duifhuis, Steele, 2012).

      II. Results

      (1) Is there a specific reason you use 60 or 75 dB SPL for broadband sounds, but opt for louder sounds (80 dB SPL) for pure tones?

      It is not straightforward to compare the SPL between broadband noise and a pure tone, and we did not attempt to ‘equate’ them in any way. 

      (2) Please provide specific details about the sound generation protocol, including the duration, start time, end time, and any other relevant parameters. Here is an example of a vague sentence. Do you play the sounds continuously during these time periods, or only at specific intervals?: 'In two example cases, the effect time at low-CF locations (CFs near 2 kHz) was 15 minutes for the case of the 0.5 kHz tone (Fig. 3A)'

      It is described in the Measurement protocol part of the Methods section (see the red text below). In the example case and all other cases, the sounds were played continually (not continuously).

      For the “Sound” protocol, 1.1-s noise pips (60 or 75 dB SPL, 0.1-12 kHz bandwidth, 0.8-s duration including 0.15-s onset/offset ramps) were presented continually. After 48 noise pips, one 1.1-s silent pause and three CF tone pips followed (a total of 51 pips and a pause make a 57.2-s sequence). The CF tone pips were presented at the level of 35 dB SPL to monitor neural responses. The silence pause was to monitor spontaneous neural responses. The sequence was repeated until neural signals at the lowest CF site were completely abolished. The neural responses presented in this study are the ‘driven responses’ obtained by subtracting the spontaneous responses from the responses to the 35 dB CF tones. For the “Silence” or “Pure-tone” protocol, the noise pips of the Sound protocol were replaced with either silence pauses or a pure tone at 80 dB SPL.
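
      The sequence arithmetic in the protocol above (51 pips plus one pause, 57.2 s in total) can be verified with a short sketch. The numbers are those quoted in the protocol. The variable names and the list representation are illustrative assumptions, not the stimulus-generation code used in the study.

```python
# Timing of one stimulus sequence in the "Sound" protocol, using the
# numbers quoted in the protocol description. Names are illustrative only.
SLOT_S = 1.1           # each pip or pause occupies a 1.1-s slot
N_NOISE_PIPS = 48      # noise pips per sequence
N_PAUSES = 1           # silent pause for spontaneous activity
N_CF_TONE_PIPS = 3     # 35 dB SPL CF tone pips for driven responses

# Order of events within one sequence: noise pips, then pause, then CF tones.
sequence = (
    ["noise"] * N_NOISE_PIPS
    + ["pause"] * N_PAUSES
    + ["cf_tone"] * N_CF_TONE_PIPS
)

n_pips = N_NOISE_PIPS + N_CF_TONE_PIPS
sequence_duration_s = len(sequence) * SLOT_S

print(f"{n_pips} pips + {N_PAUSES} pause = {len(sequence)} slots")
print(f"sequence duration: {sequence_duration_s:.1f} s")
```

      The 52 slots of 1.1 s each reproduce the stated 57.2-s sequence, which is then repeated until the neural signal at the lowest CF site is abolished.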

      (3) Providing a schematic timeline of your experiments indicating sound generation, kainic acid (and salicylate) application, as well as DPOAE and AVCN recordings would greatly help in understanding and following your results.

      We have revised Figure 2.

      (4) How did you control the opening(s) for the injection? The openings could alter intracochlear pressure and affect the traveling wave from the sound, which is the major factor influencing outer hair cell activity.

      We did not open the inner ear. The round window remained intact. Opening the bulla does not affect the intracochlear pressure. We have clarified this issue, beginning with the first sentence of the Abstract. Thanks for raising this important question.

      (5) Is there any reason why the author generated only low and mid-frequencies? If so, please address what the limitations were in testing high frequency.

      There are no limitations to testing high frequencies. High frequencies would not affect drug delivery to the apex of the cochlea because the traveling waves stop right after the CF location. We are interested in delivering drugs deeper into the apex. Our presented results support this reasoning: mid-frequency stimulation was less effective for delivery to the low CF location.

      (6) I suggest combining Figures 3E and 3F to facilitate a direct comparison between the Silence and Noise conditions, as the MF and LF plots are overlapping in these panels.

      We considered this change but realized that it might introduce confusion and difficulty in parsing the results. Moreover, the two panels have their respective messages. 

      (7) In Figure 3E, why does the LF tone affect both Low and Mid CFs, while the MF tone only affects Mid CF?

      The cochlear traveling wave stops right after the CF location. Peristaltic action takes place in the broad tail region of the traveling waves (see Fig. 5C).

      III. Materials and Methods

      (1) Please provide details about your injection protocol. Did you create additional perforations? How did you target the round window? What was the injection rate? How did you seal the round window, and so on?

      The inner ear including the round window was left intact. Only the bulla was open.

      (2) Please include details about your surgical procedure for the AVCN recording, including probe insertion.

      AVCN recording is a well-established technique. Instead of reintroducing the method, we added a classical reference with a more accessible description (Frisina, Chamberlain, et al., 1982).

      IV. Minor points

      (1) Please include the full terms for the abbreviations 'CF', 'DPOAEs', 'PT', 'IP', and 'RW' for readers who are not in the hearing research field.

      We have checked that these abbreviations were defined.

      (2) Are 'GXXX's in figures animal identifiers? Please clarify what they represent.

      Yes, they are animal identifiers. We have clarified this point in the Fig. 1 caption.

    1. Author response:

      In response to your comments, we will revise our manuscript to address the limitations raised, including our ability to rigorously test how observed changes in gene expression in shrews are adaptive. The phylogenetic ANOVA we use (EVE) tests for a separate RNA expression optimum specific to the shrew lineage for each gene, and is consistent with expectations for adaptive evolution of gene expression. However, as you noted, while this analysis highlights many candidate genes potentially under positive selection, further functional validation is required to confirm if and how these genes contribute to Dehnel’s phenomenon. We will emphasize in our discussion that the inferred adaptive expression of these genes is putative and outline that future studies are needed to test the function of proposed adaptations. For example, our cell line validation of the effect of BCL2L1 on apoptosis is a case study that tests the function of a putatively adaptive change in gene expression, and it illuminates this limitation. We will also refine our discussion to focus more on pathway-level analyses rather than on individual genes.

      We recognize that our methodological choices may not have been fully transparent, such as our selection of gene expression clusters for the pathway enrichment analysis and our focus on BCL2L1 for functional validation in cell lines. We will expand on these decisions in the methods section to provide greater clarity for our readers.

      Regarding the use of sex as a covariate, we acknowledge the concerns raised. In our evolutionary analyses, we maintained a balanced sex ratio when possible. EVE models handle the effect of sex on gene expression as intraspecific variation, reflective of plasticity. In shrews, however, we used males exclusively. Females were only found among juvenile individuals, and including them would have introduced developmental variation with larger, negative impacts on these results. For the seasonal data, we will now include sex as a covariate in differential expression analyses; however, our design is imbalanced in relation to sex. We will account for this limitation and discuss it further in the revised manuscript.

    1. Author response:

      We sincerely thank you for your constructive and insightful feedback on our manuscript, including the assessment of its strengths and suggestions for improvements. This will allow us to enhance the clarity and impact of our work. In our revised manuscript, we will address your recommendations as follows:

      (1) Disambiguating whether the joystick eccentricity reflects the subject’s confidence or simply the perceived stimulus strength or coherence

      We agree that this is a pivotal issue for the interpretation of our results. We are confident that the joystick “eccentricity” (i.e., radial joystick deviation from the center) does not simply correlate with the moment-to-moment fluctuations of stimulus coherence. The observations that the radial joystick response varied considerably more than the stimulus fluctuations within each subject and each coherence level, and the analysis of metacognitive sensitivity, suggest that subjects indeed incorporated confidence judgements into their continuous reports. As proposed, we will further explore the established signatures of metacognitive confidence reports, and we will quantify the motion energy fluctuations within time intervals where the nominal stimulus parameters remained constant, to examine whether accuracy and confidence levels vary in response to these fluctuations. This approach will provide deeper insights into continuous dynamics within our paradigm.

      (2) Rationale for Social Investigation

      We will clarify the rationale and methodology of the social aspects in our experiments to better contextualize our approach and findings and their relationship to the field of collective decision-making. In particular, we will further emphasize that while our paradigm indeed did not impose integrating the information from the partner and did not involve incentives for collectively solving the task, the participants could (and did) incorporate the social information into their judgements and mostly improved their earnings. In this way, our approach complements the studies that required joint decisions.

      (3) Streamlining and Terminology

      We will streamline the text and figure legends to present our main arguments more concisely and improve the overall flow of the manuscript. Additionally, we will add a glossary to the main text to clarify terminology, enhancing accessibility and ensuring consistent understanding of key terms throughout the paper.

      To clarify two of the points upfront, we indeed used the term “eccentricity” not in a visual science sense but as the measure of radial joystick deviation from the center and the corresponding angular width of the response arc; we now realize that this is confusing in the context of a visual psychophysics paper and will use another word. The term “dyadic” was meant to describe the experimental condition in which two participants worked on the task, and the associated measures of performance in this condition. The “dyadic score”, defined as the average score across the two participants in the dyadic condition, will be renamed as “combined score”.

      (4) Incorporation of Additional Literature

      We acknowledge and appreciate the recommendations for additional relevant literature, which we will incorporate into our discussion. This will allow us to contextualize our findings more thoroughly within the existing body of research and highlight the broader implications of our work.

    1. Author response:

      eLife Assessment

      This valuable study uses consensus-independent component analysis to highlight transcriptional components (TC) in high-grade serous ovarian cancers (HGSOC). The study presents a convincing preliminary finding by identifying a TC linked to synaptic signaling that is associated with shorter overall survival in HGSOC patients, highlighting the potential role of neuronal interactions in the tumor microenvironment. This finding is corroborated by comparing spatially resolved transcriptomics in a small-scale study; a weakness is in being descriptive, non-mechanistic, and requiring experimental validation.

      We sincerely thank the editors for the valuable and constructive feedback. We appreciate the recognition of our findings and the significance of identifying transcriptional components in high-grade serous ovarian cancers. We acknowledge the insightful point on our study's descriptive nature and limited mechanistic depth. While further experimental validation would indeed enhance our conclusions, such work extends beyond the current scope of this manuscript. However, we would like to highlight that mechanistic studies demonstrating the impact of tumor-infiltrating nerves on disease progression are emerging (Zahalka et al., 2017; Allen et al., 2018; Balood et al., 2022; Jin et al., 2022; Globig et al., 2023; Restaino et al., 2023; Darragh et al., 2024). Importantly, members of our group have contributed to these findings. These studies, including in vitro and in vivo work in head and neck squamous cell carcinoma as well as high-grade serous ovarian carcinoma, demonstrate that substance P released from tumor-infiltrating nociceptors potentiates MAP kinase signaling in cancer cells, thereby influencing disease progression. This effect can be mitigated in vivo by blocking the substance P receptor (Restaino et al., 2023). Our present work identifies a transcriptional component that aligns with the presence of functional nerves within malignancies. These published mechanistic studies support our findings and suggest that this transcriptional component could serve as a potential screening tool to identify innervated tumors. Such information is clinically relevant, as patients with innervated tumors may benefit from more aggressive therapy.

      Reviewer #1 (Public review):

      This manuscript explores the transcriptional landscape of high-grade serous ovarian cancer (HGSOC) using consensus-independent component analysis (c-ICA) to identify transcriptional components (TCs) associated with patient outcomes. The study analyzes 678 HGSOC transcriptomes, supplemented with 447 transcriptomes from other ovarian cancer types and noncancerous tissues. By identifying 374 TCs, the authors aim to uncover subtle transcriptional patterns that could serve as novel drug targets. Notably, a transcriptional component linked to synaptic signaling was associated with shorter overall survival (OS) in patients, suggesting a potential role for neuronal interactions in the tumor microenvironment. Given notable weaknesses like lack of validation cohort or validation using another platform (other than the 11 samples with ST), the data is considered highly descriptive and preliminary.

      Strengths:

      (1) Innovative Methodology:

      The use of c-ICA to dissect bulk transcriptomes into independent components is a novel approach that allows for the identification of subtle transcriptional patterns that may be overshadowed in traditional analyses.

      We sincerely thank the reviewer for recognizing the strengths and novelty of our study. We appreciate the positive feedback on our use of consensus-independent component analysis (c-ICA) to decompose bulk transcriptomes, which we believe allowed us to detect subtle transcriptional signals often overlooked in traditional analyses.

      (2) Comprehensive Data Integration:

      The study integrates a large dataset from multiple public repositories, enhancing the robustness of the findings. The inclusion of spatially resolved transcriptomes adds a valuable dimension to the analysis.

      Thank you for recognizing the robustness of our study through comprehensive data integration. We appreciate the acknowledgment of our efforts to leverage a large, multi-source dataset, as well as the additional insights gained from spatially resolved transcriptomes. We believe this integrative approach enhances the depth of our analysis and contributes to a more nuanced understanding of the tumor microenvironment.

      (3) Clinical Relevance:

      The identification of a synaptic signaling-related TC associated with poor prognosis highlights a potential new avenue for therapeutic intervention, emphasizing the role of the tumor microenvironment in cancer progression.

      We appreciate the reviewer’s recognition of the clinical implications of our findings. The identification of a synaptic signaling-related transcriptional component associated with poor prognosis underscores the potential for novel therapeutic targets within the tumor microenvironment. We agree that this insight could open new avenues for intervention and further highlights the role of neuronal interactions in cancer progression.

      Weaknesses:

      (1) Mechanistic Insights:

      While the study identifies TCs associated with survival, it provides limited mechanistic insights into how these components influence cancer progression. Further experimental validation is necessary to elucidate the underlying biological processes.

      We appreciate the reviewer’s point regarding the limited mechanistic insights provided in our study. We agree that further experimental validation would enhance our understanding of how the biology captured by these transcriptional components influences cancer progression. However, we respectfully note that such validation is beyond the current scope of this article. Our current analyses are based on publicly available expression array and spatial transcriptomic array datasets. For future studies, we therefore intend to combine spatial transcriptomic data with immunohistochemical analysis of the same tumors for validation purposes. We have started setting up in vitro cocultures of neurons and ovarian cancer cells to obtain mechanistic insight into how genes with a large weight in TC121 regulate synaptic signaling and how that affects ovarian cancer cells.

      (2) Generalizability:

      The findings are primarily based on transcriptomic data from HGSOC. It remains unclear how these results apply to other subtypes of ovarian cancer or different cancer types.

      In Figure 5, we present the activity of TC121 across various cancer types, demonstrating broader applicability. However, due to limited treatment response data, we were unable to assess associations between TC activity scores and patient response. Additionally, transcriptomic and survival data specific to other ovarian cancer subtypes beyond HGSOC are currently not available, limiting our ability to generalize these findings to those groups. We intend to leverage survival data from TCGA to explore associations between TC activity scores and overall survival of patients with other cancer types. Nonetheless, we recognize limitations with TCGA survival data, as outlined in this article: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8726696/.

      (3) Innovative Methodology:

      Requires more validation using different platforms (IHC) to validate the performance of this bulk-derived data. Also, the lack of control over data quality is a concern.

      We acknowledge the reviewer’s suggestion to validate our results with alternative platforms, such as IHC; however, we regret that such validation is beyond the scope of this article. Regarding data quality control, we implemented a series of checks:

      • Bulk Transcriptional Profiles: We applied principal component analysis (PCA) on the sample Pearson product-moment correlation matrix, focusing on the first principal component (PCqc), which accounted for approximately 80-90% of the variance, primarily reflecting technical rather than biological variability (Bhattacharya et al., 2020). Samples with a correlation below 0.8 with PCqc were removed as outliers. Additionally, we generated unique MD5 hashes for each CEL file to identify and exclude duplicate samples. Per gene, expression values were standardized to a mean of zero and a variance of one across the GEO, CCLE, GDSC, and TCGA datasets to minimize probeset- or gene-specific variability.

      • Spatial Transcriptional Profiles: We used PCA for quality control here as well. An individual spatial transcriptomic sample was retained only if the loading factors for the first principal component had a consistent sign across all of its profiles (i.e., all profiles had either positive or negative loading factors for the first PC). Samples that did not meet this criterion were excluded from analyses.
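
The checks listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy data, and the use of NumPy are all hypothetical, and only the 0.8 threshold, the MD5 hashing, the per-gene standardization, and the sign-consistency rule come from the text. The sketch relies on the fact that, for PCA on a correlation matrix, the loading of a sample on a component equals its correlation with that component's scores.

```python
import hashlib
import numpy as np


def pcqc_correlations(expr):
    """Correlation of each sample with the first PC of the sample correlation matrix.

    expr: array of shape (n_genes, n_samples).
    """
    corr = np.corrcoef(expr, rowvar=False)      # samples x samples Pearson matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # eigenvalues in ascending order
    v, lam = eigvecs[:, -1], eigvals[-1]        # first principal component
    if v.sum() < 0:                             # eigenvector sign is arbitrary
        v = -v
    return v * np.sqrt(lam)                     # loading = corr(sample, PC score)


def duplicate_files(files):
    """Return names whose MD5 digest matches an earlier file. files: name -> bytes."""
    seen, dupes = set(), []
    for name, payload in files.items():
        digest = hashlib.md5(payload).hexdigest()
        if digest in seen:
            dupes.append(name)
        seen.add(digest)
    return dupes


def standardize_genes(expr):
    """Scale each gene (row) to mean 0 and variance 1 across samples."""
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    return (expr - mean) / std


def consistent_pc1_sign(loadings):
    """Spatial QC: True if all first-PC loadings share one sign."""
    loadings = np.asarray(loadings)
    return bool(np.all(loadings > 0) or np.all(loadings < 0))


# Toy data (hypothetical): five samples share a pattern, the sixth is noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=500)
good = shared[:, None] + 0.1 * rng.normal(size=(500, 5))
outlier = rng.normal(size=(500, 1))
expr = np.hstack([good, outlier])

keep = pcqc_correlations(expr) >= 0.8   # threshold taken from the text
print(keep)
```

In this toy setting the unrelated sixth sample falls below the threshold and would be dropped, while the five samples that share a dominant pattern are retained.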

      (4) Clinical Application:

      Although the study suggests potential drug targets, the translation of these findings into clinical practice is not addressed. Probably given the lack of some QA/QC procedures it'll be hard to translate these results. Future studies should focus on validating these targets in clinical settings.

      While this study is exploratory in nature, we agree that future studies should focus on validating these potential drug targets in clinical settings. As suggested, QA/QC procedures were integral to our analyses. We applied rigorous quality control, including PCA-based checks and duplicate removal across datasets, to ensure data integrity (detailed in our previous response).

      In terms of clinical application, which we partially discussed in the manuscript, we will discuss additional strategies to prevent synaptic signaling and neurotransmitter release in the tumor microenvironment (TME). Drugs such as ifenprodil and lamotrigine are used in treating neuronal disorders to block glutamate release responsible for subsequent synaptic signaling, whereas the vesicular monoamine transporter (VMAT) inhibitor reserpine can block the formation of synaptic vesicles (Reid et al., 2013; Williams et al., 2001). Previous in vitro studies with HGSOC cell lines showed a significant effect of ifenprodil alone on cancer cell proliferation, whereas reserpine seemed to trigger apoptosis in cancer cells (North et al., 2015; Ramamoorthy et al., 2019). Such strategies could potentially be used to inhibit synaptic neurotransmission in the TME.

      Reviewer #2 (Public review):

      Summary:

      Consensus-independent component analysis and closely related methods have previously been used to reveal components of transcriptomic data that are not captured by principal component or gene-gene coexpression analyses.

      Here, the authors asked whether applying consensus-independent component analysis (c-ICA) to published high-grade serous ovarian cancer (HGSOC) microarray-based transcriptomes would reveal subtle transcriptional patterns that are not captured by existing molecular omics classifications of HGSOC.

      Statistical associations of these (hitherto masked) transcriptional components with prognostic outcomes in HGSOC could lead to additional insights into underlying mechanisms and, coupled with corroborating evidence from spatial transcriptomics, are proposed for further investigation.

      This approach is complementary to existing transcriptomics classifications of HGSOC.

      The authors have previously applied the same approach in colorectal carcinoma (Knapen et al. (2024) Commun. Med).

      Strengths:

      (1) Overall, this study describes a solid data-driven description of c-ICA-derived transcriptional components that the authors identified in HGSOC microarray transcriptomics data, supported by detailed methods and supplementary documentation.

      We thank the reviewer for acknowledging the strength of our data-driven approach and the use of consensus-independent component analysis (c-ICA) to identify transcriptional components within HGSOC microarray data. We aimed to provide comprehensive methodological detail and supplementary documentation to support the reproducibility and robustness of our findings. We believe this approach allows for the identification of subtle transcriptional signals that might be overlooked by traditional analysis methods.

      (2) The biological interpretation of transcriptional components is convincing based on (data-driven) permutation analysis and a suite of analyses of association with copy-number, gene sets, and prognostic outcomes.

      We appreciate the reviewer’s positive feedback on the biological interpretation of our transcriptional components. We are pleased that our approach, which includes data-driven permutation testing and analyses of associations with copy-number alterations, gene sets, and prognostic outcomes, was found convincing. These analyses were integral to enhancing the robustness and biological relevance of our findings.

      (3) The resulting annotated transcriptional components have been made available in a searchable online format.

      Thank you for acknowledging the availability of our annotated transcriptional components in a searchable online format.

      (4) For the highlighted transcriptional component which has been annotated as related to synaptic signalling, the detection of the transcriptional component among 11 published spatial transcriptomics samples from ovarian cancers appears to support this preliminary finding and requires further mechanistic follow-up.

      Thank you for acknowledging the accessibility of our annotated transcriptional components. We prioritized making these data available in a searchable online format to facilitate further research and enable the community to explore and validate our findings.

      Weaknesses:

      (1) This study has not explicitly compared the c-ICA transcriptional components to the existing reported transcriptional landscape and classifications for ovarian cancers (e.g. Smith et al Nat Comms 2023; TCGA Nature 2011; Engqvist et al Sci Rep 2020) which would enable a further assessment of the additional contribution of c-ICA - whether the c-ICA approach captured entirely complementary components, or whether some components are correlated with the existing reported ovarian transcriptomic classifications.

      We appreciate the reviewer’s insightful suggestion to compare our c-ICA-derived transcriptional components with previously reported ovarian cancer classifications, such as those from Smith et al. (2023), TCGA (2011), and Engqvist et al. (2020). To address this, we will incorporate analyses comparing the activity scores of our transcriptional components with these published landscapes and classifications, particularly focusing on any associations with overall survival. Additionally, we plan to evaluate correlations between gene signatures from these studies and our identified TCs, enhancing our understanding of the unique contributions of the c-ICA approach.

      (2) Here, the authors primarily interpret the c-ICA transcriptional components as a deconvolution of bulk transcriptomics due to the presence of cells from tumour cells and the tumour microenvironment. However, c-ICA is not explicitly a deconvolution method with respect to cell types: the transcriptional components do not necessarily correspond to distinct cell types, and may reflect differential dysregulation within a cell type. This application of c-ICA for the purpose of data-driven deconvolution of cell populations is distinct from other deconvolution methods that explicitly use a prior cell signature matrix.

      Thank you for highlighting this nuanced aspect of c-ICA interpretation. We acknowledge that c-ICA, unlike traditional deconvolution methods, is not specifically designed for cell-type deconvolution and does not rely on a predefined cell signature matrix. While we explored the transcriptional components in the context of tumor and microenvironmental interactions, we agree that these components may not correspond directly to distinct cell types but rather reflect complex patterns of dysregulation, potentially within individual cell populations.

      Our goal with c-ICA was to uncover hidden transcriptional patterns possibly influenced by cellular heterogeneity. However, we recognize these patterns may also arise from regulatory processes within a single cell type. To investigate further, we plan to use single-cell transcriptional data (~60,000 cell-type-annotated profiles from GSE158722). After removing the first principal component to minimize background effects, we will project our transcriptional components onto these profiles to obtain activity scores, allowing us to assess each TC’s behavior across diverse cellular contexts.
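
One plausible reading of the planned projection step is sketched below. It assumes that an activity score is the dot product between a component's gene weights and an expression profile, and that the first principal component is removed as a rank-1 term of the gene-centered matrix. Both choices, along with the function names and the random toy data, are assumptions made for illustration and are not taken from the manuscript.

```python
import numpy as np


def remove_first_pc(expr):
    """Subtract the rank-1 (first principal component) part of gene-centered data.

    expr: array of shape (n_genes, n_profiles).
    """
    centered = expr - expr.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered - s[0] * np.outer(u[:, 0], vt[0])


def activity_scores(weights, expr):
    """Project profiles onto components.

    weights: (n_genes, n_components) gene weights of each TC.
    Returns an (n_components, n_profiles) matrix of activity scores.
    """
    return weights.T @ expr


rng = np.random.default_rng(1)
expr = rng.normal(size=(50, 8))       # toy data: 50 genes, 8 single-cell profiles
weights = rng.normal(size=(50, 3))    # toy data: 3 hypothetical components
scores = activity_scores(weights, remove_first_pc(expr))
print(scores.shape)
```

Each row of `scores` can then be compared across annotated cell types to see whether a component is confined to one population or active in several.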

      References

      Allen JK, Armaiz-Pena GN, Nagaraja AS, Sadaoui NC, Ortiz T, Dood R, Ozcan M, Herder DM, Haemerrle M, Gharpure KM, Rupaimoole R, Previs R, Wu SY, Pradeep S, Xu X, Han HD, Zand B, Dalton HJ, Taylor M, Hu W, Bottsford-Miller J, Moreno-Smith M, Kang Y, Mangala LS, Rodriguez-Aguayo C, Sehgal V, Spaeth EL, Ram PT, Wong ST, Marini FC, Lopez-Berestein G, Cole SW, Lutgendorf SK, diBiasi M, Sood AK. 2018. Sustained adrenergic signaling promotes intratumoral innervation through BDNF induction. Cancer Res 78:canres.1701.2016.

      Balood M, Ahmadi M, Eichwald T, Ahmadi A, Majdoubi A, Roversi Karine, Roversi Katiane, Lucido CT, Restaino AC, Huang S, Ji L, Huang K-C, Semerena E, Thomas SC, Trevino AE, Merrison H, Parrin A, Doyle B, Vermeer DW, Spanos WC, Williamson CS, Seehus CR, Foster SL, Dai H, Shu CJ, Rangachari M, Thibodeau J, Rincon SVD, Drapkin R, Rafei M, Ghasemlou N, Vermeer PD, Woolf CJ, Talbot S. 2022. Nociceptor neurons affect cancer immunosurveillance. Nature 611:405–412.

      Bhattacharya A, Bense RD, Urzúa-Traslaviña CG, Vries EGE de, Vugt MATM van, Fehrmann RSN. 2020. Transcriptional effects of copy number alterations in a large set of human cancers. Nat Commun 11:715.

      Darragh LB, Nguyen A, Pham TT, Idlett-Ali S, Knitz MW, Gadwa J, Bukkapatnam S, Corbo S, Olimpo NA, Nguyen D, Court BV, Neupert B, Yu J, Ross RB, Corbisiero M, Abdelazeem KNM, Maroney SP, Galindo DC, Mukdad L, Saviola A, Joshi M, White R, Alhiyari Y, Samedi V, Bokhoven AV, John MSt, Karam SD. 2024. Sensory nerve release of CGRP increases tumor growth in HNSCC by suppressing TILs. Med 5:254-270.e8.

      Globig A-M, Zhao S, Roginsky J, Maltez VI, Guiza J, Avina-Ochoa N, Heeg M, Hoffmann FA, Chaudhary O, Wang J, Senturk G, Chen D, O’Connor C, Pfaff S, Germain RN, Schalper KA, Emu B, Kaech SM. 2023. The β1-adrenergic receptor links sympathetic nerves to T cell exhaustion. Nature 622:383–392.

      Jin M, Wang Y, Zhou T, Li W, Wen Q. 2022. Norepinephrine/β2-adrenergic receptor pathway promotes the cell proliferation and nerve growth factor production in triple-negative breast cancer. J Breast Cancer 26:268–285.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In the present study, Chen et al. investigate the role of Endophilin A1 in regulating GABAergic synapse formation and function. To this end, the authors use constitutive or conditional knockout of Endophilin A1 (EEN1) to assess the consequences on GABAergic synapse composition and function, as well as the outcome for PTZ-induced seizure susceptibility. The authors show that EEN1 KO mice show a higher susceptibility to PTZ-induced seizures, accompanied by a reduction in the GABAergic synaptic scaffolding protein gephyrin as well as specific GABAAR subunits and eIPSCs. The authors then investigate the underlying mechanisms, demonstrating that Endophilin A1 binds directly to gephyrin and GABAAR subunits, and identifying the subdomains of Endophilin A1 that contribute to this effect. Overall, the authors state that their study places Endophilin A1 as a new regulator of GABAergic synapse function.

      Strengths:

      Overall, the topic of this manuscript is very timely, since there has been substantial recent interest in describing the mechanisms governing inhibitory synaptic transmission at GABAergic synapses. The study will therefore be of interest to a wide audience of neuroscientists studying synaptic transmission and its role in disease. The manuscript is well-written and contains a substantial quantity of data.

      Weaknesses:

      A number of questions remain to be answered in order to be able to fully evaluate the quality and conclusions of the study. In particular, a key concern throughout the manuscript regards the way that the number of samples for statistical analysis is defined, which may affect the validity of the data analysed. Addressing this weakness will be essential to providing conclusive results that support the authors' claims.

      We would like to thank the reviewer for their appreciation of the value of our study and for the careful critique, which will help us improve the manuscript. We will correct the way that the number of samples for statistical analysis is defined throughout the manuscript as suggested and update figures, figure legends, and Materials and Methods accordingly. For example, we will average the values for all dendritic segments from one neuron, so that each data point represents one neuron in the graphs.
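
The per-neuron averaging described above could look like the following sketch. The data layout (a list of neuron identifier and value pairs) and the function name are hypothetical; the only point carried over from the text is that segment-level values are collapsed so that n counts neurons rather than dendritic segments.

```python
from collections import defaultdict
from statistics import mean


def per_neuron_means(measurements):
    """Average segment-level values so each neuron contributes one data point.

    measurements: iterable of (neuron_id, value) pairs, one per dendritic segment.
    """
    grouped = defaultdict(list)
    for neuron_id, value in measurements:
        grouped[neuron_id].append(value)
    return {neuron_id: mean(values) for neuron_id, values in grouped.items()}


# Hypothetical example: two segments from neuron "n1", one from neuron "n2".
segments = [("n1", 1.0), ("n1", 3.0), ("n2", 5.0)]
neuron_means = per_neuron_means(segments)
print(neuron_means, "n =", len(neuron_means))
```

Statistical tests would then be run on the values of `neuron_means`, with the sample size equal to the number of neurons.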

      Reviewer #2 (Public review):

      Summary:

      The function of neural circuits relies heavily on the balance of excitatory and inhibitory inputs. Particularly, inhibitory inputs are understudied when compared to their excitatory counterparts due to the diversity of inhibitory neurons, their synaptic molecular heterogeneity, and their elusive signature. Thus, insights into these aspects of inhibitory inputs can inform us largely on the functions of neural circuits and the brain.

      Endophilin A1, an endocytic protein heavily expressed in neurons, has been implicated in numerous pre- and postsynaptic functions, however largely at excitatory synapses. Thus, whether this crucial protein plays any role in inhibitory synapse, and whether this regulates functions at the synaptic, circuit, or brain level remains to be determined.

      New Findings:

      (1) Endophilin A1 interacts with the postsynaptic scaffolding protein gephyrin at inhibitory postsynaptic densities within excitatory neurons.

      (2) Endophilin A1 promotes the organization of the inhibitory postsynaptic density and the subsequent recruitment/stabilization of GABA A receptors via Endophilin A1's membrane binding and actin polymerization activities.

      (3) Loss of Endophilin A1 in CA1 mouse hippocampal pyramidal neurons weakens inhibitory input and leads to susceptibility to epilepsy.

      (4) Thus the authors propose that via its role as a component of the inhibitory postsynaptic density within excitatory neurons, Endophilin A1 supports the organization, stability, and efficacy of inhibitory input to maintain the excitatory/inhibitory balance critical for brain function.

      (5) The conclusion of the manuscript is well supported by the data but will be strengthened by addressing our list of concerns and experiment suggestions.

      We would like to thank the reviewer for their favorable impression of the manuscript. We also appreciate the helpful experimental suggestions, which will help us improve the manuscript.

      Weaknesses:

      Technical concerns:

      (1) Figure 1F and Figure 1H, Figures 7H,J:

      Can the authors justify using a paired-pulse interval of 50 ms for eEPSCs and an interval of 200 ms for eIPSCs? Otherwise, experiments should be repeated using the same paired pulse interval.

      We apologize for the confusion. As illustrated by the schematic current traces, the decay time constants of eEPSCs and eIPSCs in hippocampal CA1 neurons are different. The eEPSCs exhibit a faster channel closing rate, corresponding to a smaller time constant Tau. Thus, a shorter inter-stimulus interval (50 ms) was chosen for paired-pulse ratio recordings. In contrast, the eIPSCs display a slower channel closing rate, with a Tau value larger than that of eEPSCs, so a longer inter-stimulus interval (200 ms) was used for PPR. This protocol has been long-established and adopted in previous studies (please see below for examples).

      Contractor, A., Swanson, G. & Heinemann, S. F. Kainate receptors are involved in short- and long-term plasticity at mossy fiber synapses in the hippocampus. Neuron 29, 209-216, doi:10.1016/s0896-6273(01)00191-x (2001).

      Babiec, W. E., Jami, S. A., Guglietta, R., Chen, P. B. & O'Dell, T. J. Differential Regulation of NMDA Receptor-Mediated Transmission by SK Channels Underlies Dorsal-Ventral Differences in Dynamics of Schaffer Collateral Synaptic Function. Journal of neuroscience 37, 1950-1964, doi:10.1523/JNEUROSCI.3196-16.2017 (2017).
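
The reasoning about interval choice can be illustrated with a small calculation. The sketch assumes a single-exponential decay of the first response, and the time constants used (10 ms and 40 ms) are hypothetical placeholders rather than measured values from the study. It shows that a slower decay needs a proportionally longer interval to leave the same residual current when the second pulse arrives.

```python
import math


def paired_pulse_ratio(first_amplitude, second_amplitude):
    """PPR: amplitude of the second response divided by the first."""
    return second_amplitude / first_amplitude


def residual_fraction(interval_ms, tau_ms):
    """Fraction of the first response remaining after interval_ms,
    assuming a single-exponential decay with time constant tau_ms."""
    return math.exp(-interval_ms / tau_ms)


# Hypothetical decay time constants, for illustration only.
tau_epsc_ms = 10.0
tau_ipsc_ms = 40.0

print(residual_fraction(50.0, tau_epsc_ms))    # eEPSC at the 50 ms interval
print(residual_fraction(200.0, tau_ipsc_ms))   # eIPSC at the 200 ms interval
print(residual_fraction(50.0, tau_ipsc_ms))    # eIPSC if 50 ms were used instead
```

With these placeholder values the two protocols leave the same residual fraction, whereas applying the 50 ms interval to the slower current would leave a much larger residual that overlaps the second response.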

      (2) Figures 3G,H,I:

      While 3D representations of proteins of interest bolster claims made by superresolution microscopy, SIM resolution is unreliable when deciphering the localization of proteins at the subsynaptic level given the small size of these structures (<1 micrometer). In order to determine the actual location of Endophilin A1, especially given the known presynaptic localization of this protein, the authors should complete SIM experiments with a presynaptic marker, perhaps an active zone protein, so that the relative localization of Endophilin A1 can be gleaned. Currently, overlapping signals could stem from the presynapse given the poor resolution of SIM in this context.

      Thanks for your suggestions. It is certainly preferable to investigate the relative localization of endophilin A1 using both presynaptic and postsynaptic markers. For SIM imaging in Figure 3G-I, to visualize neuronal morphology, we immunostained GFP as cell fill, leaving two other channels for detection of immunofluorescent signals of endophilin A1 and another protein. We will try co-immunostaining of endophilin A1, the active zone protein bassoon (presynaptic marker) and gephyrin without morphology labeling. Alternatively, we will do co-staining of endophilin A1 and bassoon in GFP-expressing neurons. We agree that overlapping signals or proximal localization of presynaptic endophilin A1 with gephyrin or GABAAR γ2 cannot be ruled out. Of note, as image resolution improves with a more advanced imaging system, the apparent overlap between two proteins becomes smaller or may even disappear. With the ~110 nm lateral resolution of SIM, the degree of overlap between the two proteins of interest is much lower than in confocal microscopy. Given the presynaptic localization of endophilin, we will most likely observe a small overlap (presynaptic) or proximal localization (postsynaptic) of endophilin A1 with bassoon. Nevertheless, we will complete the SIM experiments as suggested to improve the manuscript.

      Manuscript consistency:

      (1) Figure 2:

      The authors looked at VGAT and noticed a reduction of signals in hippocampal regions in their P21 slices, indicating that the proposed postsynaptic organization/stabilization functions of Endophilin A1 extend to the inhibitory presynapse, perhaps via Neuroligin 2-Neurexin. Simultaneously, hippocampal regions in P21 slices showed a reduction in PSD-95 signals, indicating that excitatory synapses are also affected. It would be crucial to also look at excitatory presynapses, via VGLUT staining, to assess whether EndoA1 -/- also affects presynapses. Given the extensive roles of Endophilin A1 in presynapses, especially in excitatory presynapses, this should be investigated.

      Thanks for the thoughtful comments. Given that both VGAT and PSD95 signals are reduced in hippocampal regions in P21 slices, it is conceivable that during development the proposed postsynaptic organization/stabilization functions of endophilin A1 extend to the inhibitory presynapse via Neuroligin-2-Neurexin, and to the excitatory presynapse as well. Of note, endophilin A1 knockout did not impair the distribution of Neuroligin-2 in inhibitory postsynapses (immunoisolated with anti-GABAAR α1) in mature mice (Figure 3K), and endophilin A1 did not bind to Neuroligin-2 (Figure 4D), suggesting that endophilin A1 might function via other mechanisms. Nevertheless, as functions of endophilin A family members at the presynaptic site are well-established, the reduction of presynaptic signals in developing hippocampal regions of EndoA1-/- mice might result from the depletion of presynaptic endophilin A1. The presynaptic deficits may be compensated for by other mechanisms as neurons mature. Certainly, we will do VGLUT staining of EndoA1-/- brain slices as suggested to assess the role of endophilin A1 in excitatory presynapses in vivo.

      (2) Figure 7C:

      The authors do not assess whether p140Cap overexpression rescues the GABAAR loss exhibited in Endophilin A1 KO, as they did for gephyrin. This would be an important data point to show, as p140Cap may somehow rescue receptor loss by another pathway. In fact, it is mentioned in the text that this experiment was done, "Consistently, neither p140Cap nor the endophilin A1 loss-of-function mutants could rescue the GABAAR clustering phenotype in EEN1 KO neurons (Figure 7C, D)" yet the data for p140Cap overexpression seem to be missing. This should be remedied.

      Thanks a lot for the thoughtful comment. We will determine whether p140Cap overexpression also rescues the GABAAR clustering phenotype in EndoA1-/- neurons by surface GABAAR γ2 staining in our revised manuscript.

      Reviewer #3 (Public review):

      Summary:

      Chen et al. identify endophilin A1 as a novel component of the inhibitory postsynaptic scaffold. Their data show impaired evoked inhibitory synaptic transmission in CA1 neurons of mice lacking endophilin A1, and an increased susceptibility to seizures. Endophilin can interact with the postsynaptic scaffold protein gephyrin and promote assembly of the inhibitory postsynaptic element. Endophilin A1 is known to play a role in presynaptic terminals and in dendritic spines, but a role for endophilin A1 at inhibitory postsynaptic densities has not yet been described.

      Strengths:

      The authors used a broad array of experimental approaches to investigate this, including tests of seizure susceptibility, electrophysiology, biochemistry, neuronal culture, and image analysis.

      Weaknesses:

      Many results are difficult to interpret, and the data quality is not always convincing, unfortunately. The basic premise of the study, that gephyrin and endophilin A1 interact, requires a more robust analysis to be convincing.

      We greatly appreciate the positive comment on our study and the very valuable feedback, which will help us improve the manuscript. We will conduct additional experiments to improve our data quality and strengthen our evidence according to these constructive suggestions. To obtain strong evidence for the interaction between endophilin A1 and gephyrin, we will perform in vitro pull-down assays with recombinant proteins from a bacterial expression system.

    1. Author response:

      Public Reviews:

      Summary:

      We sincerely thank the reviewers for their insightful and thorough feedback. Their comments cover both technical and conceptual aspects of our project, which we have attempted to address in our provisional responses.

      First, we would like to clarify that any current lack of documentation or technical issues (such as local installation challenges) reflect the software's early stage. These aspects are receiving our full attention and are not intended to remain in their current state. As suggested, we plan to enhance the toolbox’s structure by separating it into a standalone library and a web application, alongside developing smaller satellite apps for SWC and MOD file management. We will also expand our documentation, provide a more detailed user guide, and add video tutorials for the GUI.

      Second, we have clarified the rationale behind specific implementation choices in our software, explaining why certain features of the toolbox were designed and implemented in particular ways. Our goal is to maintain a strong focus on single-cell level modeling, addressing its various aspects in great detail. We are also working on new features, such as automated parameter optimization and support for multiple output formats, to further enrich the toolbox’s functionality.

      Reviewer #1 (Public review):

      Summary:

      DendroTweaks provides its users with a solid tool to implement, visualize, tune, validate, understand, and reduce single-neuron models that incorporate complex dendritic arbors with differential distribution of biophysical mechanisms. The visualization of dendritic segments and biophysical mechanisms therein provides users with an intuitive way to understand and appreciate dendritic physiology.

      Strengths:

      (1) The visualization tools are simplified, elegant, and intuitive.

      (2) The ability to build single-neuron models using simple and intuitive interfaces.

      (3) The ability to validate models with different measurements.

      (4) The ability to systematically and progressively reduce morphologically-realistic neuronal models.

      We thank the reviewer for their positive comments.

      Weaknesses:

      (1) Inability to account for neuron-to-neuron variability in structural, biophysical, and physiological properties in the model-building and validation processes.

      We agree with the reviewer that it is important to account for neuron-to-neuron variability. The core approach of DendroTweaks and its distinctive feature is interactive exploration of how morpho-electric parameters affect neuronal activity. In light of this, variability can be achieved through interactive updating of the model parameters with widgets. In a sense, by adjusting a widget (e.g., channel distribution or kinetics), a user ends up with a new instance of a cell in the parameter space and receives almost real-time feedback on how this change affects neuronal activity. Implementing complex algorithms to account for neuron-to-neuron variability during the validation process would detract from the interactivity aspect of the GUI. That being said, we acknowledge the importance of this issue and we will explore the options to address it more comprehensively in our revised manuscript.

      (2) Inability to account for the many-to-many mapping between ion channels and physiological outcomes. Reliance on hand-tuning provides a single biased model that does not respect pronounced neuron-to-neuron variability observed in electrophysiological measurements.

      We acknowledge the challenge of accounting for degeneracy in the relation between ion channels and physiological outcomes and the importance of capturing neuron-to-neuron variability. One possible way to address this, as we mention in the Discussion, is to integrate automated parameter optimization algorithms alongside the existing interactive hand-tuning with widgets. We are currently exploring the possibility of integrating Jaxley (Deistler et al., 2024) into DendroTweaks in addition to NEURON. This would allow for automated and fast gradient-based parameter optimization, including optimization of heterogeneous channel distributions.

      (3) Lack of a demonstration on how to connect reduced models into a network within the toolbox.

      Building a network of reduced models is a promising direction, although it goes beyond the scope of this manuscript. We do not plan to add support for network models to the toolbox itself. In DendroTweaks, we focus on single-cell modeling, aiming to cover its various aspects in great detail. Of course, such refined single-cell models, both detailed and reduced, are likely to be integrated into networks, but this will not take place within the DendroTweaks toolbox. To support the integration of DendroTweaks-produced model neurons into networks, we will focus on better compatibility with existing formats and standards and improve exporting capabilities. It is already possible to export reduced morphologies as SWC files, standardized ion channel models as MOD files and channel distributions as JSON files. Nevertheless, as a proof of concept, we plan to generate a simple network of exported reduced models outside the toolbox and include it as a separate Jupyter notebook.
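
As an illustration of how an exported morphology might be consumed outside the toolbox, the sketch below parses text in the standard SWC layout (node id, type, x, y, z, radius, parent id, with `#` comment lines and a parent of -1 for the root). It does not use any DendroTweaks API; the class name, function name, and the three-node example are hypothetical.

```python
from typing import NamedTuple


class SwcNode(NamedTuple):
    node_id: int
    type_id: int
    x: float
    y: float
    z: float
    radius: float
    parent_id: int


def parse_swc(text):
    """Parse SWC-formatted text into a list of SwcNode records."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blank and comment lines
            continue
        f = line.split()
        nodes.append(SwcNode(int(f[0]), int(f[1]), float(f[2]), float(f[3]),
                             float(f[4]), float(f[5]), int(f[6])))
    return nodes


# Hypothetical three-node morphology: a soma with a short dendrite.
example = """# id type x y z radius parent
1 1 0.0 0.0 0.0 5.0 -1
2 3 10.0 0.0 0.0 1.0 1
3 3 20.0 0.0 0.0 0.8 2
"""
nodes = parse_swc(example)
roots = [n for n in nodes if n.parent_id == -1]
print(len(nodes), "nodes,", len(roots), "root")
```

A network-building script could read each exported cell this way and hand the node list to whichever simulator is used for the network.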

      (4) Lack of a set of tutorials, which is common across many "Tools and Resources" papers, that would be helpful in users getting acquainted with the toolbox.

      This is a valid concern that we aim to address promptly. Currently, an online user guide is available at https://dendrotweaks.dendrites.gr/guide.html. This guide introduces users to the GUI elements and covers basic use cases. We are working on video tutorials and detailed documentation, which will be available soon (as part of the revised manuscript). The toolbox will be split into two parts: a Bokeh app and a standalone library. The library will offer the core functionality, such as reducing morphology and standardizing channels, without the GUI, enabling bulk processing. It will be installable through PyPI and integrated into the app code as an external library. We will provide thorough documentation for all classes and functions in the library.

      Reviewer #2 (Public review):

      The paper by Makarov et al. describes the software tool called DendroTweaks, intended for the examination of multi-compartmental biophysically detailed neuron models. It offers extensive capabilities for working with very complex distributed biophysical neuronal models and should be a useful addition to the growing ecosystem of tools for neuronal modeling.

      Strengths

      (1) This Python-based tool allows for visualization of a neuronal model's compartments.

      (2) The tool works with morphology reconstructions in the widely used .swc and .asc formats.

      (3) It can support many neuronal models using the NMODL language, which is widely used for neuronal modeling.

      (4) It permits one to plot the properties of linear and non-linear conductances in every compartment of a neuronal model, facilitating examination of the model's details.

      (5) DendroTweaks supports manipulation of the model parameters and morphological details, which is important for the exploration of the relations of the model composition and parameters with its electrophysiological activity.

      (6) The paper is very well written - everything is clear, and the capabilities of the tool are described and illustrated with great attention to detail.

      We thank the reviewer for their positive comments.

      Weaknesses

      (1) Not a really big weakness, but it would be really helpful if the authors showed how the performance of their tool scales. This can be done for an increasing number of compartments - how long does it take to carry out typical procedures in DendroTweaks, on a given hardware, for a cell model with 100 compartments, 200, 300, and so on? This information will be quite useful to understand the applicability of the software.

      DendroTweaks functions as a layer on top of a simulation engine. As a result, its performance currently scales with that of NEURON. Note that the GUI displays the time taken to run a given simulation in NEURON at the bottom of the Simulation tab in the left menu. While GUI-related processing and rendering also consume time, this is not as straightforward to measure. Nonetheless, we will explore options for providing the suggested benchmarks in the revised manuscript.
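
      A benchmark of the kind the reviewer suggests could be structured roughly as follows. This is a minimal sketch under stated assumptions, not DendroTweaks or NEURON code: `run_simulation` is a hypothetical stand-in whose cost grows with the number of compartments, and it would need to be replaced by a call into the actual simulation engine.

```python
import time


def run_simulation(n_compartments, n_steps=200):
    """Hypothetical stand-in for a simulator call (NOT a DendroTweaks or NEURON API).

    Performs a toy nearest-neighbour voltage update so that the cost grows
    with the number of compartments.
    """
    v = [-70.0] * n_compartments
    last = n_compartments - 1
    for _ in range(n_steps):
        v = [
            v[i] + 0.1 * (v[max(i - 1, 0)] + v[min(i + 1, last)] - 2 * v[i])
            for i in range(n_compartments)
        ]
    return v


def benchmark(sizes, repeats=3):
    """Return {n_compartments: best wall-clock time in seconds over `repeats` runs}."""
    results = {}
    for n in sizes:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_simulation(n)
            timings.append(time.perf_counter() - start)
        results[n] = min(timings)  # the minimum is least affected by background load
    return results


timings = benchmark([100, 200, 300])
for n, seconds in timings.items():
    print(f"{n} compartments: {seconds * 1e3:.2f} ms")
```

      Reporting the minimum over a few repeats is one common choice for reducing the influence of unrelated system load; the absolute numbers depend on the hardware, which is why the reviewer asks for them "on a given hardware".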

      (2) Let me also add here a few suggestions (not weaknesses, but something that can be useful, and if the authors can easily add some of these for publication, that would strongly increase the value of the paper).

      (3) It would be very helpful to add functionality to read major formats in the field, such as NeuroML and SONATA.

      We agree with the reviewer that support for major formats will substantially improve and ensure reproducibility and reusability of the models. As mentioned in the Discussion, we plan to add support for NeuroML. Regarding SONATA, it is indeed possible to view our models as a network with a single morphologically-detailed biophysical node receiving inputs from multiple populations of virtual nodes. In future editions of the tool we plan to expand its support for additional file formats.

      (4) Visualization is available as a static 2D projection of the cell's morphology. It would be nice to implement 3D interactive visualization.

      We offer an option to rotate a cell around the vertical axis using a slider under the plot. This is a workaround, as implementing a true 3D visualization in Bokeh would require custom Bokeh elements, along with external JavaScript libraries. Despite these implementation difficulties, we advocate for a different approach than the one used in most of the morphology viewers mentioned in the Discussion. The core idea of DendroTweaks' morphology exploration is that each section is "clickable" allowing its geometric properties to be examined in a 2D Section view. Furthermore, we believe the Graph view presents the overall cell topology more clearly than a 3D visualization.

      (5) It is nice that DendroTweaks can modify the models, such as revising the radii of the morphological segments or ionic conductances. It would be really useful then to have the functionality for writing the resulting models into files for subsequent reuse.

      This functionality is already available. Users can export JSON files with channel distributions and SWC files after morphology reduction through the GUI. In the standalone version, users can modify and export SWC files, as well as export MOD files after standardization. Please note that in the online demo version, export and import functionality is currently limited, but we plan to fully enable it when submitting our revisions. We are considering separating the file managers into satellite apps, one for SWC files and one for MOD files. It is worth mentioning that the MOD file manager, in addition to parsing the files and generating Python classes for visualization purposes, is already capable of producing Jaxley-compatible Python channel classes.

      (6) If I didn't miss something, it seems that DendroTweaks supports the allocation of groups of synapses, where all synapses in a group receive the same type of Poisson spike train. It would be very useful to provide more flexibility. One option is to leverage the SONATA format, which has ample functionality for specifying such diverse inputs.

      Currently, each group shares the same set of parameters for both biophysical properties of synapses (e.g., reversal potential, time constants) and presynaptic "population" activity (e.g., rate, onset). The parameter that controls an incoming Poisson spike train is the rate, which is indeed shared across all synapses in a group. The suggestion to allow for variability in input properties within a group is interesting and is worth implementing. We will explore this in the revised manuscript.
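
      The difference between a shared rate and per-synapse variability can be sketched as follows. This is an illustrative example only, not the DendroTweaks implementation: the function names and the `rate_cv` parameter (coefficient of variation of the per-synapse rate) are hypothetical, and the Gaussian spread of rates is an assumption made for simplicity.

```python
import random


def poisson_spike_train(rate_hz, duration_s, onset_s=0.0, rng=random):
    """Spike times (in seconds) of a homogeneous Poisson process starting at onset_s."""
    spikes = []
    if rate_hz <= 0:
        return spikes
    t = onset_s
    while True:
        t += rng.expovariate(rate_hz)  # exponentially distributed inter-spike interval
        if t >= duration_s:
            return spikes
        spikes.append(t)


def group_spike_trains(n_synapses, mean_rate_hz, duration_s, rate_cv=0.0, seed=0):
    """Generate one spike train per synapse in a group.

    rate_cv = 0 gives every synapse the same rate (the shared-rate case).
    rate_cv > 0 draws each synapse's rate from a Gaussian around the mean,
    clipped at zero (a hypothetical way to add within-group variability).
    """
    rng = random.Random(seed)
    trains = []
    for _ in range(n_synapses):
        rate = max(0.0, rng.gauss(mean_rate_hz, rate_cv * mean_rate_hz))
        trains.append(poisson_spike_train(rate, duration_s, rng=rng))
    return trains


shared = group_spike_trains(n_synapses=10, mean_rate_hz=20.0, duration_s=1.0)
varied = group_spike_trains(n_synapses=10, mean_rate_hz=20.0, duration_s=1.0, rate_cv=0.5)
```

      In both cases each synapse receives an independent spike train; only the underlying rate is shared or varied.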

      (7) "Each session can be saved as a .json file and reuploaded when needed" - do these files contain the whole history of the session or the exact snapshot of what is visualized when the file is saved? If the latter, which variables are saved, and which are not? Please clarify.

      These files capture the exact snapshot of the model's latest state. They include model parameters such as channel distributions, equilibrium potentials, and temperature. Currently, stimuli (current clamps and synapses) are not saved. However, we plan to add an option to export stimuli parameters in the same JSON file. This will also be available as part of the revised manuscript.

      References

      Michael Deistler, Kyra L. Kadhim, Matthijs Pals, Jonas Beck, Ziwei Huang, Manuel Gloeckler, Janne K. Lappalainen, Cornelius Schröder, Philipp Berens, Pedro J. Gonçalves, Jakob H. Macke. Differentiable simulation enables large-scale training of detailed biophysical models of neural dynamics. bioRxiv 2024.08.21.608979; doi: https://doi.org/10.1101/2024.08.21.608979

    1. Author response:

      To reviewer #1:

      We appreciate your advice on providing more conceptual motivation for comparing Bayesian and RL-like belief updating models. In short, the two model families are complementary in capturing asymmetrical and symmetrical updating. Both assume that the magnitude of updating is weighted by two separate learning rates, one for positive and one for negative belief-disconfirming evidence. If these two learning rates differ, updating is asymmetrical; if they are equal, updating is symmetrical.

      However, the model families’ assumptions about the underlying updating process differ. In the RL-like belief updating model family, this process is assumed to be driven by the discrepancy between base rates and initial beliefs, known as the prediction error (PE), weighted by the learning rates. In contrast, the Bayesian updating model assumes that updating (i.e., the posterior belief) is driven by combining the base rate (i.e., the prior evidence) and how often the initial belief is represented in the estimated base rate (i.e., the likelihood ratio of all other alternative hypotheses, beliefs). Moreover, the two components of the posterior belief can differ in their respective contribution (i.e., precision or confidence), which might be more adaptive under real-life conditions characterized by high uncertainty about the future.
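
      The RL-like rule with two learning rates can be sketched in a few lines. This is a simplified illustration rather than the fitted model from the manuscript: the function name is hypothetical, beliefs are expressed as percentages, and we assume the beliefs concern adverse events, so that a base rate below the initial belief counts as positive (good) news.

```python
def rl_like_update(initial_belief, base_rate, alpha_pos, alpha_neg):
    """Return the updated belief under an RL-like rule with two learning rates.

    Assumption (for illustration only): beliefs concern adverse events, so a
    base rate below the initial belief counts as positive (good) news.
    """
    prediction_error = base_rate - initial_belief
    alpha = alpha_pos if prediction_error < 0 else alpha_neg
    return initial_belief + alpha * prediction_error


# Asymmetrical updating: good news is weighted more strongly than bad news.
good_news = rl_like_update(40.0, 20.0, alpha_pos=0.5, alpha_neg=0.25)  # 30.0
bad_news = rl_like_update(40.0, 60.0, alpha_pos=0.5, alpha_neg=0.25)   # 45.0

# Symmetrical updating: equal learning rates give equal-sized updates.
sym_good = rl_like_update(40.0, 20.0, alpha_pos=0.5, alpha_neg=0.5)    # 30.0
sym_bad = rl_like_update(40.0, 60.0, alpha_pos=0.5, alpha_neg=0.5)     # 50.0
```

      With the same absolute prediction error of 20 points, the asymmetrical case moves the belief by 10 points after good news but only 5 points after bad news, whereas the symmetrical case moves it by 10 points in both directions.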

      For the revised manuscript, we will elaborate more on the conceptual and psychological meaning of these two proposed belief updating processes. It is important to note that, so far, we do not have direct proof that humans reason in an RL-like or Bayesian way when updating their beliefs about the future. We therefore focus on the complementarity of both models in capturing latent processes and variables in belief updating that can be leveraged to understand the sources of inter-individual differences and the impact of external contexts, such as experiencing an actual adverse life event, on human psychology.

      To reviewer #2:

      Thank you for recommending the exploration of potential differences between optimism biases in initial belief estimations (self versus other) during and outside the pandemic. We will also provide more details on the belief updating task and design.

      To both reviewers: 

      We agree on the limitations arising from the lack of physiological and self-reported measures of stress. We collected some self-reports on risk perception, adoption of protective measures, need for social interactions, and mood, but solely in participants tested during the pandemic-related lockdowns (reported in the SI Table 1). For the revised manuscript, we propose exploring the correlational links between belief-updating biases and self-reports in this sample. The expected outcomes of such correlational analyses may identify the variables to target with interventions in future studies of human belief updating under real-world contexts. We also will add a relevant section to the discussion to elaborate on the limitation that hinders inferring plausible psychological causes of the differences observed in belief updating during and outside the pandemic.

      Importantly, we will follow your recommendations to improve the computational modeling analyses. We will (1) add the confusion matrices from model recovery analyses to gain inferences on specificity, (2) provide evidence that the best-fitting model reproduces the observed behavior shown in Figure 1, and (3) conduct model comparisons on the combined groups to justify the focus on the RL-like updating model. In a few weeks, we plan to submit a revised manuscript alongside a point-by-point response to your concerns and recommendations.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      First, the authors confirm the up-regulation of the main genes involved in the three branches of the Unfolded Protein Response (UPR) system in diet-induced obese mice in AT, observations that have been extensively reported before. Not surprisingly, IRE1a inhibition with STF led to an amelioration of the obesity and insulin resistance of the animals. Moreover, non-alcoholic fatty liver disease was also improved by the treatment. More novel are their results in terms of thermogenesis and energy expenditure, where IRE1a seems to act via activation of brown AT. Finally, mice treated with STF exhibited significantly fewer metabolically active and M1-like macrophages in the AT compared to those under vehicle conditions. Overall, the authors conclude that targeting IRE1a has therapeutical potential for treating obesity and insulin resistance.

      The study has some strengths, such as the detailed characterization of the effect of STF in different fat depots and a thorough analysis of macrophage populations. However, the lack of novelty in the findings somewhat limits the study's impact on the field.

      We thank the reviewer for the appreciation of our findings and for the comments on novelty. We would emphasize several novel aspects of this manuscript. First, as the reviewer correctly pointed out, we discovered that IRE1 inhibition by STF activates brown AT and promotes thermogenesis. We also show, for the first time, that IRE1 inhibition not only significantly attenuates the newly discovered CD9+ ATMs and the “M1-like” CD11c+ ATMs but also diminishes the M2 ATMs. These discoveries are important and novel. In obesity, it was originally proposed that ATMs undergo M1/M2 polarization from an anti-inflammatory M2 state to a classical pro-inflammatory M1 state. It was further reported that IRE1 deletion improves thermogenesis by boosting the M2 population, which then synthesizes and secretes catecholamines to promote thermogenesis. It is now known that M2 macrophages do not synthesize catecholamines or promote thermogenesis. In this study, we discovered that IRE1 inhibition does not increase (but instead decreases) the M2 population, and that IRE1 inhibition likely promotes thermogenesis by suppressing pro-inflammatory macrophage populations, including the M1-like ATMs and, most importantly, the newly identified metabolically active macrophages, given that ATM inflammation has been reported to suppress thermogenesis. Second, this study presents the first characterization of the relationship between the more classical M1-like ATMs and the newly discovered metabolically active ATMs, showing that the CD11c+ M1-like ATMs largely overlap with, yet are not identical to, CD9+ ATMs in the eWAT under HFD. Third, although upregulation of ER stress response genes in the adipose tissues of diet-induced obese mice has been extensively reported, this does not necessarily mean that targeting IRE1a or ER stress can reverse existing insulin resistance and obesity. It is not uncommon for a therapy to fail to yield the expected effect. For instance, because amyloid plaques are a hallmark of Alzheimer's disease (AD), interventions that prevent or reverse beta-amyloid deposition were expected to prevent progression of, or even reverse, cognitive impairment in AD patients. However, clinical trials of such therapies have been disappointing. In essence, experimental demonstration of effectiveness or feasibility for any potential therapeutic target is a first step toward any future clinical implementation.

      Reviewer #2 (Public review):

      The manuscript by Wu et al demonstrated that IRE1a inhibition mitigated insulin resistance and other comorbidities through increased energy expenditure in DIO mice. In this reviewer's opinion, this timely study has high significance in the field of metabolism research for the following reasons.

      (1) The authors' findings are significant and may offer a new therapeutic target to treat metabolic diseases, including diabetes, obesity, NAFLD, etc.

      (2) The authors carefully profiled the ATMs and examined the changes in gene expression after STF treatment.

      (3) The authors presented evidence collected from both systemic indirect calorimetry and individual tissue gene expression to support the notion of increased energy expenditure.

      Overall, the authors have presented sufficient background in a clear and logically organized structure, clearly stated the key question to be addressed, used the appropriate methodology, produced significant and innovative main findings, and made a justified conclusion.

      We thank the reviewer for the appreciation of our work.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Wu D. et al. explores an innovative approach to immunometabolism and obesity by investigating the potential of targeting macrophage Inositol-requiring enzyme 1α (IRE1α) in cases of overnutrition. Their findings suggest that pharmacological inhibition of IRE1α could influence key aspects such as adipose tissue inflammation, insulin resistance, and thermogenesis. Notable discoveries include the identification of High-Fat Diet (HFD)-induced CD9+ Trem2+ macrophages and the reversal of metabolically active macrophages' activity with IRE1α inhibition using STF. These insights could significantly impact future obesity treatments.

      Strengths:

      The study's key strengths lie in its identification of specific macrophage subsets and the demonstration that inhibiting IRE1α can reverse the activity of these macrophages. This provides a potential new avenue for developing obesity treatments and contributes valuable knowledge to the field.

      Weaknesses:

      The research lacks an in-depth exploration of the broader metabolic mechanisms involved in controlling diet-induced obesity (DIO). Addressing this gap would strengthen the understanding of how targeting IRE1α might fit into the larger metabolic landscape.

      Impact and Utility:

      The findings have the potential to advance the field of obesity treatment by offering a novel target for intervention. However, further research is needed to fully elucidate the metabolic pathways involved and to confirm the long-term efficacy and safety of this approach. The methods and data presented are useful, but additional context and exploration are required for broader application and understanding.

      We thank the reviewer for the appreciation of the strengths of our manuscript. In particular, we appreciate the reviewer’s recommendation to explore the broader metabolic landscape, such as the effect of IRE1 inhibition on non-adipose tissue macrophages and metabolism. We agree that such work would broaden the therapeutic potential of IRE1 inhibition to a wider range of metabolic disorders, and we will pursue these explorations in future studies.

    1. Author response:

      We thank the reviewers for their constructive feedback here, which will both improve the present manuscript, and help us update our approach as we continue to examine interregional interactions in the motor system. Below we address the concerns raised in the Public Reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      This study examined the interaction between two key cortical regions in the mouse brain involved in goal-directed movements, the rostral forelimb area (RFA) - considered a premotor region involved in movement planning, and the caudal forelimb area (CFA) - considered a primary motor region that more directly influences movement execution. The authors ask whether there exists a hierarchical interaction between these regions, as previously hypothesized, and focus on a specific definition of hierarchy - examining whether the neural activity in the premotor region exerts a larger functional influence on the activity in the primary motor area than vice versa. They examine this question using advanced experimental and analytical methods, including localized optogenetic manipulation of neural activity in either region while measuring both the neural activity in the other region and EMG signals from several muscles involved in the reaching movement, as well as simultaneous electrophysiology recordings from both regions in a separate cohort of animals.

      The findings presented show that localized optogenetic manipulation of neural activity in either RFA or CFA resulted in similarly short-latency changes in the muscle output and in firing rate changes in the other region. However, perturbation of RFA led to a larger absolute change in the neural activity of CFA neurons. The authors interpret these findings as evidence for reciprocal, but asymmetrical, influence between the regions, suggesting some degree of hierarchy in which RFA has a greater effect on the neural activity in CFA. They go on to examine whether this asymmetry can also be observed in simultaneously recorded neural activity patterns from both regions. They use multiple advanced analysis methods that either identify latent components at the population level or measure the predictability of firing rates of single neurons in one region using firing rates of single neurons in the other region. Interestingly, the main finding across these analyses seems to be that both regions share highly similar components that capture a high degree of variability of the neural activity patterns in each region. Single units' activity from either region could be predicted to a similar degree from the activity of single units in the other region, without a clear division into a leading area and a lagging area, as one might expect to find in a simple hierarchical interaction. However, the authors find some evidence showing a slight bias towards leading activity in RFA. Using a two-region neural network model that is fit to the summed neural activity recorded in the different experiments and to the summed muscle output, the authors show that a network with constrained (balanced) weights between the regions can still output the observed measured activities and the observed asymmetrical effects of the optogenetic manipulations, by having different within-region local weights. These results put into question whether previous and current findings that demonstrate asymmetry in the output of regions can be interpreted as evidence for asymmetrical (and thus hierarchical) inputs between regions, emphasizing the challenges in studying interactions between any brain regions.

      Strengths:

      The experiments and analyses performed in this study are comprehensive and provide a detailed examination and comparison of neural activity recorded simultaneously using dense electrophysiology probes from two main motor regions that have been the focus of studies examining goal-directed movements. The findings showing reciprocal effects from each region to the other, similar short-latency modulation of muscle output by both regions, and similarity of neural activity patterns without a clear lead/lag interaction, are convincing and add to the growing body of evidence that highlight the complexity of the interactions between multiple regions in the motor system and go against a simple feedforward-like network and dynamics. The neural network model complements these findings and adds an important demonstration that the observed asymmetry can, in theory, also arise from differences in local recurrent connections and not necessarily from different input projections from one region to the other. This sheds an important light on the multiple factors that should be considered when studying the interaction between any two brain regions, with a specific emphasis on the role of local recurrent connections, that should be of interest to the general neuroscience community.

      Weaknesses:

      While the similarity of the activity patterns across regions and lack of a clear leading/lagging interaction are interesting observations that are mostly supported by the findings presented (however, see comment below for lack of clarity in CCA/PLS analyses), the main question posed by the authors - whether there exists an endogenous hierarchical interaction between RFA and CFA - seems to be left largely open. 

      The authors note that there is currently no clear evidence of asymmetrical reciprocal influence between naturally occurring neural activity patterns of the two regions, as previous attempts have used non-natural electrical stimulation, lesions, or pharmacological inactivation. The use of acute optogenetic perturbations does not seem to be vastly different in that aspect, as it is a non-natural stimulation of inhibitory interneurons that abruptly perturbs the ongoing dynamics.

      We do believe that our optogenetic inactivation identifies a causal interaction between the endogenous activity patterns in the excitatory projection neurons that are largely silenced, and the endogenous activity that is affected in a downstream region. To clarify, the effect in the downstream region results directly from the silencing of activity in the excitatory projection neurons that connect RFA and CFA. 

      Here we have performed a causal intervention common in biology: a loss-of-function experiment. Such experiments generally reveal that a causal interaction of some sort is present, but often do not clarify much about the nature of the interaction, as is true in our case. By showing that the silencing of endogenous activity in one motor cortical region causes a significant change to the endogenous activity in another, we establish a causal relationship between these activity patterns.

      This is analogous to knocking out the gene for a transcription factor and observing causal effects on the expression of other genes that depend on it.

      Moreover, our experiments are, to our knowledge, the first that localize a causal relationship to endogenous activity in motor cortical regions at a particular point during motor behavior. Stimulation experiments generate spiking in excitatory projection neurons that is not endogenous. Lesion and pharmacological or chemogenetic inactivation have long-lasting effects, and so their consequences on firing in other regions cannot be attributed to a short-latency influence of activity at a particular point during movement. Moreover, the involvement of motor cortex in motor learning and movement preparation/initiation complicates the interpretation of these consequences vis-à-vis movement execution, as disturbance to processes on which execution depends can impede execution itself. 

      That said, we would agree that the form of the causal interaction between RFA and CFA remains largely unaddressed by our results. These results do not expose how the silenced activity patterns affect activity in the downstream region, just as transcription factor gene knockouts do not expose how the effect on transcription occurs. To show evidence for specific interaction dynamics between RFA and CFA, a different sort of experiment would be necessary. See Jazayeri and Afraz, Neuron, 2017 for more on this issue.

      Furthermore, the main finding that supports a hierarchical interaction is a difference in the absolute change of firing rates as a result of the optogenetic perturbation, a finding that is based on a small number of animals (N = 3 in each experimental group), and one which may be difficult to interpret. 

      Though N = 3 in this case, we do show statistical significance. Moreover, using three replicates is not uncommon in biological experiments that require a large technical investment, including those in rodents.

      As the authors nicely demonstrate in their neural network model, the two regions may differ in the strength of local within-region inhibitory connections. Could this theoretically also lead to a difference in the effect of the artificial light stimulation of the inhibitory interneurons on the local population of excitatory projection neurons, driving an asymmetrical effect on the downstream region? 

      We (Miri et al., Neuron, 2017) and others (Guo et al., Neuron, 2014) have shown that the effect of this inactivation on excitatory neurons in CFA is a near-complete silencing (90-95% within 20 ms). Thus there is not much room for the effects on projection neurons in RFA to be much larger. As part of other work currently in review, we have verified that the effects on RFA projection neuron firing are not larger.

      Moreover, the manipulation was performed upon the beginning of the reaching movement, while the premotor region is often hypothesized to exert its main control during movement preparation, and thus possibly show greater modulation during that movement epoch. It is not clear if the observed difference in absolute change is dependent on the chosen time of optogenetic stimulation and if this effect is a general effect that will hold if the stimulation is delivered during different movement epochs, such as during movement preparation.

      We agree that the dependence of RFA-CFA interactions on movement phase would be interesting to address in subsequent experiments. While a strong interpretation of past lesion results might lead to a hypothesis that premotor influence on primary motor cortex is local to, or stronger during, movement preparation as opposed to execution, at present there is to our knowledge no empirical support from interventional experiments for this hypothesis. Moreover, existing analyses of activity in premotor and primary motor cortex have produced conflicting results on the strength of interaction between these regions during preparation. Compare for example Bachschmid-Romano et al., eLife, 2023 to Kaufman et al., Nature Neuroscience, 2014.

      That said, this lesion interpretation would predict the same asymmetry we have observed from perturbations at the beginning of a reach – a larger effect of RFA on CFA than vice versa.

      Another finding that is not clearly interpretable is in the analysis of the population activity using CCA and PLS. The authors show that shifting the activity of one region compared to the other, in an attempt to find the optimal leading/lagging interaction, does not affect the results of these analyses. Assuming the activities of both regions are better aligned at some unknown ground-truth lead/lag time, I would expect to see a peak somewhere in the range examined, as is nicely shown when running the same analyses on a single region's activity. If the activities are indeed aligned at zero, without a clear leading/lagging interaction, but the results remain similar when shifting the activities of one region compared to the other, the interpretation of these analyses is not clear.

      Our results in this case were definitely surprising. Many share the intuition that there should be a lag at which the correlations in activity between connected regions will be strongest. Similarity in alignment across lags might be expected if communication between regions occurs over a range of latencies as a result of dependence on a broad diversity of synaptic paths that connect neurons. In the Discussion, we offer an explanation of how to reconcile these findings with the seemingly different picture presented by DLAG.
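
      The intuition the reviewer describes, a peak at the ground-truth lag, can be illustrated on synthetic data. The sketch below is not the analysis from the manuscript: it uses plain Pearson correlation on one-dimensional signals rather than CCA or PLS on population activity, and the autoregressive signal, the 5-sample delay, and the noise level are all assumptions chosen for illustration.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def lagged_correlation(x, y, max_lag):
    """Correlation between x[t] and y[t + lag]; a positive lag means x leads y."""
    n = len(x)
    curve = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[: n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: n + lag]
        curve[lag] = pearson(a, b)
    return curve


rng = random.Random(0)
n_samples, true_lag = 2000, 5

# Temporally smooth "upstream" signal: a first-order autoregressive process.
x = [0.0]
for _ in range(n_samples - 1):
    x.append(0.9 * x[-1] + rng.gauss(0.0, 1.0))

# "Downstream" signal: a delayed copy of x plus independent noise.
y = [
    (x[t - true_lag] if t >= true_lag else 0.0) + 0.5 * rng.gauss(0.0, 1.0)
    for t in range(n_samples)
]

curve = lagged_correlation(x, y, max_lag=10)
best_lag = max(curve, key=curve.get)
self_curve = lagged_correlation(x, x, max_lag=10)
self_peak = max(self_curve, key=self_curve.get)
print("peak lag between x and y:", best_lag)
print("peak lag of x with itself:", self_peak)
```

      With a single fixed delay the correlation curve has a clear maximum at that delay, and a signal correlated with itself peaks at zero lag. If the coupling instead acted over a broad range of latencies, the same curve would be expected to flatten, which is the kind of explanation offered above.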

      Reviewer #2 (Public review):

      Summary:

      While technical advances have enabled large-scale, multi-site neural recordings, characterizing inter-regional communication and its behavioral relevance remains challenging due to intrinsic properties of the brain such as shared inputs, network complexity, and external noise. This work by Saiki-Ishikawa et al. examines the functional hierarchy between premotor (PM) and primary motor (M1) cortices in mice during a directional reaching task. The authors find some evidence consistent with an asymmetric reciprocal influence between the regions, but overall, activity patterns were highly similar and equally predictive of one another. These results suggest that motor cortical hierarchy, though present, is not fully reflected in firing patterns alone.

      Strengths:

      Inferring functional hierarchies between brain regions, given the complexity of reciprocal and local connectivity, dynamic interactions, and the influence of both shared and independent external inputs, is a challenging task. It requires careful analysis of simultaneous recording data, combined with cross-validation across multiple metrics, to accurately assess the functional relationships between regions. The authors have generated a valuable dataset simultaneously recording from both regions at scale from mice performing a cortex-dependent directional reaching task.

      Using electrophysiological and silencing data, the authors found evidence supporting the traditionally assumed asymmetric influence from PM to M1. While earlier studies inferred a functional hierarchy based on partial temporal relationships in firing patterns, the authors applied a series of complementary analyses to rigorously test this hierarchy at both individual neuron and population levels, with robust statistical validation of significance.

      In addition, recording combined with brief optogenetic silencing of the other region allowed the authors to infer the asymmetric functional influence in a more causal manner. This experiment is well designed to focus on the effect of inactivation manifesting through oligosynaptic connections to support the existence of a premotor to primary motor functional hierarchy.

      Subsequent analyses revealed a more complex picture. CCA, PLS, and three measures of predictivity (Granger causality, transfer entropy, and convergent cross-mapping) emphasized similarities in firing patterns and cross-region predictability. However, DLAG suggested an imbalance, with RFA capturing CFA variance at a negative time lag, indicating that RFA 'leads' CFA. Taken together these results provide useful insights for current studies of functional hierarchy about potential limitations in inferring hierarchy solely based on firing rates.

      While I would detail some questions and issues on specifics of data analyses and modeling below, I appreciate the authors' effort in training RNNs that match some behavioral and recorded neural activity patterns including the inactivation result. The authors point out two components that can determine the across-region influence - 1) the amount of inputs received and 2) the dependence on across-region input, i.e., the relative importance of local dynamics, providing useful insights in inferring functional relationships across regions.

      Weaknesses:

      (1) Trial-averaging was applied in CCA and PLS analyses. While trial-averaging can be appropriate in certain cases, it leads to the loss of trial-to-trial variance, potentially inflating the perceived similarities between the activity in the two regions (Figure 4). Do authors observe comparable degrees of similarity, e.g., variance explained by canonical variables? Also, the authors report conflicting findings regarding the temporal relationship between RFA and CFA when using CCA/PLS versus DLAG. Could this discrepancy be due to the use of trial-averaging in former analyses but not in the latter?

      We certainly agree that the similarity in firing patterns is higher in trial averages than on single trials, given the variation in single-neuron firing patterns across trials. Here, we were trying to examine the similarity of activity variance that is clearly movement dependent, as trial averages are, and to use an approach that mirrors those applied in much of the existing literature. We would also agree that there is more that can be learned about interactions from trial-by-trial analysis. 

      It is possible that the activity components identified by DLAG as being asymmetric somehow are not reflected strongly in trial averages. In our Discussion we offer another potential explanation related to the differences in what is calculated in DLAG and CCA/PLS.

      We also note here that all of the firing pattern predictivity analysis we report (Figure 6) was done on single-trial data, and in all cases the predictivity was symmetric. Thus, our results in aggregate are not consistent with symmetry purely being an artifact of trial averaging.
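
      The point both sides agree on, that trial averaging raises apparent cross-region similarity, can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: it uses plain Pearson correlation as a simplified stand-in for CCA/PLS, and the shared sinusoidal signal, noise level, trial count, and all function names are hypothetical choices, not values or code from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def simulate(n_trials=100, n_bins=200, noise_sd=2.0, seed=0):
    """Two hypothetical 'regions' sharing one movement-locked signal.

    Each trial adds independent Gaussian noise to the shared signal,
    so the regions agree on average but differ on single trials.
    """
    rng = random.Random(seed)
    shared = [math.sin(2 * math.pi * t / n_bins) for t in range(n_bins)]
    region_a = [[s + rng.gauss(0, noise_sd) for s in shared] for _ in range(n_trials)]
    region_b = [[s + rng.gauss(0, noise_sd) for s in shared] for _ in range(n_trials)]
    return region_a, region_b


def trial_average(trials):
    """Average across trials, bin by bin."""
    n = len(trials)
    return [sum(col) / n for col in zip(*trials)]


region_a, region_b = simulate()

# Mean correlation computed trial by trial (noise dominates).
single_trial_r = sum(pearson(a, b) for a, b in zip(region_a, region_b)) / len(region_a)

# Correlation of the trial-averaged traces (noise largely cancels).
averaged_r = pearson(trial_average(region_a), trial_average(region_b))

print(f"single-trial r = {single_trial_r:.2f}, trial-averaged r = {averaged_r:.2f}")
```

      In this toy setting the averaged traces correlate far more strongly than single trials do, which is why the single-trial predictivity analyses mentioned above are the relevant check on whether symmetry is an averaging artifact.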

      (2) A key strength of the current study is the precise tracking of forelimb muscle activity during a complex motor task involving reaching for four different targets. This rich behavioral data is rarely collected in mice and offers a valuable opportunity to investigate the behavioral relevance of the PM-M1 functional interaction, yet little has been done to explore this aspect in depth. For example, single-trial time courses of inter-regional latent variables acquired from DLAG analysis can be correlated with single-trial muscle activity and/or reach trajectories to examine the behavioral relevance of inter-regional dynamics. Namely, can trial-by-trial change in inter-regional dynamics explain behavioral variability across trials and/or targets? Does the inter-areal interaction change in error trials? Furthermore, the authors could quantify the relative contribution of across-area versus within-area dynamics to behavioral variability. It would also be interesting to assess the degree to which across-area and within-area dynamics are correlated. Specifically, can across-area dynamics vary independently from within-area dynamics across trials, potentially operating through a distinct communication subspace?

      These are all very interesting questions. Our study does not attempt to parse activity into components predictive of muscle activity and others that may reflect other functions. Distinct components of RFA and CFA activity may be involved in distinct interactions between them.

      (3) While network modeling of RFA and CFA activity captured some aspects of behavioral and neural data, I wonder if certain findings such as the connection weight distribution (Figure 7C), across-region input (Figure 7F), and the within-region weights (Figure 7G), primarily resulted from fitting the different overall firing rates between the two regions with CFA exhibiting higher average firing rates. Did the authors account for this firing rate disparity when training the RNNs?

      The key comparison in Figure 7 is shown in 7F, where the firing rates are accounted for in calculating the across-region input strength. Equalizing the firing rates in RFA and CFA would effectively increase RFA rates. If the mean firing rates in each region were appreciably dependent on across-region inputs, we would then expect an off-setting change in the RFA→CFA weights, such that the RFA→CFA distributions in 7F would stay the same. We would also expect the CFA→RFA weights would increase, since RFA neurons would need more input. This would shift the CFA→RFA (blue) distributions up. Thus, if anything, the key difference in this panel would only get larger. 

      We also generally feel that it is a better approach to fit the actual firing rates, rather than normalizing, since normalizing the firing rates would take us further from the actual biology, not closer.

      (4) Another way to assess the functional hierarchy is by comparing the time courses of movement representation between the two regions. For example, a linear decoder could be used to compare the amount of information about muscle activity and/or target location as well as time courses thereof between the two regions. This approach is advantageous because it incorporates behavior rather than focusing solely on neural activity. Since one of the main claims of this study is the limitation of inferring functional hierarchy from firing rate data alone, the authors should use the behavior as a lens for examining inter-areal interactions.

      As we state above, we agree that examining interactions specific to movement-related activity components could be illuminating. Since it remains a challenge to rigorously identify a subset of activity patterns specifically related to driving muscle activity, any such analysis would involve an additional assumption. It remains unclear how well the motor cortical activity that decoders use for predicting muscle activity matches the motor cortical activity that actually drives muscle activity in situ. 

      Reviewer #3 (Public review):

      This study investigates how two cortical regions that are central to the study of rodent motor control (rostral forelimb area, RFA, and caudal forelimb area, CFA) interact during directional forelimb reaching in mice. The authors investigate this interaction using

      (1) optogenetic manipulations in one area while recording extracellularly from the other,

      (2) statistical analyses of simultaneous CFA/RFA extracellular recordings, and

      (3) network modeling.

      The authors provide solid evidence that asymmetry between RFA and CFA can be observed, although such asymmetry is only observed in certain experimental and analytical contexts.

      The authors find asymmetry when applying optogenetic perturbations, reporting a greater impact of RFA inactivation on CFA activity than vice-versa. The authors then investigate asymmetry in endogenous activity during forelimb movements and find asymmetry with some analytical methods but not others. Asymmetry was observed in the onset timing of movement-related deviations of local latent components with RFA leading CFA (computed with PCA) and in a relatively higher proportion and importance of cross-area latent components with RFA leading than CFA leading (computed with DLAG). However, no asymmetry was observed using several other methods that compute cross-area latent dynamics, nor with methods computed on individual neuron pairs across regions. The authors follow up this experimental work by developing a two-area model with asymmetric dependence on cross-area input. This model is used to show that differences in local connectivity can drive asymmetry between two areas with equal amounts of across-region input.

      Overall, this work provides a useful demonstration that different cross-area analysis methods result in different conclusions regarding asymmetric interactions between brain areas and suggests careful consideration of methods when analyzing such networks is critical. A deeper examination of why different analytical methods result in observed asymmetry or no asymmetry, analyses that specifically examine neural dynamics informative about details of the movement, or a biological investigation of the hypothesis provided by the model would provide greater clarity regarding the interaction between RFA and CFA.

      Strengths:

      The authors are rigorous in their experimental and analytical methods, carefully monitoring the impact of their perturbations with simultaneous recordings, and providing valid controls for their analytical methods. They cite relevant previous literature that largely agrees with the current work, highlighting the continued ambiguity regarding the extent to which there exists an asymmetry in endogenous activity between RFA and CFA.

      A strength of the paper is the evidence for asymmetry provided by optogenetic manipulation. They show that RFA inactivation causes a greater absolute difference in muscle activity than CFA inactivation (deviations begin 25-50 ms after laser onset, Figure 1) and that RFA inactivation causes a relatively larger decrease in CFA firing rate than CFA inactivation causes in RFA (deviations begin <25 ms after laser onset, Figure 3). The timescales of these changes provide solid evidence for an asymmetry in the impact of inactivating RFA/CFA on the other region that could not be driven by differences in feedback from disrupted movement (which would appear with a ~50 ms delay).

      The authors also utilize a range of different analytical methods, showing an interesting difference between some population-based methods (PCA, DLAG) that observe asymmetry, and single neuron pair methods (Granger causality, transfer entropy, and convergent cross mapping) that do not. Moreover, the modeling work presents an interesting potential cause of "hierarchy" or "asymmetry" between brain areas: local connectivity that impacts dependence on across-region input, rather than the amount of across-region input actually present.

      Weaknesses:

      There is no attempt to examine neural dynamics that are specifically relevant/informative about the details of the ongoing forelimb movement (e.g., kinematics, reach direction). Thus, it may be preemptive to claim that firing patterns alone do not reflect functional influence between RFA/CFA. For example, given evidence that the largest component of motor cortical activity doesn't reflect details of ongoing movement (reach direction or path; Kaufman, et al. PMID: 27761519) and that the analytical tools the authors use likely isolate this component (PCA, CCA), it may not be surprising that CFA and RFA do not show asymmetry if such asymmetry is related to the control of movement details. 

      An asymmetry may still exist in the components of neural activity that encode information about movement details, and thus it may be necessary to isolate and examine the interaction of behaviorally-relevant dynamics (e.g., Sani, et al. PMID: 33169030).

      To clarify, we are not claiming that firing patterns in no way reflect the asymmetric functional influence that we demonstrate with optogenetic inactivation. Instead, we show that certain types of analysis we might expect to reflect such influence, in fact, do not. Indeed, DLAG did exhibit asymmetries that matched those seen in functional influence (at least qualitatively), though other methods we applied did not.

      As we state above, we do think that there is more that can be gleaned by looking at influence specifically in terms of activity related to movement. However, if we did find that movement-related activity exhibited an asymmetry matching that of functional influence in cases where overall activity exhibited symmetry, our results imply that the activity not related to movement would exhibit an opposite asymmetry, such that the overall balance is symmetric. This would itself be surprising. We also note that the components identified by CCA and PLS show substantial variation across reach targets, indicating that they are not only reflecting condition-invariant components. These analyses used over 90% of the total activity variance, suggesting that both condition-dependent and condition-invariant components are included.

      The idea that local circuit dynamics play a central role in determining the asymmetry between RFA and CFA is not supported by experimental data in this paper. The plausibility of this hypothesis is supported by the model but is not explored in any analyses of the experimental data collected. Given the focus on this idea in the discussion, further experimental investigation is warranted.

      While we do not provide experimental support for this hypothesis, the data we present also do not contradict this hypothesis. Here we used modeling as it is often used – to capture experimental results and generate hypotheses about potential explanations. We feel that our Discussion makes clear where the hypothesis derives from and does not misrepresent the lack of experimental support. We expect readers will take our engagement with this hypothesis with the appropriate grain of salt. The imaginable experiments to support such a hypothesis would constitute another substantial study requiring numerous controls – a whole other paper in itself.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      This study investigates how ant group demographics influence nest structures and group behaviors of Camponotus fellah ants, a ground-dwelling carpenter ant species (found locally in Israel) that build subterranean nest structures. Using a quasi-2D cell filled with artificial sand, the authors perform two complementary sets of experiments to try to link group behavior and nest structure: first, the authors place a mated queen and several pupae into their cell and observe the structures that emerge both before and after the pupae eclose (i.e., "colony maturation" experiments); second, the authors create small groups (of 5, 10, or 15 ants, each including a queen) within a narrow age range (i.e., "fixed demographic" experiments) to explore the dependence of construction on age. Some of the fixed demographic instantiations included a manually induced catastrophic collapse event; the authors then compared emergency repair behavior to natural nest creation. Finally, the authors introduce a modified logistic growth model to describe the time-dependent nest area. The modification introduces parameters that allow for age-dependent behavior, and the authors use their fixed demographic experiments to set these parameters, and then apply the model to interpret the behavior of the colony maturation experiments. The main results of this paper are that for natural nest construction, nest areas and morphologies depend on the age demographics of ants in the experiments: younger ants create larger nests and angled tunnels, while older ants tend to dig less and build predominantly vertical tunnels; in contrast, emergency response seems to elicit digging in ants of all ages to repair the nest.

      We sincerely thank Reviewer #1 for the time and effort dedicated to our manuscript's detailed review and assessment. The revision suggestions were constructive, and we will incorporate them into the next version to improve the manuscript.
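
      For readers unfamiliar with the model class named in the summary, the following is a minimal sketch of a logistic growth curve for nest area with an age-dependent modification. The logistic form itself is standard; the exponential age scaling of the saturation area, the parameter values, and the function names are hypothetical illustrations and should not be read as the authors' actual model or fitted parameters.

```python
import math


def logistic_area(t, initial_area, saturation_area, rate):
    """Closed-form logistic curve: nest area at time t.

    Starts at initial_area (t = 0) and saturates at saturation_area.
    """
    ratio = saturation_area / initial_area - 1.0
    return saturation_area / (1.0 + ratio * math.exp(-rate * t))


def age_scaled_saturation(base_saturation, age_days, decay_days):
    """HYPOTHETICAL age dependence: target area shrinks with worker age."""
    return base_saturation * math.exp(-age_days / decay_days)


# Illustrative (made-up) parameters: area in cm^2, time in days.
BASE_SATURATION = 100.0
INITIAL_AREA = 1.0
RATE = 0.2
DECAY_DAYS = 60.0

young_target = age_scaled_saturation(BASE_SATURATION, age_days=10, decay_days=DECAY_DAYS)
old_target = age_scaled_saturation(BASE_SATURATION, age_days=120, decay_days=DECAY_DAYS)

young_area = logistic_area(30, INITIAL_AREA, young_target, RATE)
old_area = logistic_area(30, INITIAL_AREA, old_target, RATE)

print(f"day 30: young group {young_area:.1f} cm^2, old group {old_area:.1f} cm^2")
```

      Under this assumed scaling, a group of younger ants approaches a larger saturation area than an older group, which mirrors the qualitative result described in the summary without claiming its functional form.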

      Reviewer #2 (Public review):

      I enjoyed this paper and the approach to examining an accepted wisdom of ants determining overall density by employing age polyethism that would reduce the computational complexity required to match nest size with population (although I have some questions about the requirement that growth is infinite in such a solution). Moreover, the realization that models of collective behaviour may be inappropriate in many systems in which agents (or individuals) differ in the behavioural rules they employ, according to age, location, or information state. This is especially important in a system like social insects, typically held as a classic example of individual-as-subservient to whole, and therefore most likely to employ universal rules of behaviour. The current paper demonstrates a potentially continuous age-related change in target behaviour (excavation), and suggests an elegant and minimal solution to the requirement for building according to need in ants, avoiding the invocation of potentially complex cognitive mechanisms, or information states that all individuals must have access to in order to have an adaptive excavation output.

      We sincerely thank reviewer #2 for the time and effort dedicated to our manuscript's detailed review and assessment. The insightful feedback provided by the reviewer will be incorporated into the successive revisions.

      The only real reservation I have is in the question of how this relationship could hold in properly mature colonies in which there is (presumably) a balance between the birth and death of older workers. Would the prediction be that the young ants still dig, or would there be a cessation of digging by young ants because the area is already sufficient? Another way of asking this is to ask whether the innate amount of digging that young ants do is in any way affected by the overall spatial size of the colony. If it is, then we are back to a problem of perfect information - how do the young ants know how big the overall colony is? Perhaps using density as a proxy? Alternatively, if the young ants do not modify their digging, wouldn't the colony become continuously larger? As a non-expert in social insects, I may be misunderstanding and it may be already addressed in the citations used.

      We thank the reviewer for this interesting question. We find that the nest excavation is predominantly performed by the younger ants in the nest and the nest area increase is followed by an increase in the population. However, if the young ants dig unrestricted, this could result in unnecessary nest growth, as suggested by reviewer #2. Therefore, we believe that the innate digging behavior of ants could potentially be regulated by various cues, such as:

      (a) Density-based: If the colony becomes less dense as its area expands, this could serve as a feedback signal for young ants to reduce or stop digging, as described in references (25, 29, 30).

      (b) Pheromone depositions: If the colony reaches a certain population density, pheromone signals could inhibit further digging by young ants, as described in references (25, 29), or space usage could act as a proxy for the nest area.

      Thus, rather than perfect information, decentralized control and digging-based local cues probably regulate the level of age-dependent digging, without the ants needing to estimate the overall colony size or nest area.

      In any case, this is an excellent paper. The modelling approach is excellent and compelling, also allowing extrapolation to other group sizes and even other species. This to me is the main strength of the paper, as the answer to the question of whether it is younger or older ants that primarily excavate nests could have been answered by an individual tracking approach (albeit there are practical limitations to this, especially in the observation nest setup, as the authors point out). The analysis of the tunnel structure is also an important piece of the puzzle, and I really like the overall study.

      We thank the reviewer for the comments. We completely agree that individual tracking of ants within our experimental setup would have been the ideal approach, but we were constrained by technical and practical limitations of the setup, as pointed out by the reviewer, such as:

      (a) Continuous tracking of ants in our nests would have required a camera to be positioned at all times in front of the nest, which necessitates a light background. Since Camponotus fellah ants are subterranean, we aimed to allow them to perform nest excavation in conditions as close to their natural dark environment as possible. Additionally, implementing such a system in front of each nest would have reduced the sample sizes for our treatments.

      (b) The experimental duration of our colony maturation and fixed demographics experiments extended for up to six months (unprecedented durations in these kinds of measurements). This naturally limited our ability to conduct individual tracking while maintaining the identity of each ant based on the current design.

      Reviewer #3 (Public review):

      Summary:

      In this study, Harikrishnan Rajendran, Roi Weinberger, Ehud Fonio, and Ofer Feinerman measured the digging behaviours of queens and workers for the first 6 months of colony development, as well as groups of young or old ants. They also provide a quantitative model describing the digging behaviours and allowing predictions. They found that young ants dig more slanted tunnels, while older ants dig more vertically (straight down). This finding is important, as it describes a new form of age polyethism (a division of labour based on age). Age polyethism is described as a "yes or no" mechanism, where individuals perform or not a task according to their age (usually young individuals perform in-nest tasks, and older ones foraging). Here, the way of performing the task is modified, not only the propensity to carry it or not. This data therefore adds in an interesting way to the field of collective behaviours and division of labour.

      The conclusions of the paper are well supported by the data. Measurements of the same individuals over time would have strengthened the claims.

      We sincerely thank reviewer #3 for the time and effort dedicated to our manuscript's detailed review and assessment. We completely agree with the reviewer’s comments on the measurements of the same individuals over time; however, we were constrained by the technical and experimental limitations described above and pointed out by reviewer #2.

      Strengths:

      I find that the measure of behaviour through development is of great value, as those studies are usually done at a specific time point with mature colonies. The description of a behaviour that is modified with age is a notable finding in the world of social insects. The sample sizes are adequate and all the information clearly provided either in the methods or supplementary.

      We thank reviewer #3 for this assessment.

      Weaknesses:

      I think the paper is failing to take into consideration or at least discuss the role of inter-individual variabilities. Tasks have been known to be undertaken by only a few hyper-active individuals for example. Comments on the choice to use averages and the potential roles of variations between individuals are in my opinion lacking. Throughout the paper wording should be modified to refer to the group and not the individuals, as it was the collective digging that was measured. Another issue I had was the use of "mature colony" for colonies with very few individuals and only 6 months of age. Comments on the low number of workers used compared to natural mature colonies would be welcome.

      Regarding main comment 1

      We completely agree with the reviewer’s comment on considering inter-individual variability based on activity levels. We have discussed how individual morphological variability could influence digging behavior (references: 28, 31), and we will elaborate further on this aspect in future revisions.

      Regarding main comment 2:

      We agree with the reviewer’s comments regarding the wording. The term “mature colony” will be revised in future versions. We were practically limited in continuing the experiments beyond 6 months, predominantly because of the stability of the nests, which were made with a sand-soil mix. We also acknowledge that the colony sizes attained in our maturation experiments may be smaller than those of naturally matured colonies. This trend was observed generally in lab-reared colonies and could be attributed to differences in microclimatic conditions, foraging opportunities, space availability, and other factors. We will address these aspects in more detail in future revisions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper describes the covalent interactions of small molecule inhibitors of carbonic anhydrase IX, utilizing a precursor molecule capable of undergoing beta-elimination to form the vinyl sulfone and covalent warhead.

      Strengths:

      The use of a novel covalent precursor molecule that undergoes beta-elimination to form the vinyl sulfone in situ. Sufficient structure-activity relationships across a number of leaving groups, as well as binding moieties that impact binding and dissociation constants.

      Overall, the paper is clearly written and provides sufficient data to support the hypothesis and observations. The findings and outcomes are significant for covalent drug discovery applications and could have long-term impacts on related covalent targeting approaches.

      Weaknesses:

      No major weaknesses were noted by this reviewer.

      Reviewer #2 (Public review):

      Summary:

      The authors utilized a "ligand-first" targeted covalent inhibition approach to design potent inhibitors of carbonic anhydrase IX (CAIX) based on a known non-covalent primary sulfonamide scaffold. The novelty of their approach lies in their use of a protected pre(pro?)-vinylsulfone as a precursor to the common vinylsulfone covalent warhead to target a nonstandard His residue in the active site of CAIX. In addition to a biochemical assessment of their inhibitors, they showed that their compounds compete with a known probe on the surface of HeLa cells.

      Strengths:

      The authors use a protected warhead for what would typically be considered an "especially hot" or even "undevelopable" vinylsulfone electrophile. This would be the first report of doing so making it a novel targeted covalent inhibition approach specifically with vinylsulfones.

      The authors used a number of orthogonal biochemical and biophysical methods including intact MS, 2D NMR, x-ray crystallography, and an enzymatic stopped-flow setup to confirm the covalency of their compounds and even demonstrate that this novel pre-vinylsulfone is activated in the presence of CAIX. In addition, they included a number of compelling analogs of their inhibitors as negative controls that address hypotheses specific to the mechanism of activation and inhibition.

      The authors employed an assay that allows them to assess target engagement of their compounds with the target on the surface of cells and a fluorescent probe which is generally a critical tool to be used in tandem with phenotypic cellular assays.

      Weaknesses:

      While the authors show that the pre-vinyl moiety is shown biochemically to be transformed into the vinylsulfone, they do not show what the fate of this -SO2CH2CH2OCOR group is in a cellular context. Does the pre-vinylsulfone in fact need to be in the active site of CAIX on the surface of the cell to be activated or is the vinylsulfone revealed prior to target engagement?

      I appreciate the authors acknowledging the limitations of using an assay such as thermal shift to derive an apparent binding affinity, however, it is not entirely convincing and leaves a gap in our understanding of what is happening biochemically with these inhibitors, especially given the two-step inhibitory mechanism. It is very difficult to properly understand the activity of these inhibitors without a more comprehensive evaluation of kinact and Ki parameters. This can then bring into question how selective these compounds actually are for CAIX over other carbonic anhydrases.

      The authors did not provide any cellular data beyond target engagement with a previously characterized competitive fluorescent probe. It would be critical to know the cytotoxicity profile of these compounds or even how they affect the biology of interest regarding CAIX activity if the intention is to use these compounds in the future as chemical probes to assess CAIX activity in the context of tumor metastasis.

      Reviewer #3 (Public review):

      Summary:

      Targeted covalent inhibition of therapeutically relevant proteins is an attractive approach in drug development. This manuscript now reports a series of covalent inhibitors for human carbonic anhydrase (CA) isozymes (CAI, CAII, CAIX, and CAXIII) for irreversible binding to a critical histidine amino acid in the active site pocket. To support their findings, they included co-crystal structures of CAI, CAII, and CAIX in the presence of three such inhibitors. Mass spectrometry and enzymatic recovery assays validate these findings, and the results and cellular activity data are convincing.

      Strengths:

      The authors designed a series of covalent inhibitors and carefully selected non-covalent counterparts to make their findings about the selectivity of covalent inhibitors for CA isozymes quite convincing. The supportive X-ray crystallography and MS data are significant strengths. Their approach of targeted binding of the covalent inhibitors to histidine in CA isozyme may have broad utility for developing covalent inhibitors.

      Weaknesses:

      This reviewer did not find any significant weaknesses. However, I suggest several points in the recommendation for the authors' section for authors to consider.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers have made excellent suggestions. We believe a revised version addressing those points can improve the assessment and quality of your work.

      Reviewer #1 (Recommendations for the authors):

      (1) The beta-elimination process is referred to as a "rearrangement" in both the text and the Figure 2 legend. Based on the proposed mechanism the authors provided, it is a simple beta-elimination and conjugate addition mechanism, and is not a rearrangement mechanism. This change should be reflected in the text and Figure 2 legend.

      We have made the requested change from rearrangement to elimination reaction.

      (2) From a structure-based design perspective, it is not obvious why only large cyclo-alkyl groups were used to target the lipophilic pocket, with the exception of the phenyl carbamates. Perhaps this is background literature on CAIX that describes this? It seems like this is a flexible functional moiety that could be used to impact drug properties. Why were other lipophilic and especially more aromatic or heteroaromatic moieties not studied?

      The structure-affinity relationship of the lipophilic ring versus other moieties has been studied and reported previously in manuscripts: Dudutiene 2014, Zubriene 2017, Linkuviene 2018, chapter 16 by Zubriene (https://doi.org/10.1007/978-3-030-12780-0_16). The lipophilic ring served better than a flexible tail or an aromatic ring.

      (3) The color-coded "correlation map" in Figure 8 is difficult to follow. Perhaps a standard SAR table with selectivity and affinity values would be easier to read and follow.

      We are trying to promote “correlation maps” because in our opinion they are easier to follow than tables.

      (4) Although there is a statement for this in line 254 of the SI, the compound numbering in the SI, vs. the numbering used in the manuscript is confusing. The standard format for these is to consecutively number all compounds and have identical compound numbers in both the SI and manuscript. The synthetic intermediates included in the SI can be identified by IUPAC names.

      An additional numbering system had to be made because the synthesis was described in the supplementary materials. We would prefer to leave the numbering as in the current manuscript. There are quite a few intermediate compounds that we assigned intermediate numbers such as 20x in order to make it simpler to distinguish intermediate synthesis compounds from compounds that were studied for binding affinity.

      (5) Ranges of isolated yields for the synthetic steps in SI schemes S1, S2, and S3 need to be included.

      We have remade the SI schemes S1, S2, and S3 to include the yields of each compound.

      (6) Presumably, the AcOH/H2O2 reaction forms the sulfones and not sulfoxides when heat is used. In the SI, the structures of 9x and 10x are shown to be sulfoxides and not sulfones. Initially, this appears to be a simple structural mistake; however, it is concerning, since the HRMS data reported for compound 9x are for the sulfoxide (HRMS for C8H7F4NO4S2 [(M+H)+]: calc. 321.9825, found 321.9824) and not the sulfone. In the synthesis scheme S1, condition "C" is used for both the sulfoxide and sulfone synthesis (i.e. 3ax to 9x vs. 12x to 13x). It appears the sulfoxide is prepared using a room temperature procedure, whereas the sulfone requires heating to 75 °C. These two similar conditions need to be designated as different synthetic steps in the schemes, with the specific conditions noted, since the products formed are different.

      We have made requested corrections/adjustments and added separate reaction conditions for sulfoxide synthesis in SI scheme S1.
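
      As a quick arithmetic check of the HRMS value quoted in point (6), the following minimal sketch computes the monoisotopic m/z of an [M+H]+ ion from a molecular formula. The isotope masses are standard reference values supplied here as assumptions (they are not taken from the manuscript), and the second formula, with one additional oxygen, is purely hypothetical and is included only to show the size of the mass difference one oxidation step would produce.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
# Assumption: standard reference values, not taken from the manuscript.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "S": 31.97207117,
}
PROTON = 1.00727647  # mass of H+ (hydrogen atom minus one electron)


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C8H7F4NO4S2' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def mz_m_plus_h(formula: str) -> float:
    """Monoisotopic m/z of the singly protonated ion [M+H]+."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC[el] * n for el, n in counts.items())
    return neutral + PROTON


reported = mz_m_plus_h("C8H7F4NO4S2")    # formula quoted in the review
one_more_o = mz_m_plus_h("C8H7F4NO5S2")  # hypothetical: one additional oxygen
print(f"{reported:.4f}  {one_more_o:.4f}")
```

      Under these assumptions, the quoted formula reproduces the calculated value of 321.9825 given in the review, and a species carrying one more oxygen would appear roughly 15.995 m/z units higher, which is well within what HRMS can distinguish.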

      Reviewer #2 (Recommendations for the authors):

      I appreciate that it's difficult to determine parameters such as kinact or Ki of such potent inhibitors and ones that work by a two-step mechanism. I might suggest characterizing the steps separately to determine the detailed parameters. Maybe something like NMR for the activation step and SPR for the kinact and Ki of the unmasked vinylsulfone?

      We agree that such information would be helpful. However, it requires significant effort and equipment and will be performed in a separate study.

      I always advocate for at least a global proteomics analysis using a pulldown probe to get an idea of the specificity profile, especially for the so-far untried and untested pre-vinylsulfone moiety.

      We fully agree that the pull-down assay is a good idea. However, this major task will be performed in a separate study.

      This might be picky but wouldn't this be considered a pro-vinylsulfone rather than pre-vinylsulfone? Just as the term "prodrug" is used?

      We agree that both pre-vinylsulfone and pro-vinylsulfone are suitable names. However, while the term "prodrug" is common in pharmacology, the term "precursor" is commonly used in organic synthesis. Therefore, we prefer to keep pre-vinylsulfone.

      I would also be curious to know what species is responsible for activating the compound to the vinylsulfone. Maybe make some key point mutations of nearby basic residues?

      His64 formed the covalent bond; thus, His64 was the likely activating base. Preparing point mutants could be a good path for future studies.

      Reviewer #3 (Recommendations for the authors):

      (1) The authors presented only a close-up view of the active site with a 2Fo-Fc map mesh in three panels of Figure 4. For readers unfamiliar with the carbonic anhydrase field, adding a complete illustration of each protein-inhibitor complex (protein in cartoon mode and ligand in stick) will be helpful. Also, an image of the 180º rotation of the close-up view presented in each panel should be added. Depicting h-bonds between critical residues (Asn62, Gln 92, etc.) with dashed lines and marking the distances will be helpful for readers.

      We have prepared the requested picture for CAIX. Panels on the left show a view of the entire protein molecule with the bound ligand for each isozyme, and there are two close-up views of each structure, rotated by 180 degrees relative to each other.

      (2) Line 198 should be revised to refer to the correct complexes. 20, 21, and 23 should be 21, 20, 23.

      We appreciate that the reviewer noticed this error. We corrected the mistake.

      (3) Omit electron density maps around each ligand in Figure 4 should be included for compounds 20, 21, and 23, perhaps as a supplementary figure.

      Detailed electron density map information is provided in the mtz files that have been submitted to the PDB. We think the omit maps are not necessary in the supplementary materials.

      (4) The cyclooctyl group is stabilized by hydrophobic active site residues, L131, A135, L141, and L198. However, only L131 is shown in Figure 4. All residues that stabilize the ligands should be shown.

      For clarity of the figure, we have omitted some of the residues that make contact with the ligand molecule. The structures deposited in the PDB can be analyzed in detail to see all contacts between the ligand and the protein molecule.

      (5) The supplementary table S1 lacks the crystallographic data on the CAIX-23 complex.

      We have added a new version of the supplementary materials that contains the crystallographic data on the CAIX-23 complex.

      (6) A minor peak (30213 Da) with a 638 Dalton shift compared to the unmodified enzyme is for Figure 5A, not Figure 5B, as mentioned in line 235. This sentence in line 235 should be corrected.

      We corrected this mistake.

      (7) As the authors stated in the text, a minor peak (30213 Da) represents a potential second binding site. Can they revisit their electron density maps and show any residual density if it is present around a second histidine residue? The MS data in Figure S17C indicates the presence of additional sites for compound 12. Thus, additional electron density around the secondary and tertiary sites is possible.

      CAII contains His3 and His4, which are at the N-terminus of the protein and are not visible in the crystal structure. The NMR data indicate that the additional modification may occur at one of these His residues.

      (8) MS data were presented for compounds 12 and 22 in Figure 5A, B, but the co-crystal structures were generated with compounds 21, 20, and 23. Why was no MS data included for compounds 20, 21, and 23? Would these compounds show the presence of a secondary binding site? Can authors include the MS data?

      In the main body of the manuscript, Figure 5A presents MS data only for CAXIII with compound 12. It is an example that confirms the covalent interaction. In the supplementary materials we provide MS data for compound 12 with all carbonic anhydrase isozymes and for compound 20 with almost all CA isozymes (all except CAVI). MS data are also provided for numerous other compounds (3, 9, 13, and others) with CA isozymes; these serve as controls or as confirmation of covalent bond formation.

      (9) The coordination between the zinc ion and NH of the ligand is mentioned in the enzyme schematic in Figure 3. Can the distances and coordination with Zinc be illustrated in ligand-bound structures in Figure 4?

      We considered this and decided that a picture showing the numerous distances between ligand atoms and protein residues would be difficult to follow. The structures deposited in the PDB can be analyzed for every aspect of the complex structure.

      (10) A key difference between covalent (compound 12) and its non-covalent counterpart, compound 5, is the two oxygens attached to sulfur in compound 12. Do protein side chains or water interact with these oxygens? Are these oxygen atoms exposed to solvent? Can authors show the interactions or clarify if there is no interaction?

      The two oxygens in the ligand molecule serve several purposes. First, they withdraw electrons and lower the pKa of the sulfonamide, thus making the interaction stronger. Second, the oxygen atoms may make contacts or hydrogen bonds with the protein molecule and may also be important for covalent bond formation. Exact energy contributions cannot be determined directly from the structure. Thus, we decided not to explore this area further at this stage.

      (11) Fix the font size of the text in lines 355-356.

      The font has been corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This study explores the therapeutic potential of KMO inhibition in endometriosis, a condition with limited treatment options. 

      Strengths: 

      KNS898 is a novel specific KMO inhibitor and is orally bioavailable, providing a convenient and non-hormonal treatment option for endometriosis. The promising efficacy of KNS898 was demonstrated in a relevant preclinical mouse model of endometriosis with pathological and behavioural assessments performed. 

      Weaknesses: 

      (1) The expression of KMO in human normal endometrium and endometrial lesions was not quantified. Western blot or quantification of IHC images will provide valuable insight.

      Given the differential expression of KMO in luminal epithelial cells lining the endometrial glands compared to the other parts of the endometrium, a general endometrial Western blot preparation is not going to be additionally helpful or accurate in addressing this question without, e.g., laser capture microdissection or single cell quantitative proteomics. Furthermore, KMO is a flavin-dependent monooxygenase and its activity, especially in generating the oxidative stressor product 3-hydroxykynurenine, is far more dependent on kynurenine substrate availability than it is on actual enzyme abundance. It is nonetheless important to show (as we have done) that KMO is present in the human endometrial glands and in human distended endometrial gland-like structures (DEGLS).

      If KMO is not overexpressed in diseased tissues, i.e. it may have homeostatic roles, inhibition of KMO may have consequences for general human health and wellbeing.

      KMO certainly does have important homeostatic roles, for example as a key step in the repletion of NAD+ through de novo synthesis, although with good nutrition and sufficient NAD+ precursors in the diet (e.g. niacin) that specific role may be partially redundant. KMO knockout mice exhibit normal fertility and fecundity and do not show a survival deficit compared to littermate wildtype controls (e.g. Mole et al Nature Medicine 2016). To further develop KNS898 towards clinical use, preclinical GLP safety and toxicology studies and human Phase 1 clinical trials will of course need to be completed, but that is standard for the development of any new drug.

      In addition, KMO expression in control mice was not shown or quantified.

      Control mice that were not inoculated intraperitoneally with endometrial fragments did not develop DEGLS and therefore there is nothing to show or quantify.

      Images of KMO expression in endometriosis mice with treatments should be shown in Figure 4.

      We have now included a representative KMO immunohistochemistry image from each endometriosis group and included all KMO immunohistochemistry images in Supplementary Information.

      The images showing quantification analysis (Figure 4A-F) can be moved to supplementary material.

      This recommendation contradicts the emphasis placed by the same reviewer earlier regarding quantification, so we have elected to keep it where it is.

      (2) Figure 1 only showed representative images from a few patients. A description of whether KMO expression varies between patients and whether it correlates with AFS stages/disease severity will be helpful. Images from additional patients can be provided in supplementary material. 

      We have added extra information to the Figure legend to clarify the disease stage of the superficial peritoneal lesions which were illustrated (Stage I/II) and to link them to the information in supplementary Table S1. In total we examined 11 peritoneal lesions and 5 ovarian lesions (stage III/IV) – in every sample examined immunopositive staining was most intense in epithelial cells lining gland-like structures. Sections illustrated were chosen to illustrate this key finding.

      (3) For Home Cage Analysis, different measurements were performed as stated in methods including total moving distance, total moving time, moving speed, isolation/separation distance, isolated time, peripheral time, peripheral distance, in centre zones time, in centre zones distance, climbing time, and body temperature. However, only the finding for peripheral distance was reported in the manuscript. 

      This was indeed a large amount of output, which we rationalised for the benefit of a concise paper. The paper now includes a description of which parameters showed a difference with drug treatment.

      (4) The rationale for choosing the different dose levels of KNS898 - 0.01-25mg/kg was not provided. What is the IC50 of a drug? 

      KNS898 dosing has been extensively characterised by us in multiple species, and the pIC50 has already been published (e.g. Hayes et al Cell Reports 2023 and elsewhere). We now include the pIC50 in the present manuscript to save the reader from having to search through another reference.
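
      For readers less familiar with the pIC50 convention mentioned above, the following is a minimal sketch of the conversion between IC50 and pIC50, using the standard definition pIC50 = -log10(IC50 in mol/L). The function names and the example values are illustrative assumptions only; they are not the published KNS898 values.

```python
import math


def ic50_nM_to_pic50(ic50_nM: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 (-log10 of IC50 in mol/L)."""
    return -math.log10(ic50_nM * 1e-9)


def pic50_to_ic50_nM(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar."""
    return 10 ** (-pic50) * 1e9


# Illustrative values only (hypothetical, not data from the manuscript).
for example_pic50 in (7.0, 8.0, 9.0):
    print(example_pic50, pic50_to_ic50_nM(example_pic50))
```

      Each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why potency is often reported on this logarithmic scale.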

      (5) Statistical significance: 

      (a) Were stats performed for Fig 3B-E?

      Now included, thank you.

      (b) Line 141 - 'P = 0.004 for DEGLS per group' 

      However, statistics were not shown in the figure. 

      Thanks, now displayed on figure.

      (c) Line 166 - 'the mechanical allodynia threshold in the hind paw was statistically significantly lower compared to baseline for the group' 

      However, statistics were not shown in the figure. 

      (d) Line 170 - 'Two-way ANOVA, Group effect P = 0.003, time effect P < 0.0001' The stats need to be annotated appropriately in Figure 5A as two separate symbols. 

      Arguably the far more important comparison in this figure is whether there is any effect of treatment, and to mark multiple statistical comparisons on the figure would make it difficult to understand. Instead, the figure legend and results text have been clarified on this point.

      (e) Figure 5B - multiple comparisons of two-way ANOVA are needed. G4 does not look different to G3 at D42. 

      Multiple comparison testing (Dunnett’s T3) was done and the results have been clarified in the text and figure legends.

      (f) Line 565 - 'non-significant improvement in KNS898 treated groups'. However, ** was annotated in Figure 5A. 

      Thank you. This is an error that has been checked and corrected.

      (6) Discussion is very light. No reference to previous publications was made in the discussion. Discussion of potential mechanistic pathways of KYN/KMO in the pathogenesis of endometriosis will be helpful, as will discussion of the expression and function of KMO and/or other metabolites in endometrial-related conditions.

      The discussion is deliberately concise and focussed. The paper has 21 references to previous publications. A speculative discussion is generally not favoured by us.

      The findings in this study generally support the conclusion, although some key data that would strengthen the conclusion, e.g. quantification of KMO in normal and diseased tissue, are lacking.

      We differ from the reviewer here and do not think that those data would materially affect the likelihood of KMO inhibition being efficacious in human endometriosis in Phase 2/3 clinical trials.

      Before KMO inhibitors can be used for endometriosis, the function of KMO in the context of endometriosis should be explored eg KMO knockout mice should be studied. 

      We take the view that before KMO inhibitors can be used for endometriosis in patients there are multiple other regulatory and clinical development steps that are required that would be a priority. While using a KMO knockout mouse might be an interesting scientific experiment, it would not impact on the critical path in a material way.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors aim to address the clinical challenge of treating endometriosis, a debilitating condition with limited and often ineffective treatment options. They propose that inhibiting KMO could be a novel non-hormonal therapeutic approach. Their study focuses on: 

      • Characterising KMO expression in human and mouse endometriosis tissues. 

      • Investigating the effects of KMO inhibitor KNS898 on inflammation, lesion volume, and pain in a mouse model of endometriosis. 

      • Demonstrating the efficacy of KMO blockade in improving histological and symptomatic features of endometriosis. 

      Strengths: 

      • Novelty and Relevance: The study addresses a significant clinical need for better endometriosis treatments and explores a novel therapeutic target. 

      • Comprehensive Approach: The authors use both human biobanked tissues and a mouse model to study KMO expression and the effects of its inhibition. 

      • Clear Biochemical Outcomes: The administration of KNS898 reliably induced KMO blockade, leading to measurable biochemical changes (increased kynurenine, increased kynurenic acid, reduced 3-hydroxykynurenine). 

      Weaknesses: 

      • Limited Mechanistic Insight: The study does not thoroughly investigate the mechanistic pathways through which KNS898 affects endometriosis. Specifically, the local vs. systemic effects of KMO inhibition are not well differentiated. 

      While we agree that this is not a comprehensive mechanistic analysis, given that the ultimate therapy would be almost certainly a once daily oral dosing i.e. systemic administration, we do not consider differentiating local vs systemic effects of KMO inhibition to be critical to therapeutic development in this scenario.

      • Statistical Analysis Issues: The choice of statistical tests (e.g., two-way ANOVA instead of repeated measures ANOVA for behavioral data) may not be the most appropriate, potentially impacting the validity of the results. 

      The selection of two-way ANOVA (time and group) is sufficient and correct for this experimental analysis and its use does not invalidate the results. We agree that repeated measures ANOVA could be a valid alternative.

      • Quantification and Comparisons: There is insufficient quantitative comparison of KMO expression levels between normal endometrium and endometriosis lesions,

      Please see response above to quantification question raised by Reviewer 1.

      and the systemic effects of KNS898 are not fully explored or quantified in various tissues. 

      Please see earlier responses. KNS898 has been thoroughly explored in multiple tissues, species and experimental models, but those data do not need to be rehearsed here.

      • Potential Side Effects: The systemic accumulation of kynurenine pathway metabolites raises concerns about potential side effects, which are not addressed in the study. 

      As discussed above (response to Reviewer 1), KMO knockout mice exhibit normal fertility and fecundity and do not show a survival deficit compared to littermate wildtype controls (e.g. Mole et al Nature Medicine 2016). To further develop KNS898 towards clinical use, preclinical GLP safety and toxicology studies and human Phase 1 clinical trials will naturally need to be completed, but this is standard for the development of any new drug.

      Achievement of Aims: 

      • The authors successfully demonstrated that KMO is expressed in endometriosis lesions and that KNS898 can induce KMO blockade, leading to biochemical changes and improvements in endometriosis symptoms in a mouse model. 

      Support of Conclusions: 

      • While the data supports the potential of KMO inhibition as a therapeutic strategy, the conclusions are somewhat overextended given the limitations in mechanistic insights and statistical analysis. The study provides promising initial evidence but requires further exploration to firmly establish the efficacy and safety of KNS898 for endometriosis treatment. 

      We do not agree that the conclusions are overextended based on the data presented, as expanded in the reply to the eLife editorial assessment at the beginning of this response. It is clear that additional preclinical, regulatory and clinical development work, and human clinical trials will be required to firmly establish the efficacy and safety of KNS898 for endometriosis treatment.

      Impact on the Field: 

      • The study introduces a novel therapeutic target for endometriosis, potentially leading to non-hormonal treatment options. If validated, KMO inhibition could significantly impact the management of endometriosis. 

      Utility of Methods and Data: 

      • The methods used provide a foundation for further research, although they require refinement. The data, while promising, need more rigorous statistical analysis and deeper mechanistic exploration to be fully convincing and useful to the community. 

      We believe that the data are a) convincing, and b) useful to the community. To be advanced effectively towards patients, KNS898 needs to follow the critical development path outlined above.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) Change 'hyperalgia' to hyperalgesia throughout the manuscript including the title. 

      Done

      (2) Line 69 - write '3-HK' in full. 

      Done

      (3) Line 85 - the findings of the study include 'define the preclinical efficacy of KNS898 in reducing inflammation'. The inflammatory profile was not studied. 

      Changed to “disease”

      (4) Line 259 - write 'EPHect' in full. 

      Done

      (5) Line 260 - write 'AFS' in full. Also, abbreviate 'AFS' in the caption of Table S1. 

      Done

      (6) 20 patients were listed in Table S1 but only 19 were accounted for in the methods section. 

      Apologies, there was an error: one of the endometrial samples had not been included. This has now been corrected in the methods section. Table S1 has also been changed to make it clear which samples were eutopic endometrium, to differentiate them from the lesions.

      (7) The location from which the endometrial lesion tissues were obtained should be provided in Table S1. 

      Table S1 has been changed to make it clear that the subtypes of lesions examined were classified as Stage I/II – superficial peritoneal subtype and Stage III/IV – endometrioma. The methods section has also been updated to reflect these subtypes (lines 272-277).

      (8) Table S2 - G5 should be given compound 'A' not 'B'. 

      Thank you. Corrected.

      (9) Figure 2E was not referenced in the text and no figure legend was provided. 

      Now referenced and the figure legend updated.

      (10) Figure 3A - font needs to be enlarged. HCA baseline recording was annotated as performed twice in the protocol. When is the baseline taken and on what day was the Week 12 measurement taken (refer to Figures 5C and D)? 

      Font has been enlarged as requested. The second HCA baseline annotation in Fig 3A is a cut-and-paste error, now rectified and the time of second measurement annotated.

      (11) Line 133 - 'In KNS898-treated group G4 (endometriosis + treatment from Day 19), DEGLS formed in 4 of 15 mice (26.7%) and in G5 (Endo + treatment start on Day 26) in 6 of 15 mice (40%) (Fig. 3f).'. The aforementioned data is not reflected in Figure 3F. 

      Thank you. This has been rectified.

      (12) Line 137 - 'Mice with endometriosis receiving KNS898 from the time of inoculation (G4) had an average of 2.0 DEGLS per animal with DEGLS (total = 8 DEGLS in 4 mice in G4) and those receiving KNS898 1 week after inoculation (G5) had an average of 1.8 DEGLS per animal (total = 11 DEGLS in 6 mice in G5) (Figs. 3g and 3h).' 

      The aforementioned data is not reflected in Figure 3G. There is no Figure 3H shown. 

      Rectified as above.
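
      The incidence and per-animal figures quoted in points (11) and (12) can be checked with simple arithmetic. The sketch below uses only the counts stated in those quotations (15 mice per treated group, 4 and 6 mice with DEGLS, 8 and 11 DEGLS in total); the dictionary layout and key names are hypothetical and chosen for illustration.

```python
# Counts as quoted from the manuscript text in points (11) and (12).
groups = {
    "G4": {"mice_total": 15, "mice_with_degls": 4, "degls_total": 8},
    "G5": {"mice_total": 15, "mice_with_degls": 6, "degls_total": 11},
}

summary = {}
for name, g in groups.items():
    summary[name] = {
        # Percentage of mice in the group that developed any DEGLS.
        "incidence_pct": 100 * g["mice_with_degls"] / g["mice_total"],
        # Mean number of DEGLS per animal, counting only animals with DEGLS.
        "mean_per_affected": g["degls_total"] / g["mice_with_degls"],
    }

for name, s in summary.items():
    print(f"{name}: {s['incidence_pct']:.1f}% incidence, "
          f"{s['mean_per_affected']:.1f} DEGLS per affected animal")
```

      The computed values agree with the quoted text (26.7% and 40% incidence; 2.0 and 1.8 DEGLS per affected animal), so the reviewer's concern relates to how the figure panels displayed these data rather than to the arithmetic itself.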

      (13) Provide a discussion of why KA levels were significantly lower in Figure 3E compared to Figure 2C. 

      (14) Figure legend for Figure 3 - G1 and G2 were noted as n=8. However, Figure S1 and Table S2 noted both groups as n=10. 

      Thank you. This is a typographical error. The legend for Fig 3 should indeed read n=10 for G1 and G2 and has been corrected.

      (15) Line 181 - 'compared to non-operated and sham-operated control groups'. Only the sham group was shown in Figures 5C and D. 

      This text has been clarified to refer only to the data shown.

      (16) Figure 1 images need scalebars. Same for Figure 4. 

      Now added

      (17) Figure 3B - y-axis is fold change? 

      Relative concentration. Legend has been clarified.

      (18) Figures 5A and B - are the last Von Frey measurements taken on Day 40 (as per Figure 3A) or 42?

      Taken on Day 42. Fig 3A (the prospective protocol figure) has been clarified to reflect what actually happened (D42) as opposed to what was planned (D40) to pre-empt any further confusion.

      (19) Symbols in Figure S1 need to be explained in the Figure legend. 

      Done

      (20) Figures 2A and 2D should not be plotted in log scale to match the description of results in Line 106 and Line 118. 

      These particular results are plotted on a log scale to allow the reader to visualise that detectable levels of drug are measurable at very low doses and that there is no significant pharmacodynamic effect at that low dose. We choose to retain the present format.

      Reviewer #2 (Recommendations For The Authors): 

      Comments and queries 

      Introduction/aims section: 

      Line 82 - 87: Clarify in the proposal aims what is being assessed and analysed in humans and/or in animal models (mice). Specifically state clearly the correlations with KMO expression. Were the correlations between KMO expression and features of inflammation performed only in mice or also in humans?

      Thank you for this comment. The aims have been clarified in the Introduction.

      Section - KMO is expressed in human eutopic endometrium and human endometriosis tissue lesions: 

      Was any quantitative or semi-quantitative method used to quantify the KMO expression in human tissues? Although the authors claimed that "KMO was strongly immunopositive in human peritoneal endometriosis lesions" by the representative figures it is not clear if KMO expression is similar, higher or lower between normal endometrium and peritoneal endometriosis lesions. 

      We have added extra information to the legend of Figure 1 to identify the PIN number of the superficial lesions illustrated. The key finding from the immunostaining with the antibody which had been previously validated as specific for KMO was that the most intense immunopositive response was in glandular epithelial cells and the samples illustrate this result.

      Section - Oral KNS898 inhibits KMO in mice: 

      The authors clearly confirmed the target engagement of KNS898 in inhibiting KMO activity and, therefore, affecting upstream and downstream metabolites systemically (peripheral fluid/plasma) in mice. Whether the effect of KNS898 is broad and targets systemic immune cells and cells and tissues throughout the body was not explored. It was also not explored whether KNS898 is able to specifically inhibit KMO locally in the endometrial tissue by targeting epithelial and/or infiltrated immune cells, for example.

      That is correct.

      It would be interesting to measure (or if it was measured to report in this section and also in Figure 2) the levels of KYN, KA and 3HK in naïve animals that did not receive KNS898. It would help to understand the net effect of KNS898 on the levels of kynurenine pathway metabolites and, therefore, justify the dose chosen.

      These data are already presented in Fig 3B-E, control group.

      Perhaps then the chosen dose could be lower considering the possible substantial changes in kynurenine pathway metabolite levels, which are reported to exert an effect in many cells, tissues and systems and could, therefore, precipitate side effects. This is even more relevant considering that the values for these metabolites are expressed as ng/ml, which hinders comparison of the metabolite levels with those reported for naïve animals in the literature. I would also suggest expressing the metabolite levels as nM/L.

      This is not a relevant method of determining dose-limiting toxicity or safety pharmacology/toxicology, either non-GLP or GLP. There are international guidelines on the proper conduct of those studies. This is also why it is important not to make claims about the safety or otherwise of an experimental compound in an in vivo setting that has not explicitly complied with those regulatory standards. With regard to the units recommendation, accepted units are ng/mL or nM, not usually nM/L.
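
      Regarding the units point above, a mass concentration in ng/mL can be converted to a molar concentration in nM by dividing by the molar mass and multiplying by 1000. The following minimal sketch illustrates this; the molar masses are standard values for the free compounds, supplied here as assumptions rather than taken from the manuscript, and the function name is hypothetical.

```python
# Molar masses in g/mol. Assumption: standard values for the free compounds,
# not taken from the manuscript.
MOLAR_MASS = {
    "kynurenine": 208.21,
    "kynurenic acid": 189.17,
    "3-hydroxykynurenine": 224.21,
}


def ng_per_ml_to_nM(conc_ng_per_ml: float, metabolite: str) -> float:
    """Convert ng/mL to nM.

    1 ng/mL equals 1 microgram/L; dividing by g/mol gives micromol/L,
    and multiplying by 1000 gives nmol/L (nM).
    """
    return conc_ng_per_ml / MOLAR_MASS[metabolite] * 1000.0


# Illustrative input only (hypothetical, not data from the manuscript).
for metabolite in MOLAR_MASS:
    print(metabolite, round(ng_per_ml_to_nM(100.0, metabolite), 1))
```

      Because the conversion depends only on the molar mass, values reported in ng/mL can be compared with literature values in nM without any additional experimental information.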

      Section - KMO blockade reduces endometrial gland-like lesion burden in experimental endometriosis in mice: 

      Line 130: It would be better to replace "blockade of 3HK production" with "reduction of 3HK production" to better reflect the results. 

      Changed to “inhibition of 3HK production”.

      Line 140: In G5 (treatment starting at Day 26/ 1 week after inoculation), is the experimental model of endometriosis already established with all pathological and phenotypic features? 

      This was not specifically tested in this experiment.

      Lines 146 - 148: It would be better to specify that "Overall, there was no significant difference IN BODY WEIGHT between G3 and the KNS898 treatment groups G4 and G5 (endometriosis + treatment from Day 26)". Otherwise, this last sentence might be interpreted as the overall conclusion of this result sub-section. 

      Thank you, this is a good point and it has been corrected.

      The authors demonstrated experimentally that KMO blockade reduces a pathological measure of endometriosis, i.e. endometrial gland-like lesion burden, in experimental endometriosis in mice, both when administered concomitantly with disease induction and when administered after disease development. However, mechanistic insights into how reduced KMO activity can reduce the developed distended endometrial gland-like structures were not explored. Therefore, it remains to be investigated which kynurenine pathway metabolites are directly linked to the beneficial effects of KMO blockade in the experimental model of endometriosis, and how.

      We agree.

      Although the beneficial effects on the pathological measures are evident, Figure 3 shows an exorbitant accumulation of KYN and KA and also a substantial reduction in 3HK after the treatment with KNS898, which then raises concerns about tolerability and side effects. Would this effective KNS898 dose be viable and translational as a therapeutic approach? 

      Please refer to comments above at multiple junctures about safety pharmacology and the clinical development critical path.

      Section - KMO is expressed in experimental endometriosis in mice: 

      By histological examination, the authors confirm that treatment with KNS898 specifically reduced the KMO expression intensity in the DEGLS from mice. Therefore, the effect exerted by KNS898 locally on KMO expression in the DEGLS could be at least partially responsible for the beneficial effects observed in Figure 3, i.e. the reduction of pathological measures. It remains to be explored, however, whether the effect of KNS898 in other cells or tissues could also account for the beneficial effects exerted by KNS898 in the animal model of endometriosis.

      This is correct.

      From a logical experimental point of view, I would suggest switching the order of the result subsection "KMO blockade reduces endometrial gland-like lesion burden in experimental endometriosis in mice" and "KMO is expressed in experimental endometriosis in mice" as well as the respective Figures 3 and 4. 

      We do not agree. Fig 3 (and section) is the macroscopic enumeration of DEGLS, Fig 4 (and section) is the microscopic and immunohistochemical evaluation of the lesions introduced in Fig 3. The sequence as originally presented is the more logical.

      Sections - KMO inhibition reduces mechanical allodynia in experimental endometriosis - and - KMO inhibition reduces mechanical allodynia in experimental endometriosis: 

      The authors suggested that KMO inhibition with KNS898 exerts beneficial effects on behavioural paradigms related to the experimental model of endometriosis. Based on the statistical analysis performed by the authors, KMO inhibition with KNS898 reduces mechanical allodynia and rescues impaired cage exploration behaviour and mobility in mice with endometriosis. However, I believe that the most appropriate statistical tests for the Von Frey (allodynia behaviour) and home cage (illness behaviour) analyses over time would be repeated measures ANOVA and paired t-test, respectively (and not two-way ANOVA as performed). Therefore, for a more trustworthy analysis and interpretation of this data set, I would suggest the authors modify the statistical analysis and report the corresponding interpretation of these tests.

      The selection of two-way ANOVA (time and group) is suitable for this experimental analysis and its use does not invalidate the results. We agree that repeated measures ANOVA could be a valid alternative.

      Overall, the authors present a solid and useful case for KMO inhibition as a potential therapeutic strategy for endometriosis. However, the study would benefit from more detailed mechanistic insights, appropriate statistical analyses, and an evaluation of potential side effects. With these improvements, the research could have a significant impact on the field and pave the way for new treatment modalities for endometriosis. 

      We thank the reviewer for the positive comments and we have responded to the criticisms above.

      Specific recommendations for improvement: 

      • Mechanistic Studies: Conduct detailed studies to understand the local vs. systemic effects of KMO inhibition and its specific impacts on different cell types and tissues. If not feasible here, the authors could include in the discussion section a detailed overview of the possible mechanisms implicated. 

      While we agree that this is not a comprehensive mechanistic analysis, given that the ultimate therapy would be almost certainly a once daily oral dosing i.e. systemic administration, we do not consider differentiating local vs systemic effects of KMO inhibition to be critical to therapeutic development in this scenario. We do not think speculation about possible mechanisms that is not supported by experimental data should be included. Furthermore, that notion (of statements not supported by data) has been given as a criticism by the reviewers, and therefore consistency on this point must be preferable.

      • Quantitative Analysis: Include more robust quantitative methods to compare KMO expression levels in different tissues and assess the correlation between KMO expression and pathological and behavioural changes.

      As discussed above, the pathophysiological importance of KMO is in its enzymatic activity, not in its abundance as a protein, and 3HK production is far more dependent on kynurenine substrate availability rather than KMO protein abundance.

      • Appropriate Statistics: Use the most suitable statistical tests for behavioural and other repeated measures data to ensure accurate interpretation. 

      As discussed above.

      • Side Effect Evaluation: Investigate potential side effects of systemic KMO inhibition, particularly focusing on the long-term implications of altered kynurenine pathway metabolites. If not feasible here, the authors could include in the discussion section a detailed overview of the possible side effects associated as well as inform if KNS898 can cross the BBB and its implications. 

      For a novel small molecule therapeutic compound in preclinical/clinical development, there are strictly regulated preclinical and clinical development standards that need to be met. It would not be responsible to publish or make claims about safety and potential adverse effect profiles without conducting the proper panel of tests within a suitable regulatory framework.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Orlovskis and his colleagues revealed an interesting phenomenon that SAP54-overexpressing leaf exposure to leafhopper males is required for the attraction of followed females. By transcriptomic analysis, they demonstrated that SAP54 effectively suppresses biotic stress response pathways in leaves exposed to the males. Furthermore, they clarified how SAP54, by targeting SVP, heightens leaf vulnerability to leafhopper males, thus facilitating female attraction and subsequent plant colonization by the insects.

      Strengths:

      The phenomenon of this study is interesting and exciting.

      Weaknesses:

      The underlying mechanisms of this phenomenon are not convincing.

      We thank the reviewer for finding our study interesting and exciting. However, we respectfully disagree with the reviewer's assertion that the mechanisms we uncovered are unconvincing.

      We have uncovered a significant portion of the mechanisms by which SAP54 induces the leafhopper attraction phenotype.

      First, we discovered that the SAP54-mediated attraction of leafhoppers requires the presence of male leafhoppers on the leaves. Female leafhoppers were only attracted and laid more eggs on leaves when both SAP54 and male leafhoppers were present. In the absence of either males or SAP54, female leafhoppers did not exhibit this behaviour.

      Second, we found that biotic stress responses in leaves were significantly downregulated when exposed to SAP54 and male leafhoppers, with a much lesser effect observed in the presence of females.

      Third, we identified that the presence of the MADS-box transcription factor SHORT VEGETATIVE PHASE (SVP) in leaves is crucial for the leafhopper attraction phenotype, and that SAP54 facilitates the degradation of SVP.

      Our research corroborates previous findings that SAP54-mediated degradation of MADS-box transcription factors depends on the 26S proteasome shuttle factor RAD23, which we found previously to also be necessary for the leafhopper attraction phenotype (MacLean et al., 2014. PMID: 24714165). This finding has been replicated by other research groups. Previous research has also revealed that leafhoppers are specifically attracted to leaves, not to the leaf-like flowers (Orlovskis & Hogenhout, 2016. PMID: 27446117).

      Collectively, these results suggest that SAP54 acts as a "matchmaker", helping male leafhoppers locate mates more easily by degrading SVP-containing complexes in leaves. We have updated the model in Fig. 7 to better illustrate our findings.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors show that leaf exposure to leafhopper males is required for female attraction in the SAP54-expressing plant. They clarify how SAP54, by degrading SVP, suppresses biotic stress response pathways in leaves exposed to the males, thus facilitating female attraction and plant colonization.

      Strengths:

      This study suggests the possibility that the attraction of insect vectors to leaves is the major function of SAP54, and the induction of the leaf-like flowers may be a side-effect of the degradation of MTFs and SVP. It is a very surprising discovery that only male insect vectors can effectively suppress the plant's biotic stress response pathway. Although there has been interest in the phyllody symptoms induced by SAP54, the purpose, and advantage of secreting SAP54 were unknown. The results of this study shed light on the significance of secreted proteins in the phytoplasma life cycle and should be highly evaluated.

      Weaknesses:

      One weakness of this study is that the mechanisms by which male and female leafhoppers differentially affect plant defense responses remain unclear, although I understand that this is a future study.

      The authors show that female feeding suppresses female colonization on SAP54-expressing plants. This is also an intriguing phenomenon but this study doesn't explain its molecular mechanism (Figure 7).

      Strengths:

      We appreciate the reviewer's assessment of the strengths of our study. We do indeed discuss the possibility that the induction of leaf-like flowers could be a side effect of the SAP54 effector function. However, it is not uncommon for effectors to have multiple functions, as has been frequently demonstrated for viral proteins (e.g., PMID: 34618877). Furthermore, it is increasingly evident that developmental and immune processes in organisms often overlap and are mediated by the same proteins. A notable example is the Toll-like receptors, which are widely recognized for their role in innate immunity but were initially discovered for their involvement in various developmental processes (e.g., PMID: 29695493).

      MADS-box transcription factors are known to regulate various developmental pathways in plants, and their diversification has been a key driver of evolutionary innovations in plant development. These factors are comparable to HOX genes, which are essential for the development of bilateral animals. While the role of MADS-box transcription factors in orchestrating flowering has been well-documented, recent evidence has emerged showing that they also play a role in regulating immune processes in plants. Our findings contribute to this emerging understanding, presenting novel insights into the multifunctional roles of these transcription factors.

      Specifically, the MADS-box transcription factor SVP has vital roles in both plant immunity and flowering. The SAP54-mediated targeting of this transcription factor may therefore confer multiple advantages to phytoplasmas that, as obligate colonisers, depend on plants and transmission by insects for survival. Firstly, the inhibition of flowering could delay plant senescence and death, which is particularly relevant in annual plants, the primary hosts of AY-WB phytoplasma studied here. Secondly, the downregulation of plant defence responses, particularly against males, facilitates the attraction of females, which are more likely to reproduce and thus increase the number of vectors for phytoplasma transmission. Given that phytoplasmas are obligate organisms with highly reduced genomes, it is plausible that they rely on ‘efficient proteins’ capable of targeting multiple key pathways in their hosts.

      Weaknesses:

      As explained above, we have uncovered a substantial portion of the mechanisms through which SAP54 induces the leafhopper attraction phenotypes that includes the identification of MADS-box transcription factor SVP as an important contributor. We have updated the model in Fig. 7 to better illustrate our findings.

      It is known that SVP forms quaternary structures with other (MADS-box) transcription factors, and it seems likely that the degradation of specific SVP complexes present in fully developed leaves plays a significant role in the downregulation of immune genes in the presence of SAP54 and males. These specific complexes also do not form in svp mutants, which could explain why females are attracted to these mutant plants in the presence of males. However, transcription profiles are different in male-exposed SAP54 vs male-exposed svp plants. This may be explained by SVP having multiple functions, including those that are not targeted by SAP54.

      Identifying which SVP complexes contribute to the male-mediated downregulation of immunity in the presence of SAP54 would require the development of a broad range of tools to investigate plant immunity without the confounding effects of developmental changes. This line of inquiry extends beyond the findings presented in this study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Orlovskis and colleagues revealed an interesting phenomenon that SAP54-overexpressing leaf exposure to leafhopper males is required for the attraction of followed females. By transcriptomic analysis, they demonstrated that SAP54 effectively suppresses biotic stress response pathways in leaves exposed to the males. Furthermore, they clarified how SAP54, by targeting SVP, heightens leaf vulnerability to leafhopper males, thus facilitating female attraction and subsequent plant colonization by the insects. The discovery of this study is interesting and exciting. However, I have a few concerns that require authors to address.

      (1) The author demonstrated that SAP54-overexpressing leaf exposure to leafhopper males is more attractive to females. However, I was confused that the author did not analyse the choice preference of males. This is important, as the author demonstrated later that "SAP54 plants exposed to males display significant downregulation of biotic stress responses". It is very possible that the female is attracted by a mating signal, but not by reduced biotic stress responses. Also, it is important to address whether the female used in this study is virgin.

      We have analysed male preference in feeding choice tests (Figure 1, treatment 3) and described our findings in the text (p7; lines 214-216). For added clarity, we have revised the text on p7 (lines 214-216) to specify that males alone do not show any feeding preference for SAP54 plants.

      Additionally, we investigated whether females could be attracted to male-exposed SAP54 plants prior to landing and feeding using choice experiments, as depicted in Supplemental Figure 3 and discussed in the text (p9; lines 265-271). These findings suggest that long-distance cues alone do not fully account for the female attraction phenotype observed in Figure 1. We acknowledge that mating calls or volatiles may complement or enhance the transcriptional changes in male-exposed SAP54 leaves. This interpretation is further supported by comparing Figure 1, treatments 4 and 5, which shows that removing males from SAP54 leaves before female choice does not increase female colonisation. To enhance clarity and precision, we have added the term "solely" to the results (p9; line 265) and discussion (p25; line 719), and included a new sentence on p26 (lines 726-730): "However, given that the removal of males from SAP54 leaves prior to female choice does not enhance female colonisation (comparison of Figure 1, treatment 4 with treatment 5), we cannot exclude the possibility that male-produced volatiles or mating calls could enhance or supplement SAP54-dependent changes in biotic stress responses to males, thereby enhancing female attraction."

      We have also updated the methods section to clarify that a mixture of virgin and pre-mated females was used in all experiments (p28; lines 798-799), consistent with our previously published work (Orlovskis & Hogenhout, 2016. PMID: 27446117; MacLean et al., 2014. PMID: 24714165).

      (2) I was confused by the rationality of the section "Female leafhopper preference for male-exposed SAP54 plants unlikely involves long-distance cues". The volatile cues or mating calls from males can be only perceived from a distance?

      As mentioned in our response to comment 1, for clarity, we have added new text to both the results (p9; line 265) and discussion sections (p25; lines 719 and 726-730). In the results section highlighted by the reviewer (p8-9), we aimed to explicitly test whether cues produced by males (such as mating calls or pheromones) or SAP54 plants (such as plant volatiles) could account for female attraction from a distance, independent of, and prior to, physical contact with the plants or male insects.

      To address the possibility that volatiles or mating calls might be perceived simultaneously with downregulated biotic stress responses, we have included an additional sentence in the discussion, which addresses comments 1 and 2 from the reviewers. Furthermore, it is important to note that Figure 1, treatment 4, mirrors the results of Figure 1, treatment 1, suggesting that direct physical contact between males and females is not necessary for the observed female attraction. This conclusion, derived from our experiments, was already emphasised in the main text (p7; lines 218-222).

      (3) Line 271-273. How the author concluded the "immediate access". A time course experiment (detect the number of insects on each plant at different time point) for host-choice experiment is necessary.

      We have corrected and rephrased the sentence as follows:

      "Therefore, these results indicate that female reproductive preference for the male-exposed SAP54 versus GFP plants is dependent on the females' immediate and direct access to the leaves of SAP54 plants and the presence of males on these leaves." (p9; lines 267-271).

      (4) I appreciate the transcriptome analysis. However, the figures are poorly organized. i.e. the heatmap in Figure 2 was poorly understood. The author should clearly address what is upregulated or downregulated. It is meaningless to exhibit the heatmap without explaining what gene represented. Also, it is hard for readers to distinguish the difference between the 4 maps in Figure 2, similar to the two figures in Figure 3.

      We thank the reviewer for the recommendation. To make Figures 2 and 3 easier to read and understand as stand-alone items, we have revised the corresponding figure legends, highlighting the colouring of up- and down-regulated DEGs and explaining the content of the related supplementary files. For brevity and clarity, we have removed the mentions of figure supplements 4, 5 and 6 from these legends: they are already explained and referred to in the main text, and they relate to data processing prior to the analysis in Figure 2 rather than directly to Figure 2 or 3.

      We hope that the improvements in figure legends will make the Figures 2 and 3 easier and quicker to understand.

      (5) For transcriptomic analysis, three out of four replicates were well clustered, and the author excluded the outliers in subsequent analysis. Is this treatment commonly used in transcriptomic analysis? If yes, please provide corresponding references.

      Removing outliers from transcriptomic data is not unusual, as it enhances the classification of treatment groups and increases the efficiency of detecting biologically relevant differentially expressed genes (DEGs) (PMID: 36833313; PMID: 32600248). For large datasets, especially in clinical studies, automated procedures and algorithms have been developed for this purpose (PMID: 32600248; doi.org/10.1101/144519). Given our relatively small sample size of 4, we opted for a PCA-based manual outlier evaluation, followed by repeated PCA without the identified outliers. This approach demonstrated improved group discrimination (Figure Supplement 4), which can enhance downstream characterization of DEGs and pathways that explain female preference for male-exposed SAP54 plants. We have detailed this procedure on pages 9-10. It is worth noting that other automated outlier removal methods, which are also based on PCA, have been shown to be as effective as manual outlier removal (PMID: 32600248).

      (6) Figure 5A. How the experiment was done? The HA-SVP and other HA-tagged genes were stably or transiently expressed in GFP and GFP-SAP54 plants? How many replicates were conducted? The band intensity from different biological replicates should be provided. In this manuscript, no information is provided even in the method section.

      We thank the reviewer for noticing this and have updated the methods section to provide more details on the transient protoplast expression assays (p39; line 835). We performed two independent degradation assays for all 5 MTF proteins and have indicated this in the legend of Figure 5. Western blot results from both experiments are provided as a new figure supplement 10 (p53). The degradation/destabilisation efficiency was calculated using ImageJ as the HA intensity divided by the RuBisCo large subunit (rbcL) intensity from the same sample, normalised to the intensity of the sample with the highest ratio from the same leaf (Rel HA/rbcL). Relative pixel intensities are provided above each treatment in the new figure supplement 10, as requested by the reviewer.
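
      The normalisation described above can be illustrated with a minimal sketch. It assumes that band intensities have already been measured (for example in ImageJ) and are supplied as plain numbers, one HA value and one rbcL value per lane from the same leaf. The function name and the input layout are hypothetical and are not taken from the manuscript.

```python
def relative_ha_over_rbcl(ha_intensities, rbcl_intensities):
    """Return Rel HA/rbcL for each lane from one leaf.

    Hypothetical helper: each lane's HA band intensity is divided by the
    rbcL (loading control) intensity of the same lane, and every ratio is
    then divided by the highest ratio found among lanes of that leaf.
    """
    if len(ha_intensities) != len(rbcl_intensities):
        raise ValueError("each lane needs one HA and one rbcL intensity")
    if not ha_intensities:
        raise ValueError("at least one lane is required")
    if any(r <= 0 for r in rbcl_intensities):
        raise ValueError("rbcL intensities must be positive")

    # Step 1: loading-corrected signal per lane.
    ratios = [ha / rbcl for ha, rbcl in zip(ha_intensities, rbcl_intensities)]

    # Step 2: scale to the lane with the highest ratio, so that lane equals 1.0.
    top = max(ratios)
    if top <= 0:
        raise ValueError("no HA signal detected in any lane")
    return [ratio / top for ratio in ratios]


# Made-up example values: lane 1 is a control, lane 2 a treated sample.
print(relative_ha_over_rbcl([200, 30], [100, 60]))
```

      With these made-up numbers the control lane scales to 1.0 and the treated lane to 0.25, i.e. a value below 1 indicates less HA-tagged protein relative to the loading control.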

      (7) For the interaction assay, only Y2H was conducted. Generally, at least two methods are needed to confirm protein interaction. This is also applicable to degradation assays.

      There is substantial prior evidence that SAP54 interacts with MADS-box transcription factors and facilitates their degradation in plants, a process that also involves the 26S proteasome shuttle factor RAD23 (MacLean et al., 2014; PMID: 24714165). This interaction has been independently confirmed by other research groups using various methods, including split-YFP assays (e.g., PMID: 24597566, PMID: 26179462). Given the extensive data already available on this topic, it would be redundant to replicate all of these findings in our manuscript. Instead, we have focused on a few validated assays that effectively demonstrate the specific interactions between SAP54 and MADS-box transcription factors.

      (8) Lines 528-530. No direct evidence in this study was provided for how SAP54-mediated degradation of SVP. The author should tone down the claim.

      Our findings demonstrate that SVP is degraded in plant cells in the presence of SAP54. Additionally, through yeast two-hybrid assays, we show that SAP54 does not directly bind to SVP but does directly interact with several MADS-box transcription factors known to associate with SVP. We also provide evidence herein that these transcription factors interact with SVP. Furthermore, previous studies have shown that SAP54 facilitates the degradation of MADS-box transcription factor complexes of Arabidopsis and several other eudicot species (PMID: 24597566, PMID: 26179462, PMID: 28505304, PMID: 35234248; PMID: 38105442). We have described our own observations and those of others (see main text pages 4-5, pages 19-20), and believe that we have presented them accurately without overstating our conclusions.

      (9) Overall, the phenomenon of this study is interesting, but the underlying mechanisms are not solidified. Additional work is still needed in future studies.

      We respectfully disagree—we have identified a significant portion of the mechanisms by which SAP54 induces these phenotypes. As with any research, new data often leads to further questions that may be addressed by follow-up studies. Please refer to our previous responses for additional context.

      Reviewer #2 (Recommendations For The Authors):

      Major comment

      It will be interesting to see how long male feeding affects changes in gene expression in plants. No feeding choice of females was observed on the SAP54 plants when males were removed from the clip-cages prior to the choice test with females alone (Figure 1, Treatment 5; Figure Supplement 1, Treatment 5). This indicates that SAP54 plants lose their ability to attract females as soon as males are removed. On the other hand, if the suppression of the plant's stress response pathway by male feeding continues for some time even after males are removed, I think that we cannot exclude the possibility that volatiles emitted by males may partially promote female feeding and colonization.

      As described above, our findings suggest that long-distance cues alone do not fully account for the female attraction phenotype observed in Figure 1. We acknowledge that mating calls or volatiles may complement or enhance the transcriptional changes in male-exposed SAP54 leaves. This interpretation is further supported by comparing Figure 1, treatments 4 and 5, which shows that removing males from SAP54 leaves before female choice does not increase female colonisation. To enhance clarity and precision, we have added the term "solely" to the results (p9; line 265) and discussion (p25; line 719), and included a new sentence on p26 (lines 726-730): "However, given that the removal of males from SAP54 leaves prior to female choice does not enhance female colonisation (comparison of Figure 1, treatment 4 with treatment 5), we cannot exclude the possibility that male-produced volatiles or mating calls could enhance or supplement SAP54-dependent changes in biotic stress responses to males, thereby enhancing female attraction."

      Minor comments

      The legend of Figure 1 is missing an explanation for panel C.

      Thank you for noticing this. We have added the missing information.

      Although from a different perspective from this study, a relationship between phytoplasma infection and SVP has been previously reported (Yang et al., Plant Physiology, 2015). Shouldn't this paper be cited somewhere?

      We thank the reviewer for identifying this oversight. We have added the missing reference (PMID: 26103992) and clarified that, as seen in Figure 5E (p20; lines 555-558), our findings show a similar upregulation of SVP in male-exposed SAP54 plants as reported by Yang et al. This suggests that SAP54 and its homologs, such as PHYL1, may indeed operate through similar mechanisms by targeting MTFs that are crucial for their function. While Yang et al. described the role of SVP in the development of abnormal flower phenotypes in Catharanthus, our study reveals a completely novel role for SVP in plant-insect interactions. Although SAP54 destabilises the SVP protein, its transcript is upregulated in the presence of SAP54, indicating a potential disruption of MTF autoregulation and the MTF network as a whole.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Response to reviewer 1:

      We thank the reviewer for their positive comments and note that we made many attempts to genetically alter endothelial cells to express mutants of SEC61A1 that are resistant to the effects of mycolactone. However, these cells were not capable of supporting expression of this transgene. Instead, we tested other translocation inhibitors, which have a different chemical structure but the same mechanism of action at the Sec61 translocon, and found that these phenocopied the effects.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors have investigated the effect of the toxin mycolactone, produced by Mycobacterium ulcerans, on the endothelium. Mycobacterium ulcerans is involved in Buruli ulcer, classified as a neglected disease by WHO. This disease has dramatic consequences on the microcirculation, causing important cutaneous lesions. The authors have previously demonstrated that endothelial cells are especially sensitive to mycolactone. The present study brings more insight into the mechanism involved in the mycolactone-induced endothelial cell defect and thus in microcirculatory dysfunction. The authors showed that mycolactone directly affected the synthesis of proteoglycans at the level of the Golgi, with a major consequence on the quality of the glycocalyx and thus on endothelial function and structure. Importantly, the authors show that blockade of the enzyme involved in this synthesis (galactosyltransferase II) phenocopied the effects of mycolactone. The effect of mycolactone on the endothelium was confirmed in vivo. Finally, the authors showed that exogenous laminin-511 reversed the effects of mycolactone, thus opening an important therapeutic perspective for the treatment of wound healing in patients suffering from Buruli ulcer and presenting lesions.

      Reviewer #2 (Public Review):  

      The authors dissected the effects of mycolactone on endothelial cell biology and vessel integrity. The study follows up on previous work by the same group, which highlighted alterations in vascular permeability and coagulation in patients with Buruli ulcer. It provides a mechanistic explanation for these clinical observations, and suggests that blockade of Sec61 in endothelial cells contributes to tissue necrosis and slow wound healing.

      Overall, the generated data support their conclusions and I only have two major criticisms:  

      - Replicating the effects of mycolactone on endothelial parameters with Ipomoeassin F (or its derivative ZIF-80) does not demonstrate that these effects are due to Sec61 blockade. This would require genetic proof, using for example endothelial cells expressing Sec61A mutants that confer resistance to mycolactone blockade. The authors claimed in the Discussion that they could not express such mutants in primary endothelial cells, but did they try expressing mutants in HUVEC cell lines? Without such genetic evidence all statements claiming a causative link between the observed effects on endothelial parameters and Sec61 blockade should be removed or rephrased. The same applies to speculations on the role of Sec61 in epithelial migration defects in discussion. Data corresponding to Ipomoeassin F and ZIF-80 do not add important information, and may be removed or shown as supplemental information.  

      - While statistical analysis is done and P values are provided, no information is given on the statistical tests used, neither in methods nor results. This must be corrected, to evaluate the repeatability and reproducibility of their data.  

      We respectfully but fundamentally disagree with the comments regarding the Sec61 dependence of the effects that we observed. We showed that loss of glycocalyx and basement membrane components underpinned the phenotypic changes in endothelial cells (morphological changes, loss of adhesion, increased permeability, and reduced ability to repair scratch wounds). We demonstrated that we could phenocopy permeability increases and elongation phenotype by knocking down the type II membrane protein B3Galt6, and reverse the adhesion defect by exogenous provision of the secreted laminin-511 heterotrimer.

      Our conclusion that mycolactone mediates these effects via Sec61 inhibition is not based solely on the use of alternative inhibitors but is built on several pillars of evidence:  

      First, the proteomics data conform entirely to predictions based on the topology of affected vs. non-affected proteins, and agree with independently published proteomic datasets from T lymphocytes, dendritic cells and sensory neurons (ref.12), as well as biochemical studies performed using in vitro translocation assays (ref.11,34). Furthermore, the pattern of membrane protein downregulation observed in our experiments fits perfectly with established models of protein translocation mechanisms, particularly with respect to the lack of effect on specific topologies of multipass membrane proteins, tail-anchored and type III membrane proteins (ref.34-36).

      Second, since Sec61 is very highly conserved amongst mammals and is found in all nucleated cells, it is hard to conceptualise a framework in which mycolactone targets Sec61 in some cells and not others, as this reviewer suggests might be the case for epithelial cells [noting that the work being referred to (ref.29) predates our 2014 work showing that mycolactone is a Sec61 inhibitor (ref.7)]. Indeed, mycolactone has been shown to target Sec61 in multiple independent approaches including forward genetic screens involving random mutagenesis and CRISPR/Cas9 (ref.10, PMID: 35939511). Genetic evidence has previously been provided for the Sec61 dependence of mycolactone effects in epithelial cells (ref.10,17). We have unpublished genetic evidence that the rounding and detachment of epithelial cells due to mycolactone is reduced when resistance mutations are overexpressed, and will consider including this in the next version of the manuscript.

      Third, given this weight of evidence, one would be hard-pressed to provide an alternative explanation for the specific down-regulation of glycosaminoglycan-synthesising enzymes and adhesion/basement membrane molecules while most cytosolic and non-Sec61 dependent membrane proteins are unchanged or upregulated. However, seeking to be as rigorous as possible we have here shown that a completely independent Sec61 inhibitor produces the same phenotype at the gross and molecular level. Ipomoeassin F (Ipom-F) is a glycolipid, not a polyketide lactone, yet they both compete for binding with cotransin in Sec61α (ref.6). There is significant overlap in the cellular responses to mycolactone and Ipom-F, including the induction of the integrated stress response (ref.17, PMID: 34079010), which we observed again in the current data, providing further evidence that this approach is useful when genetic approaches are technically unattainable.  

      Therefore, we are confident the effects seen on endothelial cells are Sec61-dependent. We are happy to provide more detail on our lengthy attempts at over-expressing mycolactone-resistant SEC61A1 genes in HUVECs (primary endothelial cells derived from the umbilical vein). We are highly experienced in this area, and have previously stably expressed these proteins in epithelial cell lines, reproducing the resistance profile (ref.10,17). Notably though, these cells do not have normal ‘fitness’ in the absence of challenge. Since endothelial cells (and endothelial cell lines; PMID: 12560236) are extremely hard to transfect with plasmids, with efficiency routinely 5-10% (including in our hands), we developed a lentivirus system. We were eventually (after multiple attempts using different protocols) able to transduce primary HUVECs with constructs expressing GFP (at an efficiency of about 10-20%) and select/expand these under puromycin selection. Nevertheless, we never recovered any cells that expressed the flag-tagged SEC61A1 wild type or SEC61A1 carrying the resistance mutation D60G. We also attempted to select D60G-transduced cells with mycolactone epimers, an approach that can help the cells compete against non-transduced cells in culture flasks (ref.10). We concluded that primary endothelial cells are unable to tolerate the expression of additional Sec61α, and that this was incompatible with survival.  

      It’s also important to note that most endothelial cell specialists would agree that endothelial cell lines are not good models of endothelial behaviour. We tested the HMEC-1 cell line, but found it did not express prototypical endothelial marker vWF in the expected way. Therefore we focussed our efforts on primary endothelial cells. Should we be able to overcome the dual challenge of the necessity to work in primary cells, and the difficulty of over-expressing Sec61, we will update this paper at a later date with this data, and will also expand the above arguments.  

      We apologise for the embarrassing oversight of not including information about the statistical analyses we used, which of course we will correct in full in the revised version. However, we would like to provide this information to readers of the current version of the manuscript. All data were analysed using GraphPad Prism Version 9.4.1:

      Figure 1: one-way ANOVA with Dunnett’s (panel A) or Tukey’s (panel B) correction for multiple comparisons

      Figure 2 supplement: one-way ANOVA with Tukey’s correction for multiple comparisons (analysed panel)

      Figure 3: one-way ANOVA with Tukey’s (panel B) or Dunnett’s (panels E&F) correction for multiple comparisons

      Figure 4:  one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 5 and supplement:  one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 6:  one-way ANOVA with Dunnett’s correction for multiple comparisons (analysed panel)

      Figure 6 supplement: one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 7: two-way ANOVA with Tukey’s correction for multiple comparisons (all analysed panels; panels B&C also included the Geisser Greenhouse correction for sphericity)

      Figure 7 supplement: Panels A&D used a repeated measures one-way ANOVA with Dunnett’s correction for multiple comparisons (panel D also included the Geisser Greenhouse correction for sphericity). Panels B,C&E used a two-way ANOVA with Tukey’s correction for multiple comparisons (panels B&C also included the Geisser Greenhouse correction for sphericity)

      Reviewer #3 (Public Review):

      Buruli ulcer is a severe skin infection in humans that is caused by a bacterium, Mycobacterium ulcerans. The main clinical sign is a massive tissue necrosis subsequent to an edema stage. The main virulence factor called mycolactone is a polyketide with a lactone core and a long alkyl chain that is released within vesicles by the bacterium. Mycolactone was already shown to account for several disease phenotypes characteristic of Buruli ulcer, for instance tissue necrosis, host immune response modulation and local analgesia. A large number of cellular pathways in various cell types was reported to be impacted by mycolactone. Among those, the Sec61 translocon involved in the transport of certain proteins to the endoplasmic reticulum was first identified by the authors of the study and is currently the most consensual target. Mycolactone disruption of Sec61 function was then shown to directly impact on cell apoptosis in macrophages, limited immune responses by T-cells and increased autophagy in dermal endothelial cells and fibroblasts. In their manuscript, TzungHarn Hsieh and their collaborators investigated the Sec61-dependent role of mycolactone on morphology, adhesion and migration of primary human dermal microvascular endothelial cells (HDMEC). They used a combination of sugar and proteomic studies on a live image-based phenotypic assay on HDMEC to characterize the effect of mycolactone. First, they showed that upon incubation of monolayer of HDMEC with mycolactone at low dose (10 ng/mL) for 24h, the cells become elongated before rounding and eventually detached from the culture dish at 48h. Next, mycolactone was probed on a scratch assay and migration of the cells ceased upon a 24h incubation. The same effect as mycolactone on these two assays was observed for two other Sec61 inhibitors Ipomoeassin F and ZIF-80. Then, the authors resorted to the widely established mouse footpad model of M. ulcerans infection to evidence fibrinogen accumulation outside the blood vessel within the endothelium at 28 days postinfection, correlating with severe endothelial cell morphology changes.  

      To dissect the molecular pathways involved in these phenotypes, the authors performed an HDMEC membrane protein analysis and showed a decrease in the numbers of proteins involved in glycosylation and adhesion. As protein glycosylation mainly occurs in the Golgi apparatus, a deeper analysis revealed that enzymes involved in glycosaminoglycan (GAG) synthesis were lost in mycolactone treated HDMEC. A combination of immunofluorescence and flow cytometry approaches confirmed the impact of mycolactone on the ability of endothelial cells to synthesize GAG chains. The mycolactone effect on cell elongation was phenocopied by knock-down of galactosyltransferase II (B3Galt6) involved in GAG biosynthesis. A second extensive analysis of the endothelial basement membrane component and their ligands identified multiple laminins affected by mycolactone. Using similar functional studies as for GAG, the impact of mycolactone on cell rounding and migration could be reversed by the addition of laminin α5.  

      The major strength of the study lies in its combination of cleverly designed phenotypic assays with in-depth membrane proteomic studies and follow-up analyses.  

      The results really support the conclusions. Congratulations!  

      The discussion takes into account the current state of the art, which has mostly been established by the authors of the present manuscript.  

      Recommendations for the authors:

      In preparing this revised version we have made a number of general improvements:

      • We added the missing information on statistical analysis that was mentioned in the public review of reviewer #2

      • We have changed all gene names to the HUGO nomenclature

      • We have changed our abbreviation of mycolactone from “MYC” to “Myco” in all figures to avoid any potential confusion with other protein factors

      • We have moved the fibrin(ogen) staining of the mouse footpads to its own figure (now Fig 2), partly due to the inclusion of additional data in Fig 1. This has changed the numbering of subsequent figures, but has also made the supplementary figures easier to track.

      Reviewer #1 (Recommendations For The Authors):  

      (1) Figure 1I. When mice are injected with M. ulcerans, a measurement of local blood flow would be very informative in addition to the data shown. Cutaneous blood flow at the level of the feet can be measured using laser Doppler or laser speckle imaging. With these measurements the authors would have a functional quantification of the effect of the glycosaminoglycan- and Sec61α-associated damage on the microcirculatory blood flow. The same measurement could also better validate the therapeutic effect of laminin. 

      We thank the reviewer for this great suggestion, and respectfully remind the reviewer that these experiments take place in CL3 containment. This often completely precludes certain procedures due to the availability of equipment inside the containment, and our ability to sterilise it. Where we are able to perform procedures, it greatly increases their complexity since any procedures on live animals must take place inside of a cabinet. Therefore, we can only use equipment that we have at our animal facility. It is not trivial to set up the regulatory permissions to perform these experiments at other facilities where more specialist equipment is located due to the containment restrictions. 

      Nevertheless, we have attempted to perform ultrasound imaging of mouse feet using the VivoF and have set up a collaboration with other researchers at Surrey who have developed a novel imaging instrument to measure microvascular circulation called optical coherence tomography (OCT; https://pubmed.ncbi.nlm.nih.gov/34882760/), and we are working with them to develop a protocol that can be used in small rodents.  

      However, while we have dedicated considerable time to trying to perform the suggested experiment, we have not been successful within a reasonable time frame. Consequently, if we are able to establish this technique in the M. ulcerans infection model, and/or OCT in small rodents, this will likely be beyond the scope of the current manuscript and will be a publication in its own right. We note that we have been able to perform almost all of the other requested experiments (see below), and have also been able to undertake transmission electron microscopy of M. ulcerans infected mouse footpads, which confirms the loss of the basement membrane at high resolution (Fig 7E).

      (2) Figure 1 -D. Endothelial cells were exposed to mycolactone, Ipomoeassin F or ZIF-80. The effect on the cells is clear and impressive. Nevertheless, endothelial cells in no-flow conditions are considered "diseased" cells, as areas of low or no flow are prone to atherosclerosis in vivo. Would the authors expect similar effects in cells submitted to flow? In these conditions cells would already be elongated in the direction of flow. 

      We agree that flow is usually experienced by endothelial cells in vivo, and have repeated a selection of our experiments under conditions that mimic flow and produce uniaxial shear stress. All showed a similar pattern of response to mycolactone, including the phenotypic changes (Fig 1I-K), loss of perlecan (Fig S6C) and laminin α4 (Fig S7B). It is true that the elongation phenotype is not as striking in a cell monolayer that already contains many elongated cells, but qualitatively the cells became disorganised and, at 48 hours, their length/width ratio had increased. These results provide reassurance that our findings are physiologically relevant.

      (3) Discuss the possible consequences of your findings on vascular reactivity and especially on flow-mediated dilation and/or flow-mediated remodeling, both of which are important in tissue repair and wound healing. 

      We agree with this reviewer that there are likely to be broad consequences to endothelial and vascular function as a result of our findings here. Vascular reactivity is not something we directly considered in this manuscript, and is probably better linked to our planned future work, laid out above, regarding vascular flow in the infected animals. While a key mediator of vascular tone, endothelin 1, is a Sec61-dependent secreted peptide mediator (and is likely to also be affected by mycolactone’s actions), this was not one of the >6500 proteins we identified in our proteomic study. On the other hand, it has been shown by others that mycolactone can induce NO production in other types of cells.

      Reviewer #2 (Recommendations For The Authors):  

      - The authors use a mouse model of M. ulcerans infection of footpads to assess the in vivo relevance of their results. It would be useful to comment on any differences between human and mouse with regard to endothelial cell biology and vessel wall architecture. Since the authors have access to patient samples, parallel stainings in human lesions would have strengthened the study. 

      This is an important issue, and is one we have already addressed in our two previous articles https://pubmed.ncbi.nlm.nih.gov/35100311/ https://pubmed.ncbi.nlm.nih.gov/26181660/ . Indeed, this latter work already included a detailed analysis of fibrin staining in these Buruli ulcer patient biopsies and underpinned the hypothesis that we have now tested in the current manuscript. 

      It is worth noting that our data support the view that the critical step is at an early (pre-clinical) stage, for which patient samples are not available. The proposed human challenge model (https://pubmed.ncbi.nlm.nih.gov/37384606/ ) may well provide a suitable platform for such studies in the future.

      - The authors should provide in the Discussion some explanation for the differential effects of Laminin-11, -411 and -511 in Fig. 7 

      This is an interesting point, and probably related to the expression of laminin binding proteins by mycolactone-exposed endothelial cells. We pursued several candidates based on the proteomic data but could not identify a unique gene that explained this observation. Most likely, the differential effects are explained by partial (be it low or high) loss of a combination of integrin binding proteins. Since this was rather inconclusive, we preferred not to present these data, and had already said (p34-35) “We have not been able to ascribe this to the retention of a specific adhesion molecule, and instead postulate that rescue could be via residual expression of a wide variety of laminin α5 receptors”.

      - The word "catastrophic" in the title is very dramatic given the limited impact on the vital prognosis of patients 

      This word has been changed to “destructive”

      Reviewer #3 (Recommendations For The Authors):  

      Several points could be further discussed:  

      -In the mouse model of M. ulcerans infection, animals heal spontaneously in 5% of cases. How could the authors' results contribute to hypotheses explaining this phenomenon? 

      Others have shown that the ability of some mice to control M. ulcerans infection is related to loss of mycolactone production by an unknown mechanism. It is not something we have ever observed in the infection experiments we have performed, although this may be due to the humane endpoints of our licence. However, this seems somewhat outside the main focus of the paper and we have not discussed this further.  

      -Mycolactone was also reported to induce analgesia in the mouse model. There is still controversy about the precise mechanisms involved in this mycolactone mediated painless effect. Could the data obtained here help to resolve the controversy? 

      We agree that analgesia in M. ulcerans infection (both in mouse models and in clinical infections) is an extremely interesting area. However, we cannot mechanistically link loss of vascular integrity with the analgesia based on the data generated in the current manuscript. Therefore we prefer not to speculate on this.

      The quantification of the microscopy images and videos should be provided as well as the script used to quantify them. 

      The reviewer is not specific about which microscopy images are being referred to in this comment, but the reference to videos leads us to assume this is related to the ZenCell OWL images/videos presented in Figure 1 and Figure S1. We had already provided quantification of these in the graphs provided, and the algorithms used for % coverage and % detached cells are part of the proprietary software of the instrument used to gather the data, the ZenCell OWL. Other counts were made manually, and the length:width ratio is simple arithmetic as already described in the methodology.

      The authors performed their work using chemically synthesized mycolactone obtained from the very generous Professor Kishi (Harvard University). Would the same phenotype and proteomics analysis be obtained with biologically purified mycolactone? 

      Our lab has extensive experience of both biologically purified and synthetic mycolactone, and the phenotypes observed have always been identical when using the chemically synthesised form. Therefore we did not repeat the proteomics experiments as we do not believe it would provide any greater insight into the disease mechanism. However, we have now replicated a range of findings using mycolactone biologically purified from M. ulcerans. In particular, we confirmed that the cytotoxic activities of synthetic and biological mycolactone are indistinguishable (Figure S1A), as are the main phenotypic changes induced by mycolactone in endothelial cells (Phenotypes; Figures S1D-F, B3GALT6/perlecan/laminin α5 loss; S5A, S6B, S7A).

      Although already very comprehensive, a kinetic study of their proteomic analysis over time could strengthen the analysis (from 2H to 48H). 

      We agree that more data is always better, but since we validated our proteomic data set over multiple timepoints between 2 and 48 hrs, we do not believe this would alter the main conclusions of our work.   

      The siRNA transfection protocol could be better described. A Table listing all the reagents would help the reader.  

      A more detailed siRNA transfection protocol has been added to the methods section, and we now include a Key Resources Table at the start of the Materials & Methods section.

    1. Author response:

      Reviewer #1:

      Summary:

      The investigators undertook detailed characterization of a previously proposed membrane targeting sequence (MTS), a short N-terminal peptide, of the bactofilin BacA in Caulobacter crescentus. Using light microscopy, single molecule tracking, liposome binding assays, and molecular dynamics simulations, they provide data to suggest that this sequence indeed does function in membrane targeting and further conclude that membrane targeting is required for polymerization. While the membrane association data are reasonably convincing, there are no direct assays to assess polymerization and some assays used lack proper controls as detailed below. Since the MTS isn't required for bactofilin polymerization in other bacterial homologues, showing that membrane binding facilitates polymerization would be a significant advance for the field

      We thank Reviewer #1 for the constructive criticism and will address the points detailed below in a revised version of the manuscript.

      Major concerns

      (1) This work claims that the N-terminal MTS domain of BacA is required for polymerization, but they do not provide sufficient evidence that the ∆2-8 mutant or any of the other MTS variants actually do not polymerize (or form higher order structures). Bactofilins are known to form filaments, bundles of filaments, and lattice sheets in vitro and bundles of filaments have been observed in cells. Whether puncta or diffuse labeling represents different polymerized states or filaments vs. monomers has not been established. Microscopy shows mis-localization away from the stalk, but resolution is limited. Further experiments using higher resolution microscopy and TEM of purified protein would prove that the MTS is required for polymerization.

      We do not propose that the MTS is directly involved in the polymerization process, and preliminary transmission electron microscopy (TEM) data show that variants lacking the MTS or carrying amino acid exchanges in the MTS still form polymers when highly overproduced in E. coli and then purified from cell lysates by affinity chromatography. This finding is consistent with the results of previous studies and in line with the finding that bactofilin polymerization is exclusively mediated by the conserved bactofilin domain (Deng et al, Nat Microbiol, 2019). However, under native expression conditions, bactofilin levels are often relatively low, with only a few hundred molecules of BacA measured per cell in C. crescentus (Kühn et al, EMBO J, 2006). Our data indicate that, under this condition, the concentration of BacA on the 2D surface of the cytoplasmic membrane and, potentially, steric constraints induced by membrane curvature, may be required to facilitate its efficient assembly into functional polymeric complexes. We will provide TEM images of purified proteins in a revised version of our manuscript and explain this model in more detail in the Discussion.

      In the case of polymer-forming proteins, defined localized signals are typically interpreted as polymeric complexes. An even distribution of the fluorescence signals, by contrast, indicates that the proteins form monomers or, at most, small oligomers that diffuse rapidly within the cell and are thus no longer detected as a stationary focus by widefield microscopy. Our single-molecule data also indicate that proteins that are no longer able to interact with the membrane (as verified by cell fractionation studies and in vitro liposome binding assays) show a high diffusion rate, similar to that measured for the non-polymerizing and non-membrane-bound F130R variant. These results indicate that a loss of membrane binding strongly reduces the ability of BacA to form polymeric assemblies. To support this hypothesis, we will perform additional single-molecule tracking analyses of freely diffusible and membrane-bound monomeric fluorescent proteins for comparison.

      (2) Liposome binding data would be strengthened with TEM images to show BacA binding to liposomes. From this experiment, gross polymerization structures of MTS variants could also be characterized.

      We do not have the possibility to perform cryo-electron microscopy studies of liposomes bound to BacA. However, the results of the cell fractionation and liposome sedimentation assays clearly support a critical role of the MTS in membrane binding.

      (3) The use of the BacA F130R mutant throughout the study to probe the effect of polymerization on membrane binding is concerning as there is no evidence showing that this variant cannot polymerize. Looking through the papers the authors referenced, there was no evidence of an identical mutation in BacA that was shown to be depolymerized or any discussion in this study of how the F130R mutation might be analogous to polymerization-deficient variants in other bactofilins mentioned in these references.

      Residue F130 in the C-terminal polymerization interface of BacA is highly conserved among bactofilin homologs, although its absolute position in the protein sequence may vary, depending on the length of the N-terminal unstructured tail. The papers cited in our manuscript show that an exchange of this conserved phenylalanine residue abolishes polymer formation. We will make this fact clearer in the revised version of the manuscript. Moreover, we will provide gel filtration and transmission electron microscopy data showing that the BacA-F130R variant no longer forms polymers.

      (4) Microscopy shows that a BacA variant lacking the native MTS regains the ability to form puncta, albeit mis-localized, in the cell when fused to a heterologous MTS from MreB. While this swap suggests a link between puncta formation and membrane binding the relationship between puncta and polymerization has not been established (see comment 1).

      We show that a BacA variant lacking the MTS regains the ability to form membrane-associated foci when fused to the MTS of MreB. In contrast, a similar variant that additionally carries the F130R exchange (preventing its polymerization) shows a diffuse cytoplasmic localization. In addition, we show that the F130R exchange leads to a loss of membrane binding and to a considerable increase in the mobility of the variants carrying the MreB MTS. Together, these results strongly support the hypothesis that membrane binding and polymerization act synergistically to establish localized bactofilin assemblies.

      (5) The authors provide no primary data for single molecule tracking. There is no tracking mapped onto microscopy images to show membrane localization or lack of localization in MTS deletion/ variants. A known soluble protein (e.g. unfused mVenus) and a known membrane bound protein would serve as valuable controls to interpret the data presented. It also is unclear why the authors chose to report molecular dynamics as mean squared displacement rather than mean squared displacement per unit time, and the number of localizations is not indicated. Extrapolating from the graph in figure 4 D for example, it looks like WT BacA-mVenus would have a mobility of 0.5 (0.02/0.04) micrometers squared per second which is approaching diffusive behavior. Further justification/details of their analysis method is needed. It's also not clear how one should interpret the finding that several of the double point mutants show higher displacement than deleting the entire MTS. These experiments as they stand don't account for any other cause of molecular behavior change and assume that a decrease in movement is synonymous with membrane binding.

      We agree that a more in-depth analysis of the single-molecule-tracking data would be helpful to support our conclusions.  We will map the reads on the cells, although the loss of membrane localization of BacA variants with a defective MTS is already obvious in the widefield fluorescence images. Moreover, we will perform additional measurements on soluble mVenus and a membrane-associated variant of mVenus for comparison and address the other issues raised here.

      The single-molecule tracking data alone are certainly not sufficient to draw firm conclusions on the relationship between membrane binding and protein mobility. However, our other in vivo and in vitro analyses indicate a very clear correlation between the mobility of BacA and its ability to interact with the membrane and polymerize (processes that synergistically promote each other).

      (6) The experiments that map the interaction surface between the N-terminal unstructured region of PbpC and a specific part of the BacA bactofilin domain seem distinct from the main focus of the paper and the data somewhat preliminary. While the PbpC side has been probed by orthogonal approaches (mutation with localization in cells and affinity in vitro), the BacA region side has only been suggested by the deuterium exchange experiment and needs some kind of validation

      The results of the HDX analysis per se are not preliminary and clearly indicate a change in the accessibility of surface-exposed residues in the central bactofilin domain. We agree, however, that additional experiments would be required to verify the binding site suggested by these data, and that this aspect is not the main focus of the paper. We included the analysis of the interaction between PbpC and BacA because we see effects of membrane binding/polymerization on the BacA-PbpC interaction and thus on the physiological function of BacA in C. crescentus.

      Reviewer #2:

      Summary:

      The authors of this study investigated the membrane-binding properties of bactofilin A from Caulobacter crescentus, a classic model organism for bacterial cell biology. BacA was the progenitor of a family of cytoskeletal proteins that have been identified as ubiquitous structural components in bacteria, performing a range of cell biological functions. Association with the cell membrane is a common property of the bactofilins studied and is thought to be important for functionality. However, almost all bactofilins lack a transmembrane domain. While membrane association has been attributed to the unstructured N-terminus, experimental evidence had yet to be provided. As a result, the mode of membrane association and the underlying molecular mechanics remained elusive.

      Liu et al. analyze the membrane binding properties of BacA in detail and scrutinize molecular interactions using in-vivo, in-vitro and in-silico techniques. They show that few N-terminal amino acids are important for membrane association or proper localization and suggest that membrane association promotes polymerization. Bioinformatic analyses revealed conserved lineage-specific N-terminal motifs indicating a conserved role in protein localization. Using HDX analysis they also identify a potential interaction site with PbpC, a morphogenic cell wall synthase implicated in Caulobacter stalk synthesis. Complementing this, they pinpoint the bactofilin-interacting region within the PbpC C-terminus, known to interact with bactofilin. They further show that BacA localization is independent of PbpC.

      Strengths

      These data significantly advance the understanding of the membrane binding determinants of bactofilins and thus their function at the molecular level. The major strength of the comprehensive study is the combination of complementary in vivo, in vitro and bioinformatic/simulation approaches, the results of which are consistent.

      We thank Reviewer #2 for the positive evaluation of our paper and for the constructive criticism sent to us in the non-public review. We will address the points raised in a revised version of the manuscript.

      Weaknesses:

      The results are limited to protein localization and interaction, as there is no data on phenotypic effects. Therefore, the cell biological significance remains somewhat underrepresented.

      We agree that it would be interesting to investigate the phenotypic effects caused by a defect of BacA in membrane binding. We will investigate PbpC localization and stalk length in phosphate-limited medium for mutants producing MTS-deficient BacA variants and include these data in the revised version of the manuscript. However, we would like to point out that the relevance of our findings goes beyond the C. crescentus system, because the MTS and its role for bactofilin function is likely to be conserved in many other species.

    1. Author response:

      We thank the reviewers for their valuable comments. Our revision will address their recommendations and clarify any misconceptions. The main points we plan to amend are as follows:

      Direct comparison of pRF sizes

      We may have misunderstood this comment in the eLife assessment. We believe our original analyses and the figures already provided a “direct comparison between pRF sizes in the high-adapted and low-adapted conditions”. Specifically, we included a figure showing the histograms of pRF sizes in both conditions, and also reported statistical tests to compare conditions both within each participant and across the group. However, we now realize these comparisons might not be as clear to readers as we intended, which would explain Reviewer #2’s interpretations. To clarify, in our revised version we will instead show 2D plots comparing pRF sizes between conditions as suggested by Reviewer #2, and also show the pRF size plotted against eccentricity (rather than only the difference) as suggested by Reviewer #3.

      Data sharing 

      The behavioral data, fMRI data (where ethically permissible), stimulus-generation code, statistical analyses, and fMRI stimulus video are already publicly available at the link: https://osf.io/9kfgx/. However, we unfortunately failed to include the link in the preprint. We apologize for this oversight. It will be included in the revision. The repository now also contains a script for simulated adaptation effects on pRF size used in our response to Reviewer #2. Moreover, for transparency, we will include plots of all the pRF parameter maps for all participants, including pRF size, polar angle, eccentricity, normalized R2, and raw R2.

      Sample size

      The reviewers shared concerns about the sample size of our study. We disagree that this is a weakness of our study. It is important to note that large sample sizes are not necessary to obtain conclusive results, especially when the research aims to test whether an effect exists, rather than finding out how strong the effect is on average in a population (Schwarzkopf & Huang, 2024, currently out as preprint, but in press at Psychological Methods). Our results showed robust within-subject effects, consistent across multiple visual regions in most individual participants. A larger sample size would not necessarily improve the reliability of our findings. Treating each individual as an independent replication, our results suggest a high probability that they would replicate in each additional participant we could scan. 

      Reviewer #1:

      We thank the reviewer for their careful evaluation and positive comments. We will include a more detailed discussion about the issues pointed out, and an additional plot showing the polar angle for both adapter conditions. In line with previous work on the reliability of pRF estimates (van Dijk, de Haas, Moutsiana, & Schwarzkopf, 2016; Senden, Reithler, Gijsen, & Goebel, 2014), both polar angle and eccentricity maps are very stable between the two adaptation conditions.

      Reviewer #2:

      We thank the reviewer for their comments - we will improve how we report key findings which we hope will clarify matters raised by the reviewer.

      RF positions in a voxel

      The reviewer’s comments suggest that they may have misunderstood the diagram (Figure 1A) illustrating the theoretical basis of the adaptation effect, likely due to us inadvertently putting the small RFs in the middle of the illustration. We will change this figure to avoid such confusion.

      Theoretical explanation of adaptation effect

      The reviewer’s explanation for how adaptation should affect the size of pRF averaging across individual RFs is incorrect. When selecting RFs from a fixed range of semi-uniformly distributed positions (as in an fMRI voxel), the average position of RFs (corresponding to pRF position) is naturally near the center of this range. The average size (corresponding to pRF size) reflects the visual field coverage of these individual RFs. This aggregate visual field coverage thus also reflects the individual sizes. When large RFs have been adapted out, this means the visual field coverage at the boundaries is sparser, and the aggregate pRF is therefore smaller. The opposite happens when adapting out the contribution of small RFs. We demonstrate this with a simple simulation at this OSF link: https://osf.io/ebnky/.
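
      The argument above can be illustrated with a minimal sketch. This is not the simulation shared at the OSF link; it is a simplified 1D version under stated assumptions: RFs are Gaussians with centres spread uniformly over a fixed range, the aggregate pRF is treated as an equal-weight mixture of those Gaussians, and adaptation is modelled as complete removal of one RF class. The function name, RF widths, and range are hypothetical.

```python
import random
import statistics


def aggregate_prf_size(rfs):
    """Standard deviation of an equal-weight mixture of 1D Gaussian RFs.

    For centres mu_i and widths sigma_i, the mixture variance is
    var(mu) + mean(sigma^2), so it reflects both the spread of RF
    positions and the individual RF sizes.
    """
    centres = [mu for mu, _ in rfs]
    mean_sq_width = statistics.fmean([sigma ** 2 for _, sigma in rfs])
    return (statistics.pvariance(centres) + mean_sq_width) ** 0.5


random.seed(0)
# Hypothetical voxel: centres uniform over a 2-degree range,
# half small RFs (sigma = 0.5 deg) and half large RFs (sigma = 2.0 deg).
rfs = [(random.uniform(-1.0, 1.0), 0.5 if i % 2 == 0 else 2.0)
       for i in range(1000)]

baseline = aggregate_prf_size(rfs)
# Assumption: adaptation removes the adapted RF class entirely.
large_adapted_out = aggregate_prf_size([rf for rf in rfs if rf[1] < 1.0])
small_adapted_out = aggregate_prf_size([rf for rf in rfs if rf[1] > 1.0])
```

      Under these assumptions the average RF position stays near the centre of the range, while the aggregate size shrinks when large RFs are removed and grows when small RFs are removed.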

      Figure S2 

      It is not actually possible to compare R2 between regions by looking at Figure S2 because it shows the pRF size change, not R2. Therefore, the arguments Reviewer #2 made based on their interpretation of the figure are not valid. Just as the reviewer expected, V1 is one of the brain regions with good pRF model fits. In our revision, we will include normalized and raw R2 maps to make this more obvious to the readers and provide additional explanations.

      V1 appeared essentially empty in that plot primarily due to the sigma threshold we selected, which was unintentionally more conservative than those applied in our analyses and other figures. We apologize for this mistake and will correct it in the revised version by including a plot with the appropriate sigma threshold.

      Thresholding details 

      Thresholding information was included in our original manuscript; however, we will include more information in the figure captions to make it more obvious.

      2D plots will replace histograms

      We thank the reviewer for this suggestion. The manuscript contained histograms showing the distribution of pRF size for both adaptation conditions for each participant and visual area (Figure S1). However, we agree that 2D plots better communicate the difference in pRF parameters between conditions, so we will replace this figure. We will consider 2D kernel density plots as suggested by the reviewer; however, such plots can obscure distributional anomalies so they may not be the optimal choice and we may opt to show transparent scatter plots of individual pRFs instead.

      (proportional) pRF size-change map 

      The reviewer requests pRF size difference maps. Figure S2 in fact demonstrates the proportional difference between the pRF sizes of the two adaptation conditions. Instead of simply taking the difference, we believe showing the proportional change map is more sensible because overall pRF size varies considerably between visual regions. We will explain this more clearly in our revision. 
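
      A small sketch of why a proportional measure may be preferable, assuming the change is normalized by the pRF size in the reference condition (the exact normalization is an assumption here, and the sizes below are hypothetical values, not data from the study):

```python
def proportional_change(size_a, size_b):
    """Change of size_a relative to size_b (assumed normalization)."""
    return (size_a - size_b) / size_b


# Hypothetical pRF sizes in degrees of visual angle.
small_prf_region = proportional_change(1.2, 1.0)  # absolute difference 0.2 deg
large_prf_region = proportional_change(4.2, 4.0)  # same absolute difference
```

      The same absolute difference of 0.2 deg corresponds to a 20% change where pRFs are small but only a 5% change where pRFs are large, which a raw difference map would not distinguish.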

      pRF eccentricity plot 

      “I suspect that the difference in PRF size across voxels correlates very strongly with the difference in eccentricity across voxels.”

      Our manuscript already contains a supplementary plot (Figure S4 B) comparing the eccentricity between adapter conditions, showing no notable shift in eccentricities except in V3A - but that is a small region and the results are generally more variable. We will comment more on this finding in the main text and explain this figure in more detail. 

      To the reviewer’s point, even if there were an appreciable shift in eccentricity between conditions (as they suggest may have happened for the example participant we showed), this does not mean that the pRF size effect is “due [...] to shifts in eccentricity.” Parameters in a complex multi-dimensional model like the pRF are not independent. There is no way of knowing whether a change in one parameter is causally linked with a change in another. We can only report the parameter estimates the model produces. 

      In fact, it is conceivable that adaptation causes changes in both pRF size and eccentricity. If more central or peripheral RFs tend to be smaller or larger, respectively, then adapting out one part of the distribution will shift the average accordingly. However, as we already established, we find no compelling evidence that pRF eccentricity changes dramatically due to adaptation, while pRF size does. We will illustrate this using the 2D plots in our revision.

      Reviewer #3:

      We thank the reviewer for their comments.

      pRF model

      Top-up adapters were not modelled in our analyses because they are shared events in all TRs, critically also including the “blank” periods, providing a constant source of signal. Therefore modelling them separately cannot meaningfully change the results. However, the reviewer makes a good suggestion that it would be useful to mention this in the manuscript, so we will add a discussion of this point.

      pRF size vs eccentricity

      We will add a plot showing pRF size in the two adaptation conditions (in addition to the pRF size difference) as a function of eccentricity.

      Correlation with behavioral effect

      In the original manuscript, we pointed out why the correlation between the magnitude of the behavioral effect and the pRF size change is not an appropriate test for our data. First, the reviewer is right that a larger sample size would be needed to reliably detect such a between-subject correlation. More importantly, as per our recruitment criteria for the fMRI experiment, we did not scan participants showing weak perceptual effects. This limits the variability in the perceptual effect and makes correlation inapplicable.

      References

      van Dijk, J. A., de Haas, B., Moutsiana, C., & Schwarzkopf, D. S. (2016). Intersession reliability of population receptive field estimates. NeuroImage, 143, 293–303. https://doi.org/10.1016/J.NEUROIMAGE.2016.09.013

      Schwarzkopf, D. S., & Huang, Z. (2024). A simple statistical framework for small sample studies. BioRxiv, 2023.09.19.558509. https://doi.org/10.1101/2023.09.19.558509

      Senden, M., Reithler, J., Gijsen, S., & Goebel, R. (2014). Evaluating population receptive field estimation frameworks in terms of robustness and reproducibility. PloS One, 9(12). https://doi.org/10.1371/JOURNAL.PONE.0114054

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #4

      We sincerely appreciate the time and effort you have taken to review our manuscript. We followed your recommendations to polish the text and make it easier to understand.

      Regarding terms and terminology, we changed “non-breeding” everywhere in the text to “over-wintering.”

      Regarding the title, which was recommended by reviewer #1, we tried to find a compromise: we made the changes you suggested but retained part of reviewer #1's suggestion. The title is now “Foxtrot migration and dynamic over-wintering range of an arctic raptor”.

      Thank you for highlighting the importance of snow cover and changes in snow cover as a possible factor of over-wintering movements. We appreciate your feedback and have explored several approaches to address this issue. Specifically, we examined how both snow cover extent and changes in snow cover influenced movement distance. However, we found no effect of either factor on movement distance.

      Our data show that birds leave their sites in October and move southwest, even though snow cover is minimal at that time. They also leave their sites in November and in subsequent months, regardless of the snow cover levels. Thus, we observed no pattern of birds leaving sites when snow cover reaches a specific threshold (e.g., 75-80%). Similarly, we found no evidence of birds staying in areas with a certain snow cover extent (e.g., 30%), nor did they leave sites when snow cover increased by a specific amount (e.g., by 10 or 20%).

      It is possible that more experienced birds anticipate that October plots will become inaccessible later in the winter and, therefore, leave early without waiting for significant snow accumulation. Alternatively, other factors, such as brief heavy snowfalls, may trigger movement, even if these do not lead to sustained increases in snow cover. Multiple factors, possibly acting asynchronously, could also play a role. This complexity adds an interesting dimension to the study of ecological patterns. However, in this study, we chose to focus on describing the migration pattern itself and its impact on aspects like over-winter range determination and population dynamics. While we have prioritized this approach, we remain committed to further analyzing the data to uncover additional details about this behavior.

      In response to your suggestion, we have expanded the Methods section (Lines 241-246) and the Results section (Lines 348-349) to clarify that we tested the effects of snow cover and changes in snow cover on distance. We have also included the relevant plots in the Supplementary Materials. In the Discussion, we noted that this approach did not reveal any significant dependence and acknowledged that this issue requires further investigation (Lines 422-459).

      ---------

      The following is the authors’ response to the previous reviews.

      Reviewer #2:

      We sincerely appreciate the time and effort you have taken to review our manuscript. 

      First of all, we apologize for publishing the preprint without incorporating certain adjustments outlined in our earlier response, particularly in the Methods section. This was due to an oversight regarding the different versions of the manuscript. We have corrected this mistake. Our response to the feedback on this section (Methods), with line numbers of the changes made, is immediately below this response. In addition, we have included the units of measurement (mean and standard deviation) in both the results and figure captions for clarity.

      To focus on the main point regarding wintering strategies, we acknowledge that in the previous versions, this aspect was inadequately addressed and caused some confusion. In the revised edition, both the Introduction and the Discussion have been thoroughly reworked.

      As you suggested, we have removed the long introductory paragraph and all references to foxtrot migrations from the Introduction. As a result, the Introduction is now short and to the point. In the second paragraph, we explain why we propose the wintering strategies outlined (L74-81).

      In the Discussion, we've added a substantial new section at the beginning that discusses different wintering strategies. We have also updated Figure 4 accordingly. Previously, we erroneously suggested that Montagu's harrier and other African-Palaearctic migrants might adopt wintering strategies similar to those we describe. Upon further investigation, however, we found that almost all African-Palaearctic migrants exhibit an itinerant wintering strategy. Conversely, the strategy we describe is primarily observed in mid-latitude wintering species.

      We have shown that, unlike itinerancy, the birds in our study don't pause for 1-2 months at multiple non-breeding sites, but instead migrate significant distances, up to 1000 km, throughout the winter. Furthermore, unlike itinerancy, the sites they reach are consistently snow-free throughout the year. Following the logic of publications on Montagu's harriers (Schlaich et al. 2023), our birds do not wait for favorable conditions at the next site, as is typical of itinerancy. Moreover, this behavior is influenced by external factors such as snow cover dynamics and occurs primarily in mid-latitudes. Researchers studying a species similar to our subject, the Common buzzard, observed a similar pattern and termed it "prolonged autumn migration" rather than itinerancy. Although their transmitters stopped working in mid-winter, precluding a full observation of the annual cycle, they captured the essence of continued migration at a slower pace, distinct from itinerancy. We've detailed all of these findings in a new section.

      In addition, we acknowledge the mischaracterization of the implications of our research as ‘Conservation implications’ and have corrected this to ‘Mapping ranges and assessing population trends’, as you suggested.

      Finally, we've rewritten the Conclusion, removing overly grandiose statements and simply summarizing the main findings.

      We appreciate your time and effort in reviewing our manuscript. With your invaluable input, it has become clearer, more concise, and easier to understand.

      Dataset: unclear what is the frequency of GPS transmissions. Furthermore, information on relative tag mass for the tracked individuals should be reported.

      We have included this information in our manuscript (L 115-122). We also refer to the study in which this dataset was first used and described in detail (L 123).

      Data pre-processing: more details are needed here. What data have been removed if the bird died? The entire track of the individual? Only the data classified in the last section of the track? The section also reports on an 'iterative procedure' for annotating tracks, which is only vaguely described. A piecewise regression is mentioned, but no details are provided, not even on what is the dependent variable (I assume it should be latitude?).

      Regarding the deaths, we only removed the data when the bird was already dead. We estimated the date of death and excluded tracking data corresponding to the period after the bird's death. We have corrected the text to make this clear (L 130-131).

      Regarding the piecewise regression. We have added a detailed description on lines 136-148.

      Data analysis: several potential issues here:

      (1) Unclear why sex was not included in all mixed models. I think it should be included.

      Our dataset contains 35 females and eight males (L116). This ratio does not allow us to include sex in all models and adequately assess the influence of this factor. At the same time, because adult females disperse farther than males in some raptor species, we conducted a separate analysis of the dependence of migration distance on sex (Table S8) and found no evidence for this in our species. We describe this in the Methods (L177-181) and then in the Results (L277-278).

      (2) Unclear what is the rationale of describing habitat use during migration; is it only to show that it is a largely unsuitable habitat for the species? But is a formal analysis required then? Wouldn't be enough to simply describe this?

      Habitat use and snow cover determine the two main phases (quick and slow) of the pattern we describe. We believe that habitat analysis is appropriate in this case, and a simple description would be uninformative and not support our conclusions.

      (3) Analysis of snow cover: such a 'what if' analysis is fine but it seems to be a rather indirect assessment of the effect of snow cover on movement patterns. Can a more direct test be envisaged relating e.g. daily movement patterns to concomitant snow cover? This should be rather straightforward. The effectiveness of this method rests on among-year differences in snow cover and timing of snowfall. A further possibility would be to demonstrate habitat selection within the entire non-breeding home range of an individual in relation to snow cover. Such an analysis would imply associating presence-absence of snow to every location within the non-breeding range and testing whether the proportion of locations with snow is lower than the proportion of snow of random locations within the entire non-breeding home range (95% KDE) for every individual (e.g. by setting a 1/10 ratio presence to random locations).

      The proposed analysis will provide an opportunity to assess whether the Rough-legged buzzard selects areas with the lowest snow cover, but will not provide an opportunity to follow the dynamics and will therefore give a misleading overall picture. This is especially true in the spring months. In March-April, Rough-legged buzzards move northeast and are in an area that is not the most snow-free. At this time, areas to the southwest have less snow cover (this can be seen in Figure 3b). If we perform the proposed analysis, the control points for this period would be both to the north (where there is more snow) and to the south (where there is less snow) of the real locations, and the result would be that there is no difference in snow cover. 

      A step-selection analysis could be used, as we did in our previous work (Curk et al 2020 Sci Rep) with the same Rough-legged buzzards (but during migration, not winter). But this would only give us a qualitative idea, not a quantitative one: that Rough-legged buzzards move away from snow (in the fall) and follow snowmelt progression (in the spring). 

      At the same time, our analysis gives a complete picture of snow cover dynamics in different parts of the non-breeding range. This allows us to see that if Rough-legged buzzards remained at their fall migration endpoint without moving southwest, they would encounter 14.4% more snow cover (99.5% vs. 85.1%). Although this difference may seem small (14.4%), it holds significance for rodent-hunting birds, distinguishing between complete and patchy snow cover.

      Simultaneously, if Rough-legged buzzards immediately flew to the southwest and stayed there throughout winter, they would experience 25.7% less snow cover (57.3% vs. 31.6%). Despite a greater difference than in the first case, it doesn't compel them to adopt this strategy, as it represents the difference between various degrees of landscape openness from snow cover.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In an era of increasing antibiotic resistance, there is a pressing need for the development of novel sustainable therapies to tackle problematic pathogens. In this study, the authors hypothesize that pyoverdines - metal-chelating compounds produced by fluorescent pseudomonads - can act as antibacterials by locking away iron, thereby arresting pathogen growth. Using biochemical, growth, and virulence assays on 12 opportunistic pathogen strains, the authors demonstrate that pyoverdines induce iron starvation, but this effect was highly context-dependent. This same effect has been demonstrated for plant pathogens, but not for human opportunistic pathogens exposed to natural siderophores. Only those pathogens lacking (1) a matching receptor to take up pyoverdine-bound iron and/or (2) the ability to produce strong iron chelators themselves experienced strong growth arrest. This would suggest that pyoverdines might not be effective against all pathogens, thereby potentially limiting the utility of pyoverdines as global antibacterials.

      Strengths:

      The work addresses an important and timely question - can pyoverdines be used as an alternative strategy to deal with opportunistic pathogens? In general, the work is well conducted with rigorous biochemical, growth, and virulence assays. The work is clearly written and the findings are supported by high-quality figures.

      Weaknesses:

      I do not think there are any 'weaknesses' as such. However, it is well known that siderophore production is highly plastic, typically being upregulated in response to metal limitation (as well as toxic metal stress). Did the authors quantify whether pyoverdine supplementation altered siderophore production in the focal pathogens (either through phenotypic assays / transcriptomics)? Could such a phenotypic plastic response result in an increased capacity to scavenge iron from the environment? Importantly, increased expression of siderophores has been shown to enhance pathogen virulence (e.g. Lear et al 2023: increased pyoverdine production is linked with increased virulence in Pseudomonas aeruginosa). I really appreciate the amount of work the authors have put into this study, but I would suggest expanding the discussion a bit to include a few sentences on

      (1) unintentional consequences of pyoverdine treatment (e.g. changes in gene expression and non-siderophore-related mutations (e.g. biofilm formation)) on disease dynamics/pathogen virulence;

      (2) the efficacy of siderophore treatment under more natural conditions, i.e. when the pathogens have to compete with other species in the resident community (i.e. any other effects than resistance evolution through HGT of pyoverdine receptors as mentioned).

      Response 1: We would like to thank reviewer # 1 for the positive and constructive assessment. We agree that discussing the above points is important. We have added new paragraphs in the discussion, in which we elaborate on unintentional consequences (lines 532-551) and HGT of receptors (lines 599-607).

      Reviewer #1 (Recommendations For The Authors):

      I only have minor comments/suggestions for the authors, all listed below:

      • The authors' findings show that the antibacterial activity of pyoverdine is highly context-dependent. As such, I would suggest somewhat toning down the quite general statement in the Abstract: 'Thus, pyoverdines from environmental strain could become new sustainable antibacterials against human pathogens'

      Response 2: We agree that the pyoverdine treatment is especially potent against Acinetobacter baumannii and Staphylococcus aureus, but less so against Klebsiella pneumoniae. The treatment success is pathogen-dependent, and we have thus modified the phrase in the abstract (lines 32-34). The new sentence now reads: 'Thus, pyoverdines from environmental strains have the potential to become a new class of sustainable antibacterials against specific human pathogens.' Also in other parts of the manuscript (Results and Discussion), we emphasize that the pyoverdine treatment will likely be effective against specific pathogens (e.g., those with lower-iron affinity siderophores).

      • Bacteria often produce more than one type of siderophore. Do you know whether the 320 natural isolates used in this study produce any non-pyoverdine siderophores? Previous work has shown that pyochelin production is suppressed in PAO1 under a wider range of lab conditions. Do you know whether this is the case for the natural isolates used here (and rule out a potential role of non-pyoverdines in iron starvation as observed in Figure 1).

      Response 3: This is a valid question. Our own bioinformatic and phenotypic assays reveal that a certain fraction of strains (~ 40%) can produce secondary siderophores (unpublished data). We now mention the existence of secondary siderophores on lines 97-100 and 123. However, we do not think that their contribution to the supernatant assay results is large since the expression of pyoverdine typically suppresses the expression of the secondary siderophores (Cornelis 2010 Appl Microbiol Biotechnol; Dumas et al. 2013 Proc B) under stringent iron limitation. Furthermore, secondary siderophores have lower iron-binding affinities than pyoverdine. Finally, both the semi-pure and ultra-pure pyoverdine extracts showed strong pathogen inhibition (Fig. 3), and we are thus confident that pyoverdine is responsible for the observed growth inhibition.

      • Upon first mentioning the 'mock control' in the Results section in the main text, please state what the actual treatment is.

      Response 4: Thank you for noticing this. We now explain in more detail the actual treatment conditions used on lines 103-107 and in the caption of Figure 1. We have further removed the term 'mock' as it is confusing in this context and simply refer to the 'control treatment' in the text.

      • Please mention what the different colours mean in the legend of growth recovery in Figure 1B

      Response 5: We have clarified the colour scheme in the legend of Figure 1B.

      • Please clarify whether you used 12 or 14 strains of human pathogens (the latter number is mentioned in the results section)?

      Response 6: In the methods (lines 647-650), we now clearly specify that we used 12 strains of human pathogens in the initial supernatant screen (Figure 1). For all subsequent analyses (dose-response curves and infection experiments), we included the ESKAPE pathogens K. pneumoniae and A. baumannii.

      • Please explain whether ferribactin can be used in any other way than iron chelation (e.g. can this precursor be recycled to form pyoverdine)?

      Response 7: We apologize for not having properly explained the role of ferribactin. Under natural conditions, ferribactin is not secreted. It is kept in the periplasmic space, where it matures to pyoverdine. We most likely recovered ferribactin in the supernatant because of the vigorous shaking and centrifugation involved in the pyoverdine purification protocol. We now explain this on lines 216-218. Thus, there is no ferribactin secretion and recycling.

      • Have the authors looked at whether there is a relationship between the degree of growth arrest and phylogenetic distance? Would you expect there to be one?

      Response 8: This is an interesting question. We have now constructed a phylogenetic tree to explore this relationship (new Figure S2). We found that strains with inhibitory supernatants were scattered across the phylogenetic tree (described on lines 129-135). However, we also found two branches on the tree on which strains with inhibitory supernatant effects were overrepresented. This matches our previous finding that closely related species can produce similar pyoverdine types, but that the same pyoverdine can also be produced by completely different species (Gu et al. 2024 eLife).

      • In the Methods section, please mention you used pyoverdine-only controls in the infection assay.

      Response 9: We now mention the use of pyoverdine-only controls in the Methods section (lines 788-790). Overall, we have improved the infection procedure section (starting on line 770). Thank you for pointing this out.

      • Did you confirm whether the addition of pyoverdine resulted in lower bacterial loads in Galleria? In other words, were the observed changes in mortality solely related to changes in bacterial density?

      Response 10: Thank you for this valid question. No, we did not test whether pyoverdine treatment reduces the bacterial load. However, we did this in the past in two studies with a similar set of pathogens (Weigert et al. 2017 Evol Appl; Schmitz et al. 2023 Proc B) and found strong correlations between G. mellonella survival and bacterial loads. We agree that it is important to understand how pyoverdine affects pathogen load in the host and we will address this point in future studies.

      • In your infection assay, were Galleria (n=10) for each treatment housed in the same environment/container? If so, can you treat these as independent observations or should you use some sort of grouping variable in your survival analysis?

      Response 11: Thank you for pointing this out. We forgot to clarify this in the Methods section and now do so on lines 777-779. All larvae were individually housed in separate wells of a 24-well plate. There was no physical contact between larvae and no opportunity for pathogen exchange. As such, we treat each individual larva as an independent observation.

      Reviewer #2 (Public Review):

      In this work, Vollenweider et al. examine the effectiveness of using natural products, specifically molecules that chelate iron, to treat infectious agents. Through the purification of 320 environmental isolates, 25 potential candidates were identified from natural products based on inhibition assays and were further screened. The structural information and chemical composition were determined.

      The paper is well-structured and thorough; targeting virulence factors in this manner is a great idea. My enthusiasm is dampened by the mediocre effects of the compounds. The lack of a dose-response curve in the survivability assays suggests a limited scope for these molecules. While it is encouraging that the best survivability occurred at the lowest toxicity level, it opens questions as to how effective such molecules can be. Either the reduction in mortality was offset by using higher concentrations, which was not observed in the compound-alone test, or there is no dose-response curve. The latter would suggest to me that the variation in survivability is not due to the addition of siderophores.

      Response 12: Thank you very much for the overall positive assessment. We understand your concern regarding the effectiveness of pyoverdines in the host. However, we wish to emphasize that hazard risks were reduced by more than 50% when treating A. baumannii and K. pneumoniae. Moreover, it was not so surprising to us that the treatment worked best at intermediate pyoverdine concentrations. We anticipated that pyoverdines could have negative effects for the host at relatively high concentrations because siderophores can interfere with host iron stocks (see discussion starting on line 552). Finally, dose-response curves do not necessarily need to be linear or sigmoid; they can also be hump-shaped. To better illustrate this aspect, we have now plotted the time to death for all the deceased larvae against the pyoverdine concentration gradient and fitted polynomial regressions (new Fig. S6). For the above two pathogens, we found hump-shaped dose-response curves in four out of the six comparisons. We present this new analysis on lines 351-362.
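
      To make the notion of a hump-shaped dose-response curve concrete, the following is a minimal sketch rather than the analysis behind Fig. S6. It assumes a second-degree polynomial (the polynomial degree is not stated above), uses hypothetical data, and defines "hump-shaped" as a negative quadratic coefficient with the fitted peak inside the tested dose range. The function `fit_quadratic` is an illustrative helper, not part of any library.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Sums of powers of x give the 3x3 matrix X^T X; rhs is X^T y.
    powers = [sum(x ** k for x in xs) for k in range(5)]
    rows = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    m = [row + [r] for row, r in zip(rows, rhs)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            m[r] = [mr - factor * mc for mr, mc in zip(m[r], m[col])]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (m[i][3] - tail) / m[i][i]
    return beta  # [a, b, c]


# Hypothetical data: coded pyoverdine dose levels vs. mean time to death (h).
dose = [0, 1, 2, 3, 4]
hours = [20.0, 26.0, 30.0, 27.0, 21.0]

a, b, c = fit_quadratic(dose, hours)
peak_dose = -b / (2 * c)
# Assumed criterion: concave fit whose maximum lies within the tested range.
is_hump_shaped = c < 0 and min(dose) < peak_dose < max(dose)
```

      With these hypothetical values the fitted curve is concave and peaks at an intermediate dose, which is the pattern expected if the benefit of iron starvation is offset by host toxicity at high concentrations.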

      I would also like to see how these molecules compare to other iron-chelating molecules. Deferoxamine is a bacteria-derived siderophore that is FDA-approved. However, it is not used to treat infections. Would the authors consider comparing their candidate molecules to well-studied molecules? This also raises questions about the novelty of this work; I think the authors could rephrase the discussion to better reflect that bioprospecting for iron-chelating molecules has previously occurred and been successful.

      Response 13: Thank you for the comment. The initial version of our manuscript already featured a brief discussion on other iron-chelation therapies. We have now changed the narrative to better reflect the differences of our approach to already existing iron-chelating molecules such as deferoxamine (lines 608-632).

      Finally, I am concerned about the few mutations reported in the resistance study. Looking at the SI, it appears that very few mutations were seen. It is unclear what filtering the authors used to arrive at such a low number of mutations. Even filtering against mutations that were selected by adaptation to the media, it seems low that only a handful of clones had distinct mutations.

      Response 14: We apologise for the unclear explanations and data analysis. When reanalysing the data we indeed detected a mistake: we originally treated all genomes as being of clonal origin, despite the fact that we sequenced entire populations for the control treatments. We have now completely re-done the mutational analysis using the breseq pipeline as newly described in the Methods (lines 861-866) and presented in the Results (lines 421-451). We have improved the filtering process and indeed found many more mutations, including the loss of mobile genetic elements. However, it is important to note that it is not uncommon to find only a few beneficial mutations. Especially in cases of selective sweeps, often only a few mutations fix.

      This paper has a lot of strengths. The workflow is logical and well-executed; the only significant weakness is the effect of the molecules and the lack of an explanation for a dose-response curve in the survivability assay, especially when compared to the data reported in Figure 3, as the authors describe in lines 214-217.

      Response 15: Thank you for this overall positive assessment. As discussed in our response 12, the effect of the molecule in the host was not weak as it decreased hazard risks by more than 50% for A. baumannii and K. pneumoniae. Moreover, we explain that the benefit of the pyoverdine treatment (in terms of treating the infection) can be offset by adverse effects on the host, especially at high pyoverdine concentrations.

      Reviewer #2 (Recommendations For The Authors):

      • Compare these compounds to well-studied iron chelating molecules.

      Response 16: We have addressed this comment in our response 13.

      • Consider adding time to death to the survivability analysis. While the reduction in mortality was not large, perhaps the time to death increased.

      Response 17: This is an excellent suggestion. We have now analysed the time-to-death as a function of pyoverdine concentration (new Figure S6). Time-to-death was highly variable and sample size was fairly low for A. baumannii and K. pneumoniae as many larvae survived. Nonetheless, we found hump-shaped dose-response curves in four out of six comparisons and a linear dose-response curve in one case. We now report the new analyses on lines 351-362. Finally, we would like to stress once more that the reduction in mortality was considerable (hazard risk reduction by more than 50%).

      • I would also like to see the actual growth curves of the pathogens in the SI to accompany Fig 6.

      Response 18: This is a good point. We have now included the actual growth curves of the pathogens in the Supporting Information to accompany Figure 6 (new Figures S9 and S10).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be key to developing a new strategy to treat PA infection, as PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed a fluorescently labeled antigen hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model as potent as that of the V2L2MD mAb (equivalent to gremubamab). However, the therapeutic potency of these isolated mAbs is uncertain, as gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether the PcrV-specific mAbs isolated in this study have better therapeutic potential than gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice with potency equal to, but not superior to, that of the gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

      The authors thank the reviewer for their thoughtful positive assessment. As noted by the reviewer, the studies described here, which were performed in mice, show that our MBC-derived mAbs are as effective as V2L2MD, a mAb that is one component of the gremubamab bi-specific. However, key theoretical strengths of MBC-derived mAbs (reduced immunogenicity, full participation in effector functions) are not easily tested in mice. We have clarified and expanded our discussion of these points in our revised manuscript, particularly in the Discussion paragraph 4.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Page 8. Using improved methods that enhanced the efficiency and depth of sequencing (manuscript in preparation...). This method is not provided in detail. The authors should provide a detailed method (as a preprint on a public database or described in the method section).

      We thank the reviewers for their interest in the details of the specific methods for single cell B cell receptor sequencing. We regret that the manuscript is still in preparation. In fact, our current methods section provides much more detail about sequencing methods than is customarily supplied by authors of mAb development papers. However, we understand the frustration and will remove the citation of our manuscript in preparation from our revised manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      With socioeconomic development, more and more people are obese which is an important reason for sub-fertility and infertility. Maternal obesity reduces oocyte quality which may be a reason for the high risk of metabolic diseases for offspring in adulthood. Yet the underlying mechanisms are not well elucidated. Here the authors examined the effects of maternal obesity on oocyte methylation. Hyper-methylation in oocytes was reported by the authors, and the altered methylation in oocytes may be partially transmitted to F2. The authors further explored the association between the metabolome of serum and the altered methylation in oocytes. The authors identified decreased melatonin. Melatonin is involved in regulating the hyper-methylation of high-fat diet (HFD) oocytes, via increasing the expression of DNMTs which is mediated by the cAMP/PKA/CREB pathway.

      Strengths:

      This study is interesting and should have significant implications for the understanding of the transgenerational inheritance of GDM in humans.

      Thank you for your positive comments on our manuscript.

      Weaknesses:

      The link between altered DNA methylation and offspring metabolic disorders is not well elucidated; how the altered DNA methylation in oocytes escapes reprogramming in transgenerational inheritance is also unclear.

      Thanks. These are very good questions. There is a long way to go to completely elucidate the relationship between methylation and offspring metabolic disorders, and the mechanisms by which acquired methylation escapes reprogramming during development. We would like to explore these in the future.

      Reviewer #2 (Public Review):

      This manuscript offers significant insights into the impact of maternal obesity on oocyte methylation and its transgenerational effects. The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, to explore how high-fat diet (HFD)-induced obesity alters genomic methylation in oocytes and how these changes are inherited by subsequent generations. The findings suggest that maternal obesity induces hyper-methylation in oocytes, which is partly transmitted to F1 and F2 oocytes and livers, potentially contributing to metabolic disorders in offspring. Notably, the study identifies melatonin as a key regulator of this hyper-methylation process, mediated through the cAMP/PKA/CREB pathway.

      Strengths:

      The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, and provides convincing data.

      Thank you for your positive comments on our manuscript.

      Weaknesses:

      The description in the results section is somewhat verbose. This section (lines 126~227) utilized transgenerational breeding experiments and methylation analysis to demonstrate that maternal obesity-induced alterations in oocyte methylation (including hyper-DMRs and hypo-DMRs) can be partially transmitted to F1 and F2 oocytes and livers. The authors should consider condensing and revising this section for clarity and brevity.

      Thanks for your suggestions. We have re-written these parts in the revised manuscript.

      There is a contradiction with Reference 3, but the discrepancy is not discussed. In this study, the authors observed an increase in global methylation in oocytes from HFD mice, whereas Reference 3 indicates Stella insufficiency in oocytes from HFD mice. This Stella insufficiency should lead to decreased methylation (Reference 33). There should be a discussion of how this discrepancy can be reconciled with the authors' findings.

      Thanks for your suggestions. As reported in Reference 33, STELLA prevents hypermethylation in oocytes by sequestering UHRF1, which recruits DNMT1 into nuclei, away from the nuclei. Han et al. reported that obesity induced by a high-fat diet reduces the STELLA level in oocytes. These findings indicate that STELLA insufficiency might induce hypermethylation in oocytes, although Han et al. did not report significant hypermethylation in obese oocytes using immunofluorescence. This contradiction may be caused by the limited sample size (n=14) used by Han et al. We have added a brief discussion in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Maternal obesity is a health problem for both pregnant women and their offspring. Previous works, including work from this group, have shown significant DNA methylation changes in the offspring of obese pregnancies in mice. In this manuscript, Chao et al. dissected the potential mechanisms behind the DNA methylation changes. The major observations of the work include transgenerational DNA methylation changes in the offspring of maternal obesity, and metabolites such as methionine and melatonin that correlated with these epigenetic changes. Exogenous melatonin treatment could reverse the effects of obesity. The authors further hypothesized that the linkage may be mediated by the cAMP/PKA/CREB pathway to regulate the expression of DNMTs.

      Strengths:

      The transgenerational change of DNA methylation following HFD is of great interest for future research to follow. The metabolic treatment that could change the DNA methylation in oocytes is also interesting and has potential relevance to future clinical practice.

      Thank you for your positive comments on our manuscript.

      Weaknesses:

      The HFD oocytes have more 5mC signal based on staining and sequencing (Fig 1A-1F). However, the authors also identified almost equal numbers of hyper- and hypo-DMRs, which raises questions regarding where these hypo-DMRs were located and how to interpret their behaviors and functions. These questions are also critical to address in the following mechanistic dissections as the metabolic treatments may also induce bi-directional changes of DNA methylation. The authors should carefully assess these conflicts to make the conclusions solid.

      Thanks for the helpful comments and suggestions. As presented in Fig. 1F, the methylation level is increased in promoter and exon regions and decreased in intron, utr3 and repeat regions. Following the suggestions, we further analyzed the distribution of DMRs and found that, compared with hyper-DMRs, hypo-DMRs were mainly distributed in utr3, intron, repeat, and tes regions (Fig. S3). These results suggest that the distribution of DMRs in the genome is not random.

      The transgenerational epigenetic modifications are controversial. Even for F0 offspring under maternal obesity, there were different observations compared to this work (Hou, YJ., et al. Sci Rep, 2016). The authors should discuss the inconsistencies with previous works.

      Thanks for the suggestions. There are contradictory reports on whole-genome DNA methylation of oocytes in obese mice. Hou YJ et al. reported in 2016 that obesity reduces the whole-genome DNA methylation of NSN GV oocytes, as assessed by immunofluorescence. In 2018, Han LS et al. reported that the whole-genome 5mC of oocytes is not significantly influenced by obesity, as assessed by immunofluorescence, but they found that the Stella level in oocytes is reduced by obesity. Stella localizes to the cytoplasm and nuclei of oocytes and sequesters Uhrf1 from the nuclei. Stella knockout in oocytes results in an about twofold increase in global methylation in MII oocytes by recruiting more DNMT1 into nuclei. These findings suggest that the global methylation of oocytes in obese mice should be increased, yet Han LS et al. reported similar methylation in oocytes from obese and non-obese mice. Thus, the contradiction may be due to the different sample sizes in our manuscript and previous studies, and Hou YJ and colleagues examined only the methylation of NSN GV oocytes. In Stella+/- oocytes, global methylation is normal, which suggests that Stella insufficiency may not be the main reason for the increased methylation of oocytes in obese mice. We have added a brief discussion in the revised manuscript.

      In addition to the above inconsistencies, the DNA methylation analysis in this work was not carefully evaluated. Several previous works evaluated DNA methylation in mouse oocytes and showed global methylation levels of around 50% (Shirane K, et al. PLoS Genet, 2013; Wang L., et al, Cell, 2014). In Figure 1E, the overall methylation level is about 23% in control, which is significantly different from previous works. The authors should provide more details regarding the WGBS procedure, including but not limited to sequencing coverage, bisulfite conversion rate, etc.

      Thanks for the good questions. Smallwood et al. reported that the CG methylation of MII oocytes is about 33.1% using single-cell genome-wide bisulfite sequencing (Smallwood et al. Nature Methods, 2014). Shirane K et al. reported that the average methylation level of GV oocytes is 37.9%. Kobayashi H et al. reported that the CG methylation in GV oocytes is about 40% (Kobayashi H et al. PLoS Genet. 2012). CG methylation in fully grown oocytes is about 38.7% (Maenohara S et al. PLoS Genet. 2017). The variation in reported oocyte methylation is associated with sequencing methods, sequencing depth, and mapping rates. In the present study, whole genome bisulfite sequencing (WGBS) for small samples and methylation analysis were performed by NovoGene. The read counts are 31613641 to 37359643, the unique mapping rate is ≥32.88%, the conversion rate is >99.44%, and the sequencing depth is 2.45 to 2.75. The relevant information is presented in Table S1. The sequencing depth might be a reason for the inconsistency. However, we further confirmed our sequencing results using bisulfite sequencing (BS), and the BS and WGBS results were similar. These findings suggest that our results are reliable.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Since the results show that melatonin may play a role in hyper-methylation, the authors need to give some basic information in the Introduction section.

      Thanks. We added more information in the Introduction section.

      (2) There are many differential metabolites identified. Besides melatonin, are other differential metabolites involved in the altered methylation in oocytes?

      This is a good question. We first filtered the differential metabolites that may be involved in methylation, and then further filtered these metabolites according to the relevant DNA methylation pathways and published papers. After that, we confirmed the concentrations of the relevant metabolites in the serum using ELISA. Certainly, we cannot completely exclude all other metabolites that might be involved in regulating DNA methylation.

      (3) The altered methylation would be found in the F1 tissues. Did the authors examine the other parts besides the liver?

      Thank you. In the present study, we did not examine DNA methylation in tissues other than the liver. We agree that the altered methylation should also be observed in other tissues.

      (4) Did the authors try or guess how many generations the maternal obesity-induced genomic methylation alterations can be transmitted?

      Thanks. This is a good question. Takahashi Y and colleagues reported, using DNA methylation-edited mice, that acquired DNA methylation at CpG islands can be transmitted across multiple generations (Takahashi Y et al. 2023, Cell). Similar inheritance has also been reported by other studies using different models.

      (5) The F2 is indirectly affected by maternal obesity, so the evidence is not enough to prove the transgenerational inheritance of the altered methylation.

      Thanks. We find that the altered DNA methylation in F2 tissue and oocytes is similar to that in F1 oocytes. This suggests that the altered DNA methylation in F2 oocytes should be at least partly transmitted to F3. A previous paper (Takahashi Y et al. 2023, Cell) confirms that acquired DNA methylation at CpG islands can be transmitted across several generations through the paternal and maternal germ lines. Certainly, it would be better to examine this in F3 tissues.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure Font Size: The font sizes in the figures are quite inconsistent. Please try to unify the font size of similar types of text.

      Thanks for your suggestions. We re-edited the relevant figures in the revised manuscript.

      (2) Figure Clarity: Ensure that all critical information in the figures is clearly visible, such as in Figure 3C.

      Thank you. We revised this figure.

      (3) Figure 1B, C: The position of the asterisks ("**") is not centered in the corresponding columns, and the font size is too small. Please correct this and address similar issues in other figures.

      Thank you for your suggestions. We re-edited these in the revised figures.

      (4) Line 126: The current expression is confusing. It may be revised to: "Both the oocyte quality and the uterine environment can contribute to adult diseases, which may be mediated by epigenetic modifications."

      Thanks. We revised this sentence in the revised manuscript.

      (5) Missing Panel in Figure 3: Figure 3 is missing panel 3N.

      Thank you so much. We corrected it in the revised manuscript.

      (6) Figure Panel Order: Please adjust the order of the panels in the figures to follow a logical reading sequence.

      Thank you. We changed the order in the revised manuscript.

      (7) Line 493: Correct "inthe" to "in the".

      Thank you. We revised it.

      (8) Lines 102-106: Polish the wording and expression, an example as follows: "We analyzed the differentially methylated regions (DMRs) in oocytes from both HFD and CD groups and identified 4,340 DMRs. These DMRs were defined by the criteria: number of CG sites ≥ 4 and absolute methylation difference ≥ 0.2. Among these, 2,013 were hyper-DMRs (46.38%) and 2,327 were hypo-DMRs (53.62%) (Fig. 1G). These DMRs were distributed across all chromosomes (Fig. 1H)."

      Thank you! We re-wrote these parts in the revised manuscript.
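
      The DMR criteria quoted in comment (8) can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and the sign convention (a positive difference meaning higher methylation in HFD than in CD) is an assumption.

```python
# Minimal sketch of the DMR criteria quoted in the review:
# number of CG sites >= 4 and absolute methylation difference >= 0.2.
# Function names and the sign convention (HFD minus CD) are assumptions.

def classify_dmr(n_cg, meth_diff, min_cg=4, min_abs_diff=0.2):
    """Return 'hyper', 'hypo', or None if the region fails the criteria."""
    if n_cg < min_cg or abs(meth_diff) < min_abs_diff:
        return None
    return "hyper" if meth_diff > 0 else "hypo"


def percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


# Counts reported in the review text: 2,013 hyper-DMRs and 2,327 hypo-DMRs.
hyper, hypo = 2013, 2327
total = hyper + hypo
print(total, percent(hyper, total), percent(hypo, total))
```

      Recomputing the shares from the reported counts is a quick way to confirm that the quoted percentages and the total of 4,340 DMRs are mutually consistent.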

      Reviewer #3 (Recommendations For The Authors):

      The sample numbers should be annotated in the figure legend for all the bar plots using Image J. The lines in Figures 2B and 2C were without error bars. How many mice were used for these plots?

      Thanks for your suggestions. We added the sample sizes in the revised manuscript. We made a mistake when preparing the images for Figure 2B and Figure 2C, which resulted in the missing error bars. We have corrected these images. Thanks again!

      The authors should revise the panel arrangement of the figures (Figure 2, Figure 5, etc) to make them more clear and readable.

      Thank you! We have revised these in the revised manuscript.

      The writing should be improved since there were multiple typos and unclear expressions. AI tools like Grammarly or ChatGPT may help.

      Thank you! We have re-edited the language in the revised manuscript using AI tools.

      Please recheck the immunofluorescence images for clear interpretability. For example, in Figure 5F (H89 treated), the GV is all the way at the edge of the oocyte, and the oocyte in the DIC image appears like it is partially lysed. The DIC images and the DAPI images are not clear enough.

      Thanks for your suggestions. We have re-edited these pictures in the revised manuscript.

      Another concern is that the Methods describes the immunofluorescence preparation for 5mC and 5hmC staining as a simple fixation in 4% paraformaldehyde followed by permeabilization with 0.5% Triton X-100, but there is no antigen exposure step described, a step that is normally required for visualizing these DNA modifications (e.g., 4N HCl).

      Thanks. We apologize for not describing the methods clearly. We have added more information about the methods in the revised manuscript.

      The metabolomic analysis revealed a highly significant increase in dibutylphthalate, genistein, and daidzein in the control mice. The presence of these exogenous metabolites suggests that the diets differed in many aspects, not just fat content, so it would be very difficult to interpret the results as related to a high-fat diet alone. Both daidzein and genistein are phytoestrogens and dibutylphthalate is a plasticizer, suggesting differences in the diet and/or in the materials used to collect the samples for analysis from the mice. The Methods define the high-fat diet adequately, as the formulation can be found online using the catalog number. However, the control diet is just listed as "normal diet", so one has no idea what is in it.

      Thank you for your good questions. The daidzein and genistein may come from the diets, and the dibutylphthalate may come from the materials used to collect samples. If so, these should be similar between groups. We have therefore added the formulation of the normal diet to the revised manuscript. The raw materials of the normal diet include corn, bean pulp, fish meal, flour, yeast powder, plant oil, salt, vitamins, and mineral elements. Following the suggestions, we re-checked the data on these metabolites and found that their abundance was low. Moreover, the identification of these metabolites was at a low confidence level because their ions were only mapped to ChemSpider (HMDB, KEGG, LIPID MAPS). To further confirm these results, we examined these metabolites in serum using ELISA, and the results revealed that the concentrations of genistein and dibutylphthalate were similar between groups. These results suggest that these metabolites may not be involved in the altered methylation of oocytes induced by obesity.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      Using UGGT1-KO cells and a number of different ERAD substrates, the authors show that UGGTs are involved in preventing the premature degradation of misfolded glycoproteins. They propose a concept in which the fate of glycoproteins is determined by a tug-of-war between UGGTs and EDEMs.

      Strengths: 

      The authors provided a wealth of data to indicate that UGGT1 competes with EDEMs, which promote glycoprotein degradation.

      Weaknesses: 

      NA 

      We appreciate your comment.

      Reviewer #2 (Public review): 

      In this study, Ninagawa et al. shed light on UGGT's role in ER quality control of glycoproteins. By utilizing UGGT1/UGGT2 DKO cells, they demonstrate that several model misfolded glycoproteins undergo early degradation. One such substrate is ATF6alpha, whose premature degradation hampers the cell's ability to mount an ER stress response.

      This study convincingly demonstrates that many unstable misfolded glycoproteins undergo accelerated degradation without UGGTs. Also, this study provides evidence of a "tug of war" model involving UGGTs (pulling glycoproteins to being refolded) and EDEMs (pulling glycoproteins to ERAD). 

      The study explores the physiological role of UGGT, particularly examining the impact of ATF6α in UGGT knockout cells' stress response. The authors further investigate the physiological consequences of accelerated ATF6α degradation, convincingly demonstrating that cells are sensitive to ER stress in the absence of UGGTs and unable to mount an adequate ER stress response. 

      These findings offer significant new insights into the ERAD field, highlighting UGGT1 as a crucial component in maintaining ER protein homeostasis. This represents a major advancement in our understanding of the field. 

      Thank you very much for your comment.

      Reviewer #3 (Public review): 

      This valuable manuscript demonstrates the long-held prediction that the glycosyltransferase UGGT slows degradation of endoplasmic reticulum (ER)-associated degradation substrates through a mechanism involving re-glucosylation of asparagine-linked glycans following release from the calnexin/calreticulin lectins. The evidence supporting this conclusion is solid, using genetically deficient cell models and well-established biochemical methods to monitor the degradation of trafficking-incompetent ER-associated degradation substrates, although this could be improved by better defining the importance of UGGT in the secretion of trafficking-competent substrates. This work will be of specific interest to those interested in mechanistic aspects of ER protein quality control and protein secretion.

      The authors have attempted to address my comments from the previous round of review, although some issues still remain. For example, the authors indicate that it is difficult to assess how UGGT1 influences degradation of secretion competent proteins, but this is not the case. This can be easily followed using metabolic labeling experiments, where you would get both the population of protein secreted and degraded under different conditions. Thus, I still feel that addressing the impact of UGGT1 depletion on the ER quality control for secretion competent protein remains an important point that could be better addressed in this work. 

      We mainly focused on the impact of UGGT1 depletion on ERAD in this paper and intend to determine the impact of UGGT1 depletion on the ER quality control for secretion competent protein in the near future.

      Further, in the previous submission, the authors showed that UGGT2 depletion demonstrates a similar reduction of ATF6 activation to that observed for UGGT1 depletion, although UGGT2 depletion does not reduce ATF6 protein levels like what is observed upon UGGT1 depletion. In the revised manuscript, they largely remove the UGGT2 data and only highlight the UGGT1 depletion data. While they are somewhat careful in their discussion, the implication is that UGGT1 regulates ATF6 activity by controlling its stability. The fact that UGGT2 has a similar effect on activity, but not stability, indicates that these enzymes may have other roles not directly linked to ATF6 stability. It is important to include the UGGT2 data and explicitly highlight this point in the discussion. It's fine to state that figuring out this other function is outside the scope of this work, but removing it does not seem appropriate.

      We have added the data of UGGT2-KO and UGGT-DKO cells to Figure 4 and discussed them appropriately.

      As I mentioned in my previous review, I think that this work is interesting and addresses an important gap in experimental evidence supporting a previously asserted dogma in the field. I do think that the authors would be better suited for highlighting the limitations of the study, as discussed above. Ultimately, though, this is an important addition to the literature. 

      We appreciate your comments. Thank you very much.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations for the authors): 

      I have carefully gone through the revised manuscript and responses to the reviewers' comments; I believe that the authors did a great job on revisions, and I do think that now this manuscript has been much improved (far easier to read through). Now I have only minor comments as follows; 

      Page 9: Lines 8-9; Comparison between WT and EDEM-TKO cells indicates that ATF6alpha is still degraded via gpERAD requiring mannose trimming even in the presence of DNJ (Fig. 1D). (it would be better to indicate which figure to look) 

      We have fixed it.

      Page 10: Lines 9-11; as multiple higher molecular weight bands (representing a mixture of G3M9, G2M9, and GM9 etc.) in WT cells treated with CST -> I am NOT AT ALL convinced by this statement (Figure 1-figure supplement 6A). How can the subtle glycan structure difference cause the ladder of the band? And if it is indeed the case (which I frankly doubt, by the way), will endo-alpha-mannosidase treatment end up with a single band for CST? And can PNGase F digestion cancel all size differences between samples (control, +DNJ and +CST)?

      CD3d-DTM-HA is a small protein (~20 kDa) possessing three N-glycans. A clear increase in the level of GM9 in WT cells treated with DNJ (Figure 1-Figure supplement 5A) caused an upward band shift (Figure 1-Figure supplement 6A). Similarly, a clear increase in the levels of GM9, G2M9, and G3M9 in WT cells treated with CST (Figure 1-Figure supplement 6B) produced the ladder of the band (Figure 1-Figure supplement 6A).

      Crystal violet assay (new Fig 4G; Page 33); It says that, after treating cells with the drug (Tg) for 4 hours, cells were spread on 24-well plates and cultured without Tg for 5 days. If incubated that long, I wonder whether any compromised viability may have been canceled out by growing cells (cells become confluent no matter what?). Am I missing something? Please clarify.

      We employed a previously published method to determine ER stress sensitivity (Yamamoto et al., Dev. Cell, 2007). Although any compromised viability may have been canceled by growing cells, as suggested, we were able to detect the difference between WT and UGGT-KO cells.

      Figure 5D; why one of the three N-glycans is missing on the last protein?? 

      We have fixed it.

    1. Author response:

      Reviewer #1 (Public review): 

      Summary: 

      Walton et al. set out to isolate new phages targeting the opportunistic pathogen Pseudomonas aeruginosa. Using a double ∆fliF ∆pilA mutant strain, they were able to isolate 4 new phages, CLEW-1, -3, -6, and -10, which were unable to infect the parental PAO1F Wt strain. Further experiments showed that the 4 phages were only able to infect a ∆fliF strain, indicating a role of the MS-protein in the flagellum complex. Through further mutational analysis of the flagellum apparatus, the authors were able to identify the involvement of c-di-GMP in phage infection. Depletion of c-di-GMP levels by an inducible phosphodiesterase renders the bacteria resistant to phage infection, while elevation of c-di-GMP through the Wsp system made the cells sensitive to infection by CLEW-1. Using TnSeq, the authors were able not only to reaffirm the involvement of c-di-GMP in phage infection but also to identify the exopolysaccharide PSL as a downstream target for CLEW-1. C-di-GMP is a known regulator of PSL biosynthesis. The authors show that CLEW-1 binds directly to PSL on the cell surface and that deletion of the pslC gene resulted in complete phage resistance. The authors also provide evidence that the phage-PSL interaction happens during the biofilm mode of growth and that the addition of the CLEW-1 phage specifically resulted in a significant loss of biofilm biomass. Lastly, the authors set out to test if CLEW-1 could be used to resolve a biofilm infection using a mouse keratitis model. Unfortunately, while the authors noted a reduction in bacterial load assessed by GFP fluorescence, the keratitis did not resolve under the tested parameters.

      Strengths: 

      The experiments carried out in this manuscript are thoughtful and rational, and sufficient explanation is provided for why the authors chose each specific set of experiments. The data presented strongly support their conclusions and they present compelling explanations for any deviation. The authors have not only developed a new technique for screening for phages targeting P. aeruginosa, but also highlight the importance of looking for phages during the biofilm mode of growth, as opposed to the more standard techniques involving planktonic cultures.

      Weaknesses: 

      While the paper is strong, I do feel that further discussions could have gone into the decision to focus on CLEW-1 for the majority of the paper. The paper also doesn't provide any detailed information on the genetic composition of the phages. It is unclear if the phages isolated are temperate or virulent. Many temperate phages enter the lytic cycle in response to QS signalling, and while the data as it is doesn't suggest that is the case, perhaps the paper would be strengthened by further elimination of this possibility. At the very least it might be worth mentioning in the discussion section. 

      Thank you for your review. We will upload the genomes of all Clew phages and Ocp-2 before resubmission. It turns out that the Clew phage are highly related, which we wanted to express with the genomic comparison in the supplementary figure (rather unsuccessfully). It therefore made sense to focus our in-depth analysis on one of the phage. We will include a supplementary figure demonstrating that all Clew-1 phage require an intact psl locus for infection, to make that logic clearer. The phage are virulent (there is apparently a bit of a debate about this with regard to Bruynogheviruses, but we have not been able to isolate lysogens). This will be explained in the revised version of the manuscript as well.

      Reviewer #2 (Public review): 

      This manuscript by Walton et al. suggests that they have identified a new bacteriophage that uses the exopolysaccharide Psl from Pseudomonas aeruginosa (PA) as a receptor. As Psl is an important component in biofilms, the authors suggest that this phage (and others similarly isolated) may be able to specifically target biofilm-growing bacteria. While an interesting suggestion, the manner in which this paper is written makes it difficult to draw this conclusion. Also, some of the results do not directly follow from the data as presented and some relevant controls seem to be missing. 

      Thank you for your review. We would argue that the combination of demonstrating Psl-dependent binding of Clew-1 to P. aeruginosa, as well as demonstration of direct binding of Clew-1 to affinity-purified Psl, indicates that the phage binds directly to Psl and uses it as a receptor. In looking at the recommendations, it appears that the remark about controls refers to not using the ∆pslC mutant alone (as opposed to the ∆fliF2 ∆pslC double mutant) as a control for some of the binding experiments. However, since the ∆fliF2 mutant is more permissive for phage infection, analyzing the effect of deleting pslC in the context of the ∆fliF2 mutant background is the more stringent test.

    1. Author response:

      We sincerely thank all the reviewers for their enthusiasm and positive feedback, which has encouraged us to delve deeper into this research. As this is the first report of POLK in the brain using a longitudinal normative aging model, our primary aim was to establish the observational and phenomenological aspects. We agree with the reviewers that more detailed molecular, biochemical, and cellular studies are essential to elucidate underlying mechanisms. However, as noted by some reviewers, these investigations, while they will raise the impact, may fall outside the scope of the current report. Indeed, many of these lines of investigation are currently ongoing. Below, we provide our provisional responses to individual reviewer comments.

      Response to Reviewer #1:

      a) Concern over POLK antibody characterization in mice:

      We knocked down POLK by siRNA in mouse cortical primary neuronal cultures (Fig S1C). In the revised version, we will provide a more detailed characterization of POLK antibodies in mouse cells.

      b) More mechanistic investigation is needed before POLK could be considered as a brain aging clock:

      We sincerely appreciate the valuable suggestion. In our ongoing work exploring the mechanisms of POLK in postmitotic neurons, preliminary findings using siPOLK indicate an upregulation of senescence markers along with a reduction in DNA repair synthesis (manuscript in preparation). We will reference this companion manuscript in the revised version and are pleased to share these data with the reviewers for their consideration.

      Response to Reviewer #2:

      a) Concern regarding a more mechanistic understanding of the pathways regulating POLK dynamics between the nucleus and cytosol:

      We sincerely appreciate the reviewer’s enthusiasm and valuable guidance in helping us better understand the mechanism of nuclear-cytoplasmic POLK dynamics. Previously, we developed a modified aniPOND (accelerated native isolation of proteins on nascent DNA) protocol, which we termed iPoKD-MS (isolation of proteins on Pol kappa synthesized DNA followed by mass spectrometry), to capture proteins bound to nascent DNA synthesized by POLK in human cell lines (bioRxiv https://www.biorxiv.org/content/10.1101/2022.10.27.513845v3). In this dataset, we identified potential candidates that may regulate nuclear/cytoplasmic POLK dynamics. These candidates are currently undergoing validation in human cell lines, and we are preparing a manuscript on these findings. Among these, some candidates, including previously identified proteins such as exportin and importin (Temprine et al., 2020, PMID: 32345725), are being explored further as potential POLK nuclear/cytoplasmic shuttles. We are also testing these candidates in mouse cortical primary neurons to assess their role in POLK dynamics. In the revised version of the manuscript, we will include a discussion of our current understanding and outline our planned studies.

      b) Question on “… what is POLK doing in the cytosol, and what is it interacting with …”:

      Our data so far indicate that POLK accumulates in stress granules and lysosomes. We are very grateful for the reviewer’s insightful suggestions and will make every effort to incorporate them in the revised manuscript. Currently, we are characterizing POLK accumulation in the cytoplasm using additional lysosomal markers, as recommended by the reviewer. If these experiments prove challenging in mouse brain tissues, we plan to carry them out in primary neuron cultures. We hope to include these findings in the revised version. Additionally, we have optimized the POLK antibody for immunoprecipitation from nuclear and cytoplasmic fractions of mouse brain tissue. These findings, which are beyond the scope of the current study, will be reported in a separate manuscript.

      Response to Reviewer #3:

      We greatly appreciate the reviewer bringing up the context of biomolecular condensates. Our iPoKD-MS data referenced above suggest candidates from various biomolecular condensates. We are currently using subcellular fractionation to test for the presence of POLK in different biomolecular condensates; these results will be fully reported in future publications. We also appreciate the reviewer providing important literature, which will be cited, and potential biomolecular condensates will be discussed in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript from Mukherjee et al examines potential connections between telomere length and tumor immune responses. This examination is based on the premise that telomeres and tumor immunity have each been shown to play separate, but important, roles in cancer progression and prognosis as well as prior correlative findings between telomere length and immunity. In keeping with a potential connection between telomere length and tumor immunity, the authors find that long telomere length is associated with reduced expression of the cytokine receptor IL1R1. Long telomere length is also associated with reduced TRF2 occupancy at the putative IL1R1 promoter. These observations lead the authors towards a model in which reduced telomere occupancy of TRF2 - due to telomere shortening - promotes IL1R1 transcription via recruitment of the p300 histone acetyltransferase. This model is based on earlier studies from this group (i.e. Mukherjee et al., 2019) which first proposed that telomere length can influence gene expression by enabling TRF2 binding and gene transactivation at telomere-distal sites. Further mechanistic work suggests that G-quadruplexes are important for TRF2 binding to IL1R1 promoter and that TRF2 acetylation is necessary for p300 recruitment. Complementary studies in human triple-negative breast cancer cells add potential clinical relevance but do not possess a direct connection to the proposed model. Overall, the article presents several interesting observations, but disconnection across central elements of the model and the marginal degree of the data leave open significant uncertainty regarding the conclusions.

      Strengths:

      Many of the key results are examined across multiple cell models.

      The authors propose a highly innovative model to explain their results.

      Weaknesses:

      Although the authors attempt to replicate most key results across multiple models, the results are often marginal or appear to lack statistical significance. For example, the reduction in IL1R1 protein levels observed in HT1080 cells that possess long telomeres relative to HT1080 short telomere cells appears to be modest (Supplementary Figure 1I). Associated changes in IL1R1 mRNA levels are similarly modest.

      Related to the point above, a lack of strong functional studies leaves an open question as to whether observed changes in IL1R1 expression across telomere short/long cancer cells are biologically meaningful.

      Statistical significance is described sporadically throughout the paper. Most major trends hold, but the statistical significance of the results is often unclear. For example, Figure 1A uses a statistical test to show statistically significant increases in TRF2 occupancy at the IL1R1 promoter in short telomere HT1080 relative to long telomere HT1080. However, similar experiments (i.e. Figure 2B, Figure 4A - D) lack statistical tests.

      TRF2 overexpression resulted in a ~5-fold or greater change in IL1R1 expression. Compared to this, telomere length-dependent alterations in IL1R1 expression, although about 2-fold, appear modest (~50% reduction in cells with long telomeres across the different model systems used). Notably, this was consistent and significant across cell-based model systems and xenograft tumors (see Figure 1). Unlike TRF2 induction, telomere elongation or shortening varies within the permissible physiological limits of cells. This is likely to result in the observed variation in IL1R1 levels.

      For biological relevance, we have shown this using multiple models in which telomere length either differed (patient tissue, organoids) or was altered (cell lines, xenograft models). In TNBC tissue, tumor organoids, and cells/xenografts, IL1 signalling was shown to impact M2 macrophage infiltration in a telomere length-sensitive fashion. We made use of the tumor organoids to test M2 macrophage infiltration using IL1RA and small molecule-based IL1R1 inhibition.

      We have now included statistical tests in all the relevant figures and incorporated the necessary details about the tests performed in the figure legends for the reader's clarity. Additionally, all data points, p values and details of statistical tests have been included in figure-wise Excel sheets for both main and supplementary figures.

      Reviewer #1 (Recommendations For The Authors):

      There are typos throughout the manuscript. The word 'expression' is incorrectly spelled on y-axis labels throughout the manuscript (for example see Figure 1B). The word 'telomere' is incorrectly spelled in Supplementary Figure 1 legend panel A. Most errors, such as these, do not interfere with my comprehension of the manuscript. However, others made the manuscript difficult to follow. For example, I think that MDAMB231, MDAMD231, and MDAM231 are frequently used interchangeably to refer to the same cell line. This makes it very difficult to understand certain experiments.

      I often found it difficult to understand which statistical test was used for a specific experiment. I suggest changing the style in the legends to more clearly connect statistical tests with specific data points.

      We thank the reviewer for pointing out the typographical errors. We have now made the relevant corrections to both figures and text.

      As stated above, we have now provided details of the statistical tests performed in the figure legends for the reader's clarity. Additionally, all data points, p values and details of statistical tests have been included in figure-wise Excel sheets for both main and supplementary figures.

      Reviewer #2 (Public Review):

      This study highlights the role of telomeres in modulating IL-1 signaling and tumor immunity. The authors demonstrate a strong correlation between telomere length and IL-1 signaling by analyzing TNBC patient samples and tumor-derived organoids. Mechanistic insights revealed non-telomeric TRF2 binding at the IL-1R1. The observed effects on NF-kB signaling and subsequent alterations in cytokine expression contribute significantly to our understanding of the complex interplay between telomeres and the tumor microenvironment. Furthermore, the study reports that the length of telomeres and IL-1R1 expression is associated with TAM enrichment. However, the manuscript lacks in-depth mechanistic insights into how telomere length affects IL-1R1 expression. Overall, this work broadens our understanding of telomere biology.

      The mechanism of how telomere length affects IL1R1 expression involves sequestration and reallocation of TRF2 between telomeres and gene promoters (in this case, the IL1R1 promoter). We have previously shown this across multiple genomic sites (Mukherjee et al, 2018; reviewed in J. Biol. Chem. 2020, Trends in Genetics 2023). We have described this in the manuscript along with references citing the previous works. A scheme explaining the model was provided as Additional Supplementary Figure 1, along with a description of the mechanistic model.

      Figures 1-4 of the main figures describe the molecular mechanism of telomere-dependent IL1R1 activation. This includes ChIP data for TRF2 on the IL1R1 promoter in long/short telomeres, as well as TRF2-mediated histone/p300 recruitment and IL1R1 gene expression. We further show how specific acetylation on TRF2 is crucial for TRF2-mediated IL1R1 regulation (Figure 5).

      Reviewer #2 (Recommendations For The Authors):

      The study primarily provides a snapshot of cytokine expression and telomere length at a single time point. Longitudinal studies or dynamic analyses could provide a more comprehensive understanding of the temporal relationship between telomere length and cytokine expression.

      Tumor heterogeneity is a significant problem for the various therapies. The study notes significant heterogeneity in telomere length but does not investigate the implications of this heterogeneity. Understanding the role of telomere length variation in different tumor cell populations is essential for a comprehensive interpretation of the results.

      The study only mentions a correlation between IL1R1 and relative telomere length but does not provide any potential clinical correlations with patient outcomes or survival. Addressing the clinical relevance of these molecular changes would improve the translational impact.

      The importance of IL1R1 in the prognosis and clinical outcomes of TNBC has been studied by multiple groups. The overall consensus is that higher IL1R1 leads to poor prognosis, aiding both cancer progression and metastasis. Using publicly available TCGA data, we found that IL1R1 high samples had significantly lower survival in breast cancer (BRCA) datasets. The results have now been included in the manuscript as Supplementary Figure 7G.

      Addition in text:

      “We next used publicly available TCGA gene expression data of breast cancer samples (BRCA) (Supplementary file 4) to assess the effect of IL1R1 expression on cancer prognosis. We categorized samples based on IL1R1 expression: IL1R1 high (N=254) and IL1R1 low samples (N=709). Overall patient survival was significantly lower in IL1R1 high samples (log-rank p value = 0.0149) (Supplementary Figure 7G). We also checked the frequency of occurrence of various breast cancer sub-types in IL1R1 high and low samples (Supplementary Figure 7H). While invasive mixed mucinous carcinoma (the most abundant sub-type) was predominantly seen in IL1R1 low samples, metaplastic breast cancer was only found within the IL1R1 high samples. Interestingly, metaplastic breast cancer has been frequently found to be ‘triple negative’, i.e., ER-, PR- and HER2- (Reddy et al., 2020).”

      However, we could not access a TNBC (or any breast cancer dataset) that has been characterized for telomere length. Unfortunately, the clinical TNBC samples that we had access to did not have any paired short-term/long-term survival datasets. We could, in principle, use TERT/TERC expression as a proxy for telomere length; however, in our experiments, we found that telomerase activity did not positively correlate with telomere length as expected (Supplementary Figure 7C, Supplementary Figure 8D). Therefore, transcriptional signature (of telomere-associated genes) may not be a reliable indicator of telomere length.

      The study lacks in-depth mechanistic insights into how telomere length affects IL1R1 expression and subsequently influences TAM infiltration. Further molecular studies or pathway analyses are necessary to elucidate the underlying mechanisms.

      The mechanism involves sequestration and reallocation of TRF2 between telomeres and gene promoters (in this case, IL1R1 promoter). We have previously shown this across multiple genomic sites (Mukherjee et al, 2018). We have appropriately discussed this in the manuscript.

      A schematic explaining the model has been provided as Additional Supplementary Figure 1.

      We have provided ChIP data for TRF2 on the IL1R1 promoter in long/short telomeres in the manuscript, as well as histone/p300 ChIP and gene expression (Figures 1-4 of the main figures deal exclusively with the molecular mechanism of telomere-dependent IL1R1 activation). We further go on to show how specific acetylation on TRF2 might be crucial for TRF2-mediated IL1R1 regulation (Figure 5). One of the key findings herein is that TRF2 can directly regulate IL1R1 expression through promoter occupancy, as tested in telomere-altered cell lines (HT1080, MDAMB231) and tumor xenografts (Figure 1 A, F, I for TRF2 promoter occupancy).

      Pathway analysis of the HT1080 (short vs long telomere) transcriptome shows that cytokine-cytokine receptor interaction is one of the key pathways among the upregulated genes.

      While we have focused on TRF2 mediated IL1R1 regulation, it is quite possible that there are other telomere sensitive pathways/mechanisms by which IL1R1 is regulated. This has been duly acknowledged in the discussion.

      The manuscript title suggests modulation of immune signaling in the tumor microenvironment, yet the authors exclusively focus on CD206+ TAMs, limiting the scope. It is recommended to investigate other immune cell types for a more comprehensive understanding of changes in the immune tumor microenvironment.

      As stated above, we approached the manuscript from the purview of TRF2-mediated IL1R1 regulation. In our assessment of TCGA data for breast cancer, we found that CD206 (MRC1) had the highest enrichment in IL1R1 high samples among key TAM and TIL markers; this is now added as Figure 8A (details in Supplementary file 5). It also had the highest correlation with IL1R1 among the tested markers. Therefore, we proceeded to check CD206 +ve TAMs.

      Now the following section has been added to text:

      “We further found that the total proportion of immune cells (% of CD45 +ve cells) did not vary significantly between short and long telomere TNBC samples (Supplementary Figure 8C). However, TNBC-ST samples had a higher percentage of myeloid cells (CD11B +ve) within the CD45 +ve immune cell population. We checked in three TNBC-ST and TNBC-LT samples each and found that the percentage of M1 macrophages (CD86 high CD206 low) in the myeloid population was lower than that of the M2 macrophages (CD206 high CD86 low) and unlike the latter, did not vary significantly between the TNBC-ST and TNBC-LT samples (Supplementary Figure 8C).”

      Unfortunately, due to sample limitations we are unable to test this on a larger cohort of samples.

      A single-cell transcriptome experiment would have been a good way to obtain more comprehensive immune profiling. However, with our TNBC samples, the nuclei isolated for downstream processing had low viability as per 10x Genomics specifications.

      Does IL1R1 influence TAM recruitment or polarization within the tumor microenvironment? To assess the impact, the authors should use a marker indicative of M1-like macrophages, such as CD80 or CD86.

      To address the issue of TAM recruitment vs polarization meaningfully, we would need to characterize tissue-resident macrophages as well as macrophages in circulation. We did not have access to patient blood. A murine in vivo breast cancer model might be more appropriate for testing this, but it would take considerable time for us to develop. It is something that we hope to address in a follow-up study.

      Did the authors analyze other breast cancer subtypes for telomere length?

      Unfortunately, other breast cancer sub-types besides TNBC were not available to us for experimentation.

      Figure legends are very briefly written and need to be elaborated. Scale bars are also missing in images.

      Add a gating strategy for flow cytometry results in Figure 8A.

      Figure legends have been expanded for clarity. More prominent scale bars have been added for better visibility and reference. A relevant gating strategy has been added as Supplementary Figure 8B.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, entitled "Telomere length sensitive regulation of Interleukin Receptor 1 type 1 (IL1R1) by the shelterin protein TRF2 modulates immune signalling in the tumour microenvironment", Dr. Mukherjee and colleagues set out to clarify the extra-telomeric role of TRF2 in regulating IL1R1 expression, with a consequent impact on TAM tumor infiltration.

      Strengths:

      Upon careful manuscript evaluation, I feel that the presented story is undoubtedly well conceived. At the technical level, experiments have been properly performed and the obtained results support the authors' conclusions.

      Weaknesses:

      Unfortunately, the covered topic is not particularly novel. In detail, the TRF2 capability of binding extratelomeric foci in cells with short telomeres has been well demonstrated in a previous work published by the same research group. The capability of TRF2 to regulate gene expression is well-known, the capability of TRF2 to interact with p300 has been already demonstrated and, finally, the capability of TRF2 to regulate TAMs infiltration (that is the effective novelty of the manuscript) appears as an obvious consequence of IL1R1 modulation (this is probably due to the current manuscript organization).

      Here we studied the TRF2-IL1R1 regulatory axis (not reported earlier by us or others) as a case of the telomere sequestration model that we described earlier (Mukherjee et al., 2018; reviewed in J. Biol. Chem. 2020, Trends in Genetics 2023). This manuscript demonstrates the effect of the TRF2-IL1R1 regulation on telomere-sensitive tumor macrophage recruitment. To the best of our knowledge, no previous study connects telomeres of tumor cells mechanistically to the tumor immune microenvironment. Here we focused on the IL1R1 promoter and provided mechanistic evidence for acetylated-TRF2 engaging the HAT p300 for epigenetically altering the promoter. This mechanism of TRF2 mediated activation has not been previously reported. Further, the function of a specific post translational modification (acetylation of the lysine residue 293K) of TRF2 in IL1R1 regulation is described for the first time. Additional experiments showed that TRF2-acetylation mutants, when targeted to the IL1R1 promoter, significantly alter the transcriptional state of the IL1R1 promoter. To our knowledge, the function of any TRF2 residue in transcriptional activation had not been previously described. Taken together, these demonstrate novel insights into the mechanism of TRF2-mediated gene regulation, that is telomere-sensitive, and affects the tumor-immune microenvironment.

      We considered the reviewer’s suggestion to reorganize the results section. Reorganizing the manuscript to describe the TAM-related results first would, in our opinion, shift the focus away from the new findings and the novelty of the mechanisms of non-telomeric TRF2-mediated IL1R1 regulation (as described in the response above and in our responses to the other reviewers' comments). We have tried to bring out the novelty, implications and importance of the TAM-related observations in the discussion.

      Reviewer #3 (Recommendations For The Authors):

      Based on the comments reported above, I would encourage the author to modify the manuscript by reorganizing the text. I would suggest starting from the capability of TRF2 to modulate macrophages infiltration. Data relative to IL1R1 expression may be used to explain the mechanism through which TRF2 exerts its immune-modulatory role. This, in my view, would dramatically strengthen the presented story.

      Concerning the text, "results" should be dramatically streamlined and background information should be just limited to the "introduction" section.

      The manuscript should be carefully revisited at grammar level. A number of incomplete sentences and some typos are present within the text.

      We thank the reviewer for the appreciation of our work for its technical strengths.

      At the outset, we agree that we have explored the TRF2-IL1R1 regulatory axis. This underscores the significance of the telomere sequestration model that we had proposed earlier (Mukherjee et al., 2018). Herein, however, we significantly extend our previous work (which was more general and intended to put forward the idea of telomere-dependent distal gene expression) by studying TRF2-mediated regulation of IL1 signalling (which was previously unreported). In addition, the mechanistic details of how telomeres are connected to IL1 signalling through non-telomeric TRF2 are entirely new and have not been reported before by us or others.

      We have removed some text descriptions from the result section to streamline the section.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      …several previous studies have identified co-expression of vomeronasal receptors by vomeronasal sensory neurons, and the expression of non-vomeronasal receptors, and this was not adequately addressed in the manuscript as presented.

      We’ve added context and citations to the Introduction and Results sections relating to recent studies on the co-expression of vomeronasal receptors and the expression of non-vomeronasal receptors in VSNs.

      The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret, and the methods are somewhat opaque.

      The Molecular Cartography platform relies on multiplex imaging of fluorescent probes that bind specifically to individual gene transcripts to determine their spatial location. Unfortunately, the detailed protocols remain proprietary to Resolve Biosciences and were not disclosed. We have clarified this in the revised manuscript. Our role in the acquisition and processing of data for this experiment is described in the current Methods section. Additional analyses of the Molecular Cartography data have been added to the supplemental materials (see response to Reviewer #2, below) to help clarify interpretation of the results.

      Reviewer #2:

      …the authors present a biased report of previously published work, largely including only those results that do not overlap with their own findings, but ignoring results that would question the novelty of the data presented here.

      We had no intention of misleading the readers. In fact, we have discussed discrepancies between our results and those of other studies. However, we inadvertently left out a critical publication in preparing the manuscript. We have added context and citations relating to recent studies that use single cell RNA sequencing in the vomeronasal organ, studies relating to the co-expression of vomeronasal receptors, and studies discussing V1R/V2R lineage determination. In the Discussion, we also compared our model with a previous model of the genetic determination of VNO neuronal fate.

      Did the authors perform any cell selectivity, or any directed dissection, to obtain mainly neuronal cells? Previous studies reported a greater proportion of non-neuronal cells. For example, while Katreddi and co-workers (ref 89) found that the most populated clusters are identified as basal cells, macrophages, pericytes, and vascular smooth muscle, Hills Jr. et al. in this work did not report such types of cells. Did the authors check for the expression of marker genes listed in Ref 89 for such cell types?

      For VNO dissections, we removed bones and blood vessels from VNO tissue and only kept the sensory epithelium. This procedure removed vascular smooth muscle cells, pericytes, and other non-neuronal cell types, which explains differences in cell proportions between our study and previous studies. We used a DAPI/Draq5 assay to sort live/nucleated cells for sequencing and no specific markers were used for cell selection. All cells in the experiment were successfully annotated using the cell-type markers shown in Fig. 1B, save for cells from the sVSN cluster, which were novel, and required further analysis to characterize.

      The authors should report the marker genes used for cell annotation.

      Marker genes used for cell annotation are shown in figure 1B. A full list of all marker genes used in the cell annotation process has been added to the Methods section.

      The authors reported no differences between juvenile and adult samples, and between male and female samples. It is not clear how they evaluate statistically significant differences, which statistical test was used, or what parameters were evaluated.

      The claims made about male/female mice and P14/P56 mice directly pertain to the distribution of clusters and cells in UMAP space as seen in Figure 1 C & D. We have performed differential gene expression analysis for male/female and P14/P56 comparisons using the FindMarkers function from the Seurat R package. Although we have found significant differential expression between male and female, and between P14 and P56 animals, the genes in this list do not appear to be influential for the neuronal lineage and cell type specification or related to cell adhesion molecules, which are the main focuses of this study. Nevertheless, we have added these results to the supplemental materials.

      ‘Based on our transcriptomic analysis, we conclude that neurogenic activity is restricted to the marginal zone.’ This conclusion is quite a strong statement, given that this study was not directed to carefully study neurogenesis distribution, and when neurogenesis in the basal zone has been proposed by other works, as stated by the authors.

      We have used fourteen slides from whole VNO sections in our Molecular Cartography analysis to quantify the number of GBCs, INPs, and iVSNs predicted in the marginal zone, the intermediate zone, and main/medial zone. We have performed a Wilcoxon signed-rank test to check for the significant presence of GBCs, INPs, and iVSNs in the marginal zone over their presence in the main/medial zone. The results are included in new Figure S3. The result from this analysis justifies our claim that neurogenesis is restricted to the MZ. This claim is also supported by the 2021 study by Katreddi & Forni.
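
      The paired comparison described above can be sketched as follows. This is only an illustrative, standard-library implementation of an exact one-sided Wilcoxon signed-rank test; the per-slide counts are hypothetical placeholders rather than data from the study, and the function name is our own.

```python
def wilcoxon_signed_rank_greater(x, y):
    """Exact one-sided Wilcoxon signed-rank test (H1: x tends to exceed y).

    Returns (W_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Exact null distribution: each rank carries a + or - sign with
    # probability 1/2. Ranks are doubled so that tied (half) ranks stay integer.
    doubled = [int(round(2 * r)) for r in ranks]
    counts = {0: 1}
    for r in doubled:
        updated = {}
        for s, c in counts.items():
            updated[s] = updated.get(s, 0) + c          # rank gets a minus sign
            updated[s + r] = updated.get(s + r, 0) + c  # rank gets a plus sign
        counts = updated
    target = int(round(2 * w_plus))
    p_value = sum(c for s, c in counts.items() if s >= target) / 2 ** n
    return w_plus, p_value


# Hypothetical counts of progenitor-like cells per slide (14 slides).
marginal_zone = [12, 9, 15, 11, 8, 14, 10, 13, 7, 16, 9, 12, 11, 10]
main_zone = [2, 1, 4, 3, 0, 5, 2, 3, 1, 6, 2, 4, 1, 3]

w_plus, p_value = wilcoxon_signed_rank_greater(marginal_zone, main_zone)
print(f"W+ = {w_plus}, one-sided exact p = {p_value:.2e}")
```

      With real data, an established implementation in a statistics library would normally be preferred; the version here is written out only to make the ranking and the sign-flip null distribution explicit.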

      The authors report at least two new types of sensory neurons in the mouse VNO, a finding of huge importance that could have a substantial impact on the field of sensory physiology. However, the evidence for such new cell types is based solely on this transcriptomic dataset and, as such, is quite weak, since many crucial morphological and physiological aspects would be missing to clearly identify them as novel cell types. As stated before, many control and confirmatory experiments, and a careful evaluation of the results presented in this work must be performed to confirm such a novel and interesting discovery. The reported "novel classes of sensory neurons" in this work could represent previously undescribed types of sensory neurons, but also previously reported cells (see below) or simply possible single-cell sequencing artefacts.

      The reviewer is correct that detailed morphological and physiological studies are needed to further understand these cells. This is an opinion we share. Our paper is primarily intended as a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset and discoveries based on the transcriptomic data that can support and inspire ongoing and future experiments in the field. Nonetheless, we are confident that neither of the novel cell clusters is the result of sequencing artefacts. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package SoupX, multiplet cell detection and removal with the Python module Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are physically connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent expression of genes (new Figure S4H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. We have performed pseudotime analysis of sVSNs, and differential gene expression and gene ontology analysis of mOSNs. The results are shown in the new Figure S6.
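
      As a rough illustration of the last quality-control step mentioned above, the sketch below applies a 5% mitochondrial cut-off to toy per-cell count dictionaries. It assumes that mitochondrial genes are identified by the mouse-style "mt-" prefix and that cells at or below the threshold are kept; both are assumptions, and the SoupX and Scrublet steps are not reproduced here. The cell names and counts are invented for the example.

```python
def percent_mito(counts, mito_prefix="mt-"):
    """Percentage of a cell's total counts that map to mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(c for gene, c in counts.items() if gene.startswith(mito_prefix))
    return 100.0 * mito / total


def filter_cells(cells, max_percent_mito=5.0):
    """Keep cells whose mitochondrial percentage does not exceed the cut-off."""
    return {
        cell_id: counts
        for cell_id, counts in cells.items()
        if percent_mito(counts) <= max_percent_mito
    }


# Toy cells (hypothetical counts, not taken from the dataset).
cells = {
    "cell_a": {"Omp": 90, "Gnai2": 8, "mt-Co1": 2},  # 2% mitochondrial
    "cell_b": {"Omp": 40, "mt-Co1": 10},             # 20% mitochondrial
}

kept = filter_cells(cells)
print(sorted(kept))
```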

      The authors report the co-expression of V2R and Gnai2 transcripts based on sequencing data. That could dramatically change classical classifications of basal and apical VSNs. However, did the authors find support for this co-expression in spatial molecular imaging experiments?

      Genes with extremely high expression levels overwhelm signals from other genes, and therefore had to be removed from the experiment. This is a limitation of the Molecular Cartography platform. Unfortunately, Gnai2 was determined to be one of these genes and was not evaluated for this purpose.

      Canonical OSNs: The authors report a cluster of cells expressing neuronal markers and ORs and call them canonical OSN. However, VSNs expressing ORs have already been reported in a detailed study showing their morphology and location inside the sensory epithelium (References 82, 83). Such cells are not canonical OSNs since they do not show ciliary processes, they express TRPC2 channels and do not express Golf. Are the "canonical OSNs" reported in this study and the OR-expressing VSNs (ref 82, 83) different? Which parameters, other than Gnal and Cnga2 expression, support the authors' bold claim that these are "canonical OSNs"? What is the morphology of these neurons? In addition, the mapping of these "canonical OSNs" shown in Figure 2D paints a picture of the negligible expression/role of these cells (see their prediction confidence).

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell type in the data. After performing differential gene expression on the putative mOSN cluster, comparing it with V1R and V2R VSNs independently, GO analysis returned ‘cilium’ as the top significantly enriched GO cellular component. This new piece of data is presented in the updated Figure S6. Because we were limited to a list of 100 genes in the Molecular Cartography probe panel, we prioritized the detection of canonical VNO cell types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs.

      Secretory VSN: The authors report another novel type of sensory neurons in the VNO and call them "secretory VSNs". Here, the authors performed an analysis of differentially expressed genes for neuronal cells (dataset 2) and found several differentially expressed genes in the sVSN cluster. However, it would be interesting to perform a gene expression analysis using the whole dataset including neuronal and non-neuronal cells. Could the authors find any marker gene that unequivocally identifies this new cell type?

      We did not find unequivocal marker genes for sVSNs. We did perform differential analysis of the sVSN cluster with whole VNO data and with the neuronal subset, as well as against specific cell-types. We could not find a single gene that was perfectly exclusive to sVSNs. We used a combinatorial marker-gene approach to predicting sVSNs in the Molecular Cartography data. This required a larger subset of our 100 gene panel to be dedicated to genes for detecting sVSNs.

      When the authors evaluated the distribution of sVSN using the Molecular Cartography technique, they found expression of sVSN in both sensory and non-sensory epithelia. How do the authors explain such unexpected expression of sensory neurons in the non-sensory epithelium?

      In our scRNA-Seq experiment, blood vessels were removed, limiting the power to distinguish between certain cell types. Because of the limited number of genes that we can probe using Molecular Cartography, some of the genes associated with sVSNs may also be present in the non-sensory epithelium. This could lead to the identification of cells in the non-sensory epithelium that may or may not be identical to the sVSNs. Indeed, further studies will need to be conducted to determine the specificity of these cells.

      The low total genes count and low total reads count, combined with an "expression of marker genes for several cell types" could indicate low-quality beads (contamination) that were not excluded with the initial parameter setting. It looks like cells in this cluster express a bit of everything V1R, V2R, OR, secretory proteins.

      We are confident that the putative sVSN cell cluster is not the result of low-quality cells. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package SoupX, multiplet cell detection and removal with the Python module Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent expression of genes (Fig. S1H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. Moreover, while some genes were expressed at a lower level than in the canonical VSNs, others were expressed at higher levels, which rules out an overall loss of gene counts as the cause of the discrepancy.

      The authors wrote ‘...the transcriptomic landscape that specifies the lineages is not known...’. This statement is not completely true, or at least misleading. There are still many undiscovered aspects of the transcriptomics landscape and lineage determination in VSNs. However, the authors cannot ignore previously reported data showing the landscape of neuronal lineages in VSNs (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). Expression of most of the transcription factors reported by this study (Ascl1, Sox2, Neurog1, Neurod1...) was already reported, and for some of them, their role was investigated during early developmental stages of VSNs (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). In summary, the authors should fully include the findings from previous works (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259), and clearly state what has already been reported, what is contradictory, and what is new when compared with the results from this work.

      This is a difference of opinion about terminology. In our paper, “transcriptomic landscape” refers to genome-wide expression by individual cells, not just the expression of individual genes. The reviewer is correct that many of the genetic specifiers have been identified, and we have cited and discussed them. We consider these studies as providing a “genetic” underpinning, rather than the “transcriptomic landscape”, of lineage progression. To avoid confusion, we have revised the statement to “… the transcriptional program that specifies the lineages is not known.”

      …the co-expression of specific V2Rs with specific transcription factors does not imply a direct implication in receptor selection. Directed experiments to evaluate the VR expression dependent on a specific transcription factor must be performed.

      The reviewer is correct, and we did not claim that the co-expression of specific transcription factors indicates a direct relationship with receptor selection. We agree that further directed experiments are required to investigate this question.

      This study reports that transcription factors, such as Pou2f1, Atf5, Egr1, or c-Fos could be associated with receptor choice in VSNs. However, no further evidence is shown to support this interaction. Based on these purely correlative data, it is rather bold to propose cascade model(s) of lineage consolidation.

      The reviewer is correct. As any transcriptomic study will only be correlative, additional studies will be needed to unequivocally determine the mechanistic link between the transcription factors and receptor choice. Our model provides a basis for these studies.

      The authors use spatial molecular imaging to evaluate the co-expression of many chemosensory receptors in single VNO cells. […] However, it is difficult to evaluate and interpret the results due to the lack of cell borders in spatial molecular imaging. The inclusion of cell border delimitation in the reported images (membrane-stained or computer-based) could be tremendously beneficial for the interpretation of the results.

      The most common practice for cell segmentation of spatial transcriptomics data is to determine cell borders based on nuclear staining with expansion. We have tested multiple algorithms based on recent studies, but each has its own caveat.

      It is surprising that the authors reported a new cell type expressing OR, however, they did not report the expression of ORs in Molecular Cartography technique. Did the authors evaluate the expression of OR using the cartography technique?

      We were limited to a 100-gene probe panel and only included one OR. The expression was not high enough for us to substantiate any claims.

      Reviewer #3:

      (1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Figure 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell type in the data. We have performed differential gene expression analysis on the putative mOSN cluster to compare it with V1R and V2R VSNs. GO analysis returned the top significantly enriched GO terms, including many related to “cilium”, further supporting that these are OSNs. Because we were limited to a list of 100 genes in the Molecular Cartography probe panel, we prioritized the detection of canonical VNO cell types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs. With regard to Gnai2 and Go expression, we have examined our data from OSNs dissociated from the olfactory epithelium and detected substantial expression of both. This new analysis provides additional support for our claim. We now present differentially expressed genes and GO term analysis of the mOSN class in the updated Figure S6.

      (2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.

      We wholeheartedly agree with this assessment. Our initial thought was that these were senescent VSNs, but the trajectory analysis did not support this scenario, leading us to propose that these are putative secretory cells. Our analysis also shows that overall, 46% of the putative sVSNs were from the P14 sample and 54% from P56. These cells comprise roughly 6.4% of all P14 cells and 8.5% of P56 cells. In comparison, 28.4% of all cells are mature V1R VSNs at P14, but the percentage rises to 46.7% at P56. The significant presence of sVSNs at P14, and the disproportionate increase when compared with mature VSNs, indicate that these are unlikely to be late developmental stage or senescent cells, although we cannot exclude these possibilities.

      We have included the sVSNs in a trajectory inference analysis and found that the pseudotime values of the sVSNs are within the range of those cells within the V1R and V2R lineages, indicating a similar maturity (Fig. S6).
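
      The comparison of proportions in this response can be made explicit with a small calculation. The percentages below are the ones quoted above; expressing the change as a P56/P14 ratio is just one possible way to read "disproportionate" and is our own framing, not an analysis taken from the manuscript.

```python
# Percent of all cells in each sample, as quoted in the response above.
percent_of_cells = {
    "sVSN": {"P14": 6.4, "P56": 8.5},
    "mature V1R VSN": {"P14": 28.4, "P56": 46.7},
}


def fold_change(fractions):
    """Ratio of the P56 percentage to the P14 percentage."""
    return fractions["P56"] / fractions["P14"]


fold = {cell_type: fold_change(f) for cell_type, f in percent_of_cells.items()}
for cell_type, value in fold.items():
    print(f"{cell_type}: {value:.2f}-fold change from P14 to P56")
```

      Under this reading, the share of sVSNs grows by about a third between P14 and P56, whereas the share of mature V1R VSNs grows by roughly two thirds.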

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs. 

      The reviewer raises a good point. We did not observe differences between male and female mice, or between P14 and P56 mice, in the distribution of clusters and cells in UMAP space. However, our differential expression analysis has revealed significantly differentially expressed genes in both comparisons. Results from these analyses are presented in the new Figures S1 and S2.

      (4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, i.e. sustentacular, HBC, or other cells?

      Sparsely expressed low counts of VR and OR genes were observed in non-neuronal cell-types. When their expression as a percentage of cell-level gene counts is considered, however, the expression is negligible when compared to the neurons. The observed expression may be explained by stochastic base-level expression, or it may be the result of remnant ambient RNA that passed filtering.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review): 

      Summary: 

      The fungal cell wall is a very important structure for the physiology of a fungus but also for the interaction of pathogenic fungi with the host. Although a lot of knowledge on the fungal cell wall has been gained, there is a lack of understanding of the meaning of ß-1,6-glucan in the cell wall. In the current manuscript, the authors studied in particular this carbohydrate in the important human-pathogenic fungus Candida albicans. The authors provide a comprehensive characterization of cell wall constituents under different environmental and physiological conditions, in particular of ß-1,6-glucan. Also, β-1,6-glucan biosynthesis was found to be likely a compensatory reaction when mannan elongation was defective. The absence of β-1,6-glucan resulted in a significantly sick growth phenotype and complete cell wall reorganization. The manuscript contains a detailed analysis of the genetic and biochemical basis of ß-1,6-glucan biosynthesis which is apparently in many aspects similar to yeast. Finally, the authors provide some initial studies on the immune modulatory effects of ß-1,6-glucan. 

      Strengths: 

      The findings are very well documented, and the data are clear and obtained by sophisticated biochemical methods. It is impressive that the authors successfully optimized methods for the analyses and quantification of ß-1-6-glucan under different environmental conditions and in different mutant strains. 

      Weaknesses: 

      However, although already very interesting, at this stage there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors provide the first (to my knowledge) detailed characterization of cell wall b-1,6 glucan in the pathogen Candida albicans. The approaches range from biochemistry to genetics to immunology. The study provides fundamental information and will be a resource of exceptional value to the field going forward. Highlights include the construction of a mutant that lacks all b-1,6 glucan and the characterization of its cell wall composition and structure. Figure 5a is a feast for the eyes, showing that b-1,6 glucan is vital for the outer fibrillar layer of the cell wall. Also much appreciated was the summary figure, Figure 7, which presents the main findings in digestible form.

      Strengths: 

      The work is highly significant for the fungal pathogen field especially, and more broadly for anyone studying fungi, antifungal drugs, or antifungal immune responses.

      The manuscript is very readable, which is important because most readers will be cell wall nonspecialists.

      The authors construct a key quadruple mutant, which is not trivial even with CRISPR methods, and validate it with a complemented strain. This aspect of the study sets the bar high. The authors develop new and transferable methods for b-1,6 glucan analysis. 

      Weaknesses: 

      The one "famous" cell type that would have been interesting to include is the opaque cell. This could be included in a future paper.

      Reviewer #3 (Public Review): 

      Summary: 

      The cell wall of human fungal pathogens, such as Candida albicans, is crucial for structural support and modulating the host immune response. Although the cell wall has been extensively studied in yeasts and molds, work on its structural composition has largely focused on the structural glucan β-1,3-glucan and the surface-exposed mannans, while the fibrillar component β-1,6-glucan, a significant component of the cell wall, has been largely overlooked. This comprehensive biochemical and immunological study by a highly experienced cell wall group provides a strong case for the importance of β-1,6-glucan contributing critically to cell wall integrity, filamentous growth, and cell wall stability resulting from defects in mannan elongation. Additionally, β-1,6-glucan responds to environmental stimuli and stresses, playing a key role in wall remodeling and immune response modulation, making it a potential critical factor for host-pathogen interactions.

      Strengths: 

      Overall, this study is well-designed and executed. It provides the first comprehensive assessment of β-1,6-glucan as a dynamic, albeit underappreciated, molecule. The role of β-1,6-glucan genetics and biochemistry has been explored in molds like Aspergillus fumigatus, but this work shines an important light on its role in Candida albicans. This is important work that is of value to Medical Mycology, since β-1,6-glucan plays more than just a structural role in the wall. It may serve as a PAMP and a potential modulator of host-pathogen interactions. In keeping with this important role, the manuscript's rigor would benefit from a more physiological evaluation, ex vivo and preferably in vivo, of how β-1,6-glucan stimulates the immune system within the cell wall and not just as a purified component. This is a critical outcome measure for this study and gets squarely at its importance for host-pathogen interactions, especially in response to environmental stimuli and drug exposure.

      Response to reviewers (Public reviews):

      We thank all the three reviewers for their opinion on our work on Candida albicans β-1,6-glucan, which highlights the importance of this cell wall component in the biology of fungi. Here are our responses to their comments for public reviews:

      (1) Indeed, the data presented for immunological studies are preliminary. The reviewers have acknowledged that our analysis, which provides insights into the biosynthetic pathways involved, is comprehensive in dealing with the organization and dynamics of the β-1,6-glucan polymer in relation to other cell wall components and environmental conditions (temperature, stress, nutrient availability, etc.). However, we anticipated that there would be immediate curiosity as to what the immunological contribution of β-1,6-glucan is, and we therefore felt we needed to initiate these studies and include them. We therefore performed immunological studies to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP), and if so, what its immunostimulatory potential is. Our data clearly suggest that β-1,6-glucan is a PAMP, and consequently lead to several questions: (a) what are the host immune receptors involved in the recognition of this polysaccharide, and thereby the downstream signaling pathways, (b) how is β-1,6-glucan differentially recognized by the host when C. albicans switches from a commensal to an opportunistic pathogen, and (c) how does the host environment impact the exposure of this polysaccharide on the fungal surface. We believe addressing these questions is beyond the scope of the present manuscript, and we aim to present new data in a future manuscript. Nonetheless, in the revised manuscript, we suggest approaches that we can take to identify the receptor that could be involved in the recognition of β-1,6-glucan. Moreover, we have modified the discussion so that it is based on the data rather than being descriptive.

      (2) It will be interesting to assess the organization of β-1,6-glucan and other cell wall components in the opaque cells. It is documented that the opaque cells are induced at acidic pH and in the presence of N-acetylglucosamine and CO2. Our data shows that pH has an impact on β-1,6-glucan, which suggests that there will be differential organization of this polysaccharide in the cell wall of opaque cells. As suggested by the reviewer, we will include analysis of opaque cells (and other C. albicans cell types) in future studies. 

      With the exception of these major new avenues for this research, our revision can address each of the comments provided by the reviewers.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Although the study is very interesting, there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

      Specifically: 

      (1) As you showed, defects in chitin content led to a decrease in the cross-linking of β-glucans in the inner wall that corresponded to the effect of nikkomycin-treated C. albicans phenotype; conversely, an increase in chitin content led to more cross-linking of β-glucans as observed in the FKS1 mutant or in the presence of caspofungin. What is the mechanistic reason for these observations? 

      On one hand, yeast cell wall chitin occurs in three forms: free, covalently linked to β-1,3-glucan, or covalently linked to β-1,6-glucan; crosslinked β-glucan-chitin forms a core fibrillar structure resistant to alkali. A decrease in the chitin content therefore affects β-glucan-chitin crosslinking, making β-glucan alkali-soluble. On the other hand, a decrease in the β-glucan content, as in the FKS1 mutant or upon caspofungin treatment, results in increased cell wall chitin and β-glucan-chitin contents. A decrease in β-1,3-glucan biosynthesis is associated with upregulation of CRH1, which is involved in β-glucan-chitin crosslinking; this explains the increased β-glucan-chitin content in the FKS1 mutant or upon caspofungin treatment. We have included this discussion in the revised manuscript (p14, lines 2-10).

      (2) The β-1,6-glucan biosynthesis is stimulated via a compensatory pathway when there is a defect in O- and N-linked cell wall mannan biosynthesis. Why? causality? Hypothesis?  

      Two phenomena were observed related to β-1,6-glucan and mannan biosynthesis: 1) a defect in the elongation of N-mannan led to an increase in the β-1,6-glucan content; 2) a defect in O-mannan elongation reduced the size of β-1,6-glucan chains but increased their branching. These observations suggest a global rescue program for cell wall damage that could occur due to a defect in one of the cell wall components. We have discussed this in the revised manuscript (p14, last paragraph; p15, first paragraph). Moreover, β-1,3-glucan and chitin are synthesized by their respective membrane-bound synthases, and a defect in the synthesis of one is compensated by the other. In line with this, although it needs to be validated for β-1,6-glucan, biosynthesis of mannan and β-1,6-glucan seems to initiate intracellularly. Therefore, one possibility is that defective mannan biosynthesis is compensated by β-1,6-glucan biosynthesis, but this needs to be validated experimentally.

      (3) You showed that the removal of β-1,6-glucan by periodate oxidation (AI-OxP) led to a significant decrease in the IL-8, IL-6, IL-1β, TNF-α, C5a, and IL-10 released, suggesting that their stimulation was in part β-1,6-glucan dependent. What is the consequence of the stimulation, e.g. better phagocytosis, etc.? This needs some more experiments, otherwise the data is purely descriptive, as the conclusion. Also, what do you want to show with the activation of the complement system? Is ß-1,6-glucan detected by complement receptors? I think this is really a loose end. I think it is necessary to provide more data on this observation, which I think lacks control with serum lacking complement, this should then be moved to the main manuscript. 

      In this study, our aim was to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP) of C. albicans and, if so, what its immunostimulatory capacity/potential is. Our data confirm that β-1,6-glucan indeed acts as a PAMP, and that its removal significantly reduces the immunostimulatory capacity of the fibrillar core structure of the C. albicans cell wall. In addition, data provided in the revised manuscript (see updated Figure S14, discussion p13 lines 16-21) indicate that human serum factors significantly enhance the immunostimulatory capacity of β-1,6-glucan and that β-1,6-glucan interacts with the complement component C3b. However, addressing the role of β-1,6-glucan in phagocytosis using a β-1,6-glucan deletion mutant will not be possible, as the cell wall of this mutant is modified and β-1,6-glucan is not the only cell wall component interacting with C3b. An alternative is to coat β-1,6-glucan on beads and use them to study phagocytosis and identify immune receptors; however, these experiments are beyond the scope of our present study.

      (4) Also, you suggested that β-1,6-glucan and β-1,3-glucan stimulate innate immune cells in distinct ways. Please provide more data on this interesting suggestion. You can block the dectin-1 receptor for example or use dectin-1 deficient macrophages from mice. The part on the immune stimulation needs to be optimized. 

      Stimulation of immune cells by pustulan (insoluble linear β-1,6-glucan) via a dectin-1-independent pathway has been described previously (PMIDs: 18005717, 16371356), as discussed in the manuscript. Our preliminary data indicate that dectin-1 blocking on immune cells (using anti-dectin-1 antibodies) has no effect on the immunostimulatory potential of β-1,6-glucan, unlike AI and AI-OxP, which showed significantly reduced cytokine secretion by the immune cells upon dectin-1 blocking. Deciphering β-1,6-glucan recognition and its immunomodulatory pathways is underway and will be the subject of a future manuscript.

      (5) β-1,6-glucan and mannan productions are coupled. What is the hypothesis? Is it due to the necessity of mannan residues in β-1,6-glucan biosynthesis enzymes from the ER? Can that be experimentally proven?

      β-1,6-glucan and mannan synthesis appear to be coupled in two ways. First, as mentioned above (Response 2), defects in mannan elongation led to an alteration of β-1,6-glucan production. Second, defects in the early steps of N-glycosylation led to a strong reduction of β-1,6-glucan size and of its cell wall content. However, we do not believe that the synthesis of N-glycan is required for the synthesis of an acceptor essential to β-1,6-glucan synthesis. A defect in N-mannan elongation led to a global cell wall remodeling, as described above. Kre5, Rot2 and Cwh41 are part of the calnexin cycle involved in the control of N-glycoprotein folding in the ER, suggesting that some proteins directly involved in β-1,6-glucan synthesis require folding quality control to be active. We modified our discussion accordingly, highlighting these points (p14, last paragraph, p15 second paragraph).

      (6) 'As PHR1 and PHR2 genes are strongly regulated by external pH, the compensatory differences described may be explained by pH-dependent regulation of β-1,6-glucan synthesis.' Please check. Also, could the pH regulation form the basis of e.g. differences you found for β-1,6-glucan under different environmental conditions, i.e., growth on different carbon sources leads to different external pH values, as shown for many fungi?

      We agree that environmental pH depends on the carbon source and that pH varies during the growth curve. To test the effect of pH, we buffered the medium with 100 mM MOPS or MES. Fig. 2 and S1 clearly show that pH has an effect on the cell wall composition and polymer exposure, as previously described (PMID: 28542528). Here, we show that pH has an impact on β-1,6-glucan size as well as its branching. However, in buffered medium, the addition of an organic acid (such as acetate, propionate, butyrate or lactate) had an impact on cell wall composition, showing that pH is not the only factor affecting cell wall composition. Regarding the _phr1_Δ/Δ and _phr2_Δ/Δ mutants, we believe that the difference in cell wall composition observed between the mutants is mainly due to pH-dependent regulation, which we indicated in the discussion (p14, end of first paragraph).

      Minor: 

      (1) In Figure 7B: dynamism should be replaced by dynamic and in term is rather in terms.  

      Modified as suggested.

      (2) Replace molecular size with molecular mass when you give daltons. 

      Molecular size has been replaced by molecular weight, when presented as daltons.

      (3) Page 7: for explanation, please add that nikkomycin is a chitin biosynthesis inhibitor.   

      As suggested, we now explain that nikkomycin is a chitin biosynthesis inhibitor.

      Reviewer #2 (Recommendations For The Authors):

      (1) I wondered if the increased chitin content of hyphae might reflect growth on the precursor GlcNAc. Have you tested hyphae that are induced in other ways? (2) Related to point 1, did you look at the relative abundance of yeast vs hyphae in the preparation? I wonder if yeast contamination might have reduced the extent of the composition changes observed. 

      We used GlcNAc as the hyphal inducer because: 1) in the presence of GlcNAc, hyphae are produced without any yeast contamination; in this condition, we observed an increase in the chitin content of hyphae, as described (PMID: 16423067); 2) we excluded the use of serum, another condition inducing hyphal formation, as we could not control serum factors that may impact cell wall composition. We now indicate in the Methods section that hyphae induced by GlcNAc were not contaminated by yeast (p17, line 3).

      (3) I recommend rephrasing the first sentence of the Figure 2 legend: "Cells were grown in liquid SD medium at 37 °C at exponential phase under different growth conditions." The conditions varied extensively - stationary is not exponential; biofilm is probably not exponential. Also, the "D" in "SD" stands for dextrose, and the carbon source varied a good deal. Perhaps you could say: "Cells were grown in liquid synthetic medium at 37 °C under different growth conditions, as specified in Methods."

      Sentences have been rephrased.  

      (4) Figure 7b has a typo: "dependant" for "dependent".

      The typo has been corrected.

      Reviewer #3 (Recommendations For The Authors):

      To explore the biochemical composition of the cell wall, the authors fractionated the wall component into three categories based on polymer properties and reticulations: sodium-dodecyl-sulphate-β-mercaptoethanol (SDS-β-ME) extract, alkali-insoluble (AI), and alkali-soluble (AS) fractions, and they developed several independent methods to distinguish between β-1,3-glucans and β-1,6-glucans. The composition and surface exposure of fungal cell wall polymers is known to depend on environmental growth conditions. It was shown that the cell wall of C. albicans hyphae increased chitin content (10% vs. 3%) and decreased β-1,6-glucan (18% vs. 23%) and mannan (13% vs. 20%) compared to the yeast form, and the reduced β-1,6-glucan content was associated with a smaller β-1,6-glucan size (43 vs. 58 kDa), suggesting that both the content and structure of β-1,6-glucan are regulated during growth and cellular morphogenesis. Similar behavior was observed when exposing cells to acid and neutral medium pH. The most significant cell wall alteration occurred in a lactate-containing medium, which led to a sharp reduction in structural core polysaccharides: chitin (-43%), β-1,3-glucan (-48%), and β-1,6-glucan (-72%). This reduction aligns with the previously observed decreases in inner cell wall layer thickness. As expected, the authors found that modulating chitin content genetically (chs3Δ/Δ knockout mutant) led to an increase of both β-1,3-glucan and β-1,6-glucan. An increase in chitin content following genetic alteration of FKS genes impacting glucan synthase or after exposure to the echinocandin caspofungin led to enhanced cross-linking of β-glucans. A slight increase in the β-1,3-glucan branching was also observed in the mnt1/mnt2Δ/Δ double mutant, suggesting that β-1,6-glucan and mannan synthesis may be coupled.

      - This effect is not that pronounced, and the relationship appears somewhat overstated and may reflect an indirect interaction. The authors should address accordingly. 

      We agree that this sentence was overstated. To make it clearer and less pronounced, we divided it into two more cautious statements (p8, line 34).

      The genetics of β-1,6-glucan biosynthesis appear complex and a figure describing putative roles for specific genes would be beneficial. For example, KRE6 is a glucosyl hydrolase required for β-1,6-glucan biosynthesis.

      - It would be valuable to better understand the overall biosynthetic process. Please elaborate more in a figure. 

      Although proteins/enzymatic activities directly involved in the β-1,6-glucan biosynthesis have not yet been identified, as suggested by this reviewer, we included a schematic representation of this process based on our hypothesis (Figure S15, and p15 lines 17-22 in revised manuscript), indicating the possible involvement of Kre6p.  

      The deletion of KRE6 homologs, essential for β-1,6-glucan biosynthesis, resulted in the absence of β-1,6-glucan production, and significant structural alterations of the cell wall. This result nicely confirms the important role of β-1,6-glucan in regulating cell wall homeostasis. The absence of β-1,6-glucan was associated with increased (mutant v. WT) chitin content (9.5% vs. 2.5%) and highly branched β-1,3-glucan (48% vs. 20%). TEM ultrastructure studies nicely showed the change in cell wall overall architecture. From a drug discovery perspective, since the blockade of β-1,6-glucan did not block growth, it may have more value as a potential virulence target. This would be valuable but needs to be assessed in animal model challenge competition experiments.

      - The authors may want to elaborate more. 

      We agree and modified “antifungal target” as “potential virulence target”.

      It is well known that β-1,3-glucan, mannan, and chitin function serve as PAMPs, which induce immune responses. The role of β-1,6-glucan as a PAMP is not well understood, and the authors provide evidence that different cell wall extracted fractions with enriched constituents induce immune responses invoking cytokines, chemokines, and acute phase proteins, as well as the complement system. While this data clearly shows that β-1,6-glucan is immunologically active and potentially important for host-pathogen interactions, the analysis is preliminary and falls short of making this case. 

      - This is a critical point in getting at the potential host signaling of β-1,6-glucan contained in the cell wall or shed by the cell (is this known?)

      - This analysis would be bolstered significantly by examining stimulation relative to other cell wall components, and most importantly, whole cell modulation of β-1,6-glucan exposure for immune presentation, and not just unnatural concentrated extracts. This can be readily accomplished with the various mutants in hand, as well as after exposure to various antifungal agents (echinocandins and nikkomycins) (see Hohl et al. 2008 JID). Additional validation would benefit from animal model studies to examine in vivo immune modulation.

      We agree with the reviewer. However, the main focus of our present work was to study the organization and dynamics of C. albicans cell wall β-1,6-glucan, and to explore its possible role as a pathogen-associated molecular pattern (PAMP). Our study indicates that β-1,6-glucan indeed acts as a PAMP with immunostimulatory potential. As pointed out by this reviewer, and similar to β-1,3-glucans, the exposure of β-1,6-glucan is probably a key point in the immune response. However, this investigation is beyond the scope of this study; it is underway and will be presented in our future work.

      - The Discussion would also benefit from an analysis of how β-1,6-glucan in Aspergillus fumigatus, which was largely elucidated by the same primary authors. 

      To our knowledge, β-1,6-glucan has never been identified, either by chemical analysis (PMID: 10869365; PMID: 36836270) or solid-state NMR (PMID: 34732740), in the cell wall of A. fumigatus, although a homolog of KRE6 is present in A. fumigatus but with unknown function.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their detailed comments. Several comments revolved around potential improvements in the 3D reconstructions that are obtained in later steps of the image processing pipelines for single-particle cryoEM and cryo-electron tomography. We have not investigated how our improvements in CTFFIND5 affect these downstream results and can therefore not make specific and quantitative statements in this regard. However, CTFFIND5 provides additional information about the sample (thickness, tilt) that users will find useful for selecting the data they would like to include in later processing, and for deciding how to process them. Furthermore, when the sample tilt of a thin specimen is known, local defocus estimates (e.g., per-particle defocus estimates) will be more accurate compared to estimates that ignore tilt information. In the following, we provide point-by-point responses to the reviewers’ comments.

      Reviewer #1 (Public Review):

      This work presents CTFFIND5, a new version of the software for determination of the Contrast Transfer Function (CTF) that models the distortions introduced by the microscope in cryoEM images. CTFFIND5 can take acquisition geometry and sample thickness into consideration to improve CTF estimation.

      To estimate tilt (tilt angle and tilt axis), the input image is split into tiles and correlation coefficients are computed between their power spectra and a local CTF model that includes the defocus variation according to a tilted plane. As a final step, by applying a rescaling factor to the power spectra of the tiles, an average tilt-corrected power spectrum is obtained and used for diagnostic purposes and to estimate the goodness of fit. This global procedure and the rescaling factor resemble those used in Bsoft, Warp, etc, with determination of the tilt parameters being a feature specific to CTFFIND5 (and formerly CTFTILT). The performance of the algorithm is evaluated with tilted 2D crystals and tilt-series, demonstrating accurate tilt estimation in some cases and some limitations in others. Further analysis of CTF determination with tilt-series, particularly showing whether there is accurate or stable estimation at high tilts, might be helpful to show the robustness of CTFFIND5 in cryoET.
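
      The tilted-plane defocus model mentioned above can be illustrated with a minimal sketch. It is not CTFFIND5's code: the function names, the sign convention (positive distance from the axis means larger defocus) and the choice of measuring the tilt axis angle from the x axis are all assumptions made for illustration.

```python
import math

def local_defocus(x, y, defocus_center, tilt_axis_deg, tilt_angle_deg):
    """Defocus (Å) at position (x, y), given in Å relative to the image center.

    Assumed convention (hypothetical, not taken from CTFFIND5): the tilt axis
    passes through the image center at tilt_axis_deg from the x axis, and
    points on the positive side of the axis have larger defocus.
    """
    phi = math.radians(tilt_axis_deg)
    theta = math.radians(tilt_angle_deg)
    # Signed perpendicular distance from the tilt axis.
    d = -x * math.sin(phi) + y * math.cos(phi)
    # The height change along the tilted plane adds directly to the defocus.
    return defocus_center + d * math.tan(theta)

def tile_defoci(image_size_px, tile_size_px, pixel_size, defocus_center,
                tilt_axis_deg, tilt_angle_deg):
    """Return (x, y, defocus) for the center of every tile of a square image."""
    n = image_size_px // tile_size_px
    half = image_size_px / 2.0
    result = []
    for iy in range(n):
        for ix in range(n):
            # Tile center in Å, measured from the image center.
            cx = ((ix + 0.5) * tile_size_px - half) * pixel_size
            cy = ((iy + 0.5) * tile_size_px - half) * pixel_size
            result.append((cx, cy, local_defocus(
                cx, cy, defocus_center, tilt_axis_deg, tilt_angle_deg)))
    return result
```

      In a fit of this kind, each tile's power spectrum would be compared with a CTF computed at its own `local_defocus` value, and the tilt axis and tilt angle that maximize the overall agreement would be reported.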

      CTFFIND5 represents the first CTF determination tool that considers the thickness-related modulation envelope of the CTF first described by McMullan et al. (2015) and experimentally confirmed by Tichelaar et al. (2020). To this end, CTFFIND5 uses a new CTF model that takes the sample thickness into account. CTFFIND5 thus provides more accurate CTF estimation and, furthermore, gives an estimation of the sample thickness, which may be a valuable resource to judge the potential for high resolution. To evaluate the accuracy of thickness estimation in CTFFIND5, the authors use the Lambert-Beer law on energy-filtered data and also tomographic data, thus demonstrating that the estimates are reasonable for images with exposure around 30 e/Å². While consideration of sample thickness in CTF determination sounds ideally suited for cryoET, practical application under the standard acquisition protocols in cryoET (exposure of 3-5 e/Å² per image) is still limited. In this regard, the authors are honest in the conclusions and clearly identify the areas where thickness-aware CTF determination will be valuable at present: e.g. in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages.

      In conclusion, the manuscript introduces novel methods inside CTFFIND5 that improve CTF estimation, namely acquisition geometry and sample thickness. The evaluation demonstrates the performance of the new tool, with fairly accurate estimates of tilt axis, tilt angle and sample thickness and improved CTF estimation. The manuscript critically defines the current range of application of the new methods in cryoEM.

      Reviewer #2 (Public Review):

      Summary:

      This paper describes the latest version of the most popular program for CTF estimation for cryo-EM images: CTFFIND5. New features in CTFFIND5 are the estimation of tilt geometry, including for samples, like FIB-milled lamellae, that are pre-tilted along a different axis than the tilt axis of the tomographic experiment, plus the estimation of sample thickness from the expanded CTF model described by McMullan et al (2015). The results convincingly show the added value of the program for thicker and tilted images, such as are common in modern cryo-ET experiments. The program will therefore have a considerable impact on the field.

      I have only minor suggestions for improvement below:

      Abstract: "[CTF estimation] has been one of the key aspects of the resolution revolution"-> This is a bit over the top. Not much changed in the actual algorithms for CTF estimation during the resolution revolution.

      We have removed this statement in the abstract.

      L34: "These parameters" -> Cs is typically given, only defocus (and if relevant phase shift) are estimated.

      We have modified the introduction to reflect this. Page 3, L30-35

      L110-116: The text is ambiguous: are rotations defined clockwise or counter-clockwise? It would be good to explicitly state what subsequent rotations, in which directions and around which axes this transformation matrix (and the input/output angles in CTFFIND5) correspond to.

      Thank you for pointing this out. We have revised the Methods section (Page 4, L57-61) to explicitly define the convention for the tilt axis and tilt angle. We have also modified Fig. 1b to illustrate our convention for the tilt axis.

      L129-130: As a suggestion: it would be relatively easy, and possibly beneficial to the user, to implement a high-resolution limit that varies with the accumulated dose on the sample. One example of this exists in the tomography pipeline of RELION-5.

      We appreciate the suggestion. However, since CTFFIND5 currently has no concept of a tilt-series and treats every micrograph independently, this would not be trivial to implement. As detailed below, CTFFIND5 in its current form is not targeted toward tomography processing, but its features might be useful for its use in pipelines for tomography processing, such as RELION-5. We made this more explicit in the conclusion section. Page 16 L390-399

      Substituting Eq (7) into Eq (6) yields ksi=pi, which cannot be true. If t is the sample thickness, then how can this be a function of the frequency g of the first node of the CTF function? The former is a feature of the sample, the latter is a parameter of the optical system. This needs correction.

      We have rewritten the text describing equations 7 and 6 to avoid this confusion (Page 7, L146-153). The reviewer is right that inserting Eq. 7 into Eq. 6 yields ksi=pi; in fact, Eq. 7 is derived from Eq. 6 by substituting ksi=pi, since this describes the condition for the first node. Also, in this context, nodes in the CTF function refer to the places where the term sinc(ksi) becomes zero and therefore the CTF is apparently "flat". The frequency at which this occurs is sample-thickness dependent. As explained below, the previous version of our manuscript did not point out the difference between the first zero and first node in the power spectrum. We have amended Fig. 3a to make this difference clearer.
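
      The relationship between sample thickness and the first node can be sketched numerically. The equations themselves are not reproduced in this response, so the sketch below assumes that Eq. 6 has the form ksi = pi * lambda * g^2 * t (the form obtained by averaging the CTF over a slab of thickness t) and that Eq. 7 follows from setting ksi = pi. The function names are hypothetical.

```python
import math

def electron_wavelength(voltage_v):
    """Relativistic electron wavelength in Å for an accelerating voltage in volts."""
    return 12.2643 / math.sqrt(voltage_v + 0.97845e-6 * voltage_v ** 2)

def xi(wavelength, g, thickness):
    """Assumed form of Eq. 6: ksi = pi * lambda * g^2 * t (Å, 1/Å, Å)."""
    return math.pi * wavelength * g ** 2 * thickness

def thickness_envelope(wavelength, g, thickness):
    """sinc(ksi) modulation of the Thon ring amplitude (unnormalized sinc)."""
    x = xi(wavelength, g, thickness)
    return 1.0 if x == 0 else math.sin(x) / x

def first_node_frequency(wavelength, thickness):
    """Spatial frequency (1/Å) where ksi = pi, i.e. the first zero of sinc(ksi)."""
    return 1.0 / math.sqrt(wavelength * thickness)

def thickness_from_first_node(wavelength, g_node):
    """Assumed form of Eq. 7: t = 1 / (lambda * g^2) at the first node."""
    return 1.0 / (wavelength * g_node ** 2)

lam = electron_wavelength(300e3)          # about 0.0197 Å at 300 kV
g1 = first_node_frequency(lam, 2000.0)    # 2000 Å thick sample
print(f"first node at {1.0 / g1:.2f} Å resolution")
```

      Under these assumptions, a 2000 Å thick sample imaged at 300 kV has its first node at roughly 6.3 Å, and a thinner sample pushes the node to higher resolution. This makes concrete why the node frequency, although measured in the power spectrum, depends on a property of the sample.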

      Reviewer #3 (Public Review):

      In this manuscript, the authors detail improvements in the core CTFFIND (CTFFIND5 as implemented in cisTEM) algorithm that better estimates CTF parameters from tilted micrographs and those that exhibit signal attenuation due to ice thickness. These improvements typically yield more accurate CTF values that better represent the data. Although some of the improvements result in slower calculations per micrograph, these can be easily overcome through parallelization.

      There are some concerns outlined below that would benefit from further evaluation by the authors.

      For the examples shown in Figure 3b, given the small differences in estimated defocus1 and 2, what type of improvements would be expected in the reconstructed tomograms? Do such improvements in estimates manifest in better tilt-series reconstruction?

      As explained in our preface, we do not believe that these differences would manifest as improvements during tilt-series reconstruction or create any meaningful differences, even when tomograms are reconstructed with CTF correction. They might become meaningful during subtomogram averaging, but subtomograms are usually corrected using per-particle CTF estimation, similar to single-particle processing. We have included a new paragraph in the discussion to describe potential benefits of CTFFIND5 for cryo-tomography, Page 16 L390-399.

      Similarly, the data shown in Figure 3C shows minimal improvements in the CTF resolution estimate (e.g., 4.3 versus 4.2 Å), but exhibited several hundred Å difference in defocus values. How do such differences impact downstream processing? Is such a difference overcome by per-particle (local) CTF refinements (like the authors mention in the discussion, see below)?

      The difference in the defocus estimate (~600 Å) is substantially smaller than the thickness of the sample (2000 Å). Hence both estimates may be valid, depending on which particles inside the sample are considered. Particles with larger defocus errors could certainly be corrected by per-particle CTF refinement as long as the search range is chosen to be large enough. The main benefit of using CTFFIND5 is information for the user regarding the sample thickness to set the defocus search range appropriately.

      At which point does the thickness of the specimen preclude the ice thickness modulation to be included for "accurate" estimate? 500Å? 1000Å? 2000Å? Based on the data shown in Figure 3B, as high as 969 Å thick specimens benefit moderately (4.6 versus 3.4 Å fit estimate), but perhaps not significantly, from the ice thickness estimation. Considering the increased computational time for ice thickness estimation, such an estimate of when to incorporate for single-particle workflows would be beneficial.

      As explained in our preface, the main benefit for single-particle workflows will be sample tilt estimation. This will provide more accurate per-particle defocus estimates, compared to estimates that do not take the tilt into account. For single-particle samples, the ice thickness in holes is probably more efficiently monitored using the Beer-Lambert law.
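
      For reference, the Beer-Lambert estimate mentioned above can be written as a short sketch. It assumes the common form t = MFP * ln(I_unfiltered / I_filtered), where MFP is the inelastic mean free path of electrons in ice. The function name is hypothetical, and the mean free path value in the example is a placeholder rather than a calibrated number from this work.

```python
import math

def thickness_beer_lambert(intensity_unfiltered, intensity_filtered, mean_free_path):
    """Estimate ice thickness from the fraction of electrons passing an energy filter.

    The result has the same unit as mean_free_path. Both intensities must be
    positive and measured over the same area with the same exposure.
    """
    if intensity_unfiltered <= 0 or intensity_filtered <= 0:
        raise ValueError("intensities must be positive")
    return mean_free_path * math.log(intensity_unfiltered / intensity_filtered)

# Placeholder mean free path (nm); a real value must be calibrated per microscope.
example_mfp_nm = 300.0
print(thickness_beer_lambert(100.0, 80.0, example_mfp_nm))
```

      Because this needs only two mean intensities per hole, it is cheap compared with fitting a thickness-dependent CTF model to every micrograph.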

      It would seem that this statement could be evaluated herein: "the analysis of images of purified samples recorded at lower acceleration voltages, e.g., 100 keV (McMullan et al., 2023), may also benefit since thickness-dependent CTF modulations will appear at lower resolution with longer electron wavelengths". There are numerous examples of 300kV, 200kV, and 100kV EMPIAR datasets to be compared and recommendations would be welcomed.

      Publicly available datasets recorded at 100kV and 200kV were collected in very thin ice, making it difficult to demonstrate the stated benefits. We have removed this statement.

      Although logical, this statement is not supported by the data presented in this manuscript: "The improvements of CTFFIND5 will provide better starting values for this refinement, yielding better overall CTF estimation and recovery of high-resolution information during 3D reconstruction."

      We have revised this statement and now explain that the sample tilt information will provide more accurate per-particle defocus estimates, compared to estimates that do not take the tilt into account, Page 17, L400-409. We did not investigate how this will affect downstream processing results.

      Moreso, the lack of single-particle data evaluation does present a concern. Naively, these improvements would benefit all cryoEM data, regardless of modality.

      We agree with the reviewer that all cryoEM modalities should benefit from more accurate defocus value estimates and have amended our concluding statement. However, how improved defocus values will benefit downstream processing results will depend on the processing pipeline, which includes various points of user input and data-dependent choices. We have therefore limited our analysis to the outputs of CTFFIND5.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) CTFFIND5 in cryo-ET

      (1.1) CTFFIND4 is prone to unreliable CTF estimates at high tilts in cryoET, a situation that can be identified by high variability or 'unstable' estimates as a function of the tilt angle. Prof. Mastronarde recently illustrated this situation in his article JSB 216:108057, 2024 (Fig. 7). Therefore, the authors could add results to show whether the improvements to tilt estimation introduced in CTFFIND5 overcome this problem. So, in addition to the estimation of tilt angle and tilt axis in Figure 2, the estimated defocus could also be shown.

      We have worked with Prof. Mastronarde to help him use CTFFIND as a tool in his cryoET processing pipeline. Mastronarde chose CTFFIND because it contains algorithms and architecture that he could optimize for his purposes. CTFFIND5 is currently lacking the concept of a tilt series and can therefore not take advantage of the additional information that comes with tilt series. Our own applications for CTFFIND5 currently do not include tomography, and our results presented in Fig. 2 were obtained for validation of the tilt estimation feature. We did not attempt to duplicate Mastronarde’s optimization for reliable tilt series processing.

      Figure 2b of this manuscript already suggests that CTFFIND5 may exhibit some variability of defocus estimates at high tilts (in view of the variability of tilt axis angle). A strategy used in IMOD and TOMOCTF is to consider the tiles of a group of consecutive images (typically 35; especially at high tilts) to add more signal to the average spectrum, thus providing more reliable estimates (illustrated in Mastronarde's article JSB 216:108057, 2024, Fig. 8). Do the authors think that CTFFIND5 might include a strategy like this for cryoET tilt-series?

      We currently do not have plans to develop CTFFIND5 as a tool for tomography as there are already other excellent tools available, some of them based on CTFFIND’s basic algorithms (see previous comment).

      (1.2) In cryoET, the CTF is often determined on the aligned tilt-series, with the tilt axis typically running along the Y axis. Has CTFFIND5 got the option to exclude estimation of the tilt geometry (tilt angle and/or axis) and, instead, take tilt geometry directly from the alignment and/or from the microscope? This would significantly speed up determination of the CTF (in 1-2 seconds per image, according to Table 2) while still taking advantage of all power spectra in tilted images (as described in their tilt estimation algorithm) for improved CTF estimation. This strategy would be similar to what is done in Bsoft and IMOD.

      This is an excellent idea and we may implement this in an updated version. The current version is primarily meant for lamellae and single-particle samples where we usually have a single tilt in an unknown direction. For these cases, the suggested feature will have less benefit. 

      Thus, I suggest that the authors should also include results comparing CTF estimation in aligned tilt-series with CTFFIND4 and with CTFFIND5 (with no tilt estimation but indeed taking the tilt information from the alignment or the microscope into account). The results would show that CTFFIND5 is more robust than CTFFIND4, especially at high tilts.

      Thank you for this suggestion. We are now showing a comparison of defocus estimates from CTFFIND4 and CTFFIND5 in Fig. 2. Indeed, in one case CTFFIND5 seems to report more robust defocus values at high tilt.

      (1.3) The newer improvements in CTFFIND5 seem to be especially tailored to cryoET. The cryoET community will be highly attracted by these improvements. However, the current standard acquisition protocols (exposure of 3-5 e/Å² per image, tilts up to 60 degrees, etc) limit their full exploitation, particularly the thickness-aware CTF determination. I believe that adding a paragraph exclusively focused on cryoET and describing the potential benefits from CTFFIND5 and their limitations could enrich the Conclusion section. In this paragraph, the authors could highlight the great benefits from the tilt-aware CTF estimation. They could also discuss the current standard acquisition protocols (e.g. exposure 3-5 e/Å² per image, nominal defocus 3-5 microns, cellular thickness from 150 nm up to 200-300 nm that, at a tilt of 60 degrees, become 300 nm up to 400-600 nm) and their implications for the potential benefit from the improvements available in CTFFIND5.
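
      The thickness figures quoted in this comment follow from simple geometry: for a flat slab, the path length along the beam grows as 1/cos(tilt). A minimal sketch, with a hypothetical function name and assuming an ideal plane-parallel sample:

```python
import math

def projected_thickness(thickness_nm, tilt_deg):
    """Path length through a flat slab of given thickness at a given stage tilt."""
    return thickness_nm / math.cos(math.radians(tilt_deg))

for t in (150.0, 200.0, 300.0):
    row = ", ".join(f"{tilt:>2d}°: {projected_thickness(t, tilt):6.1f} nm"
                    for tilt in (0, 30, 45, 60))
    print(f"{t:5.0f} nm slab -> {row}")
```

      At 60 degrees the factor is 2, which reproduces the doubling from 150 nm to 300 nm and from 200-300 nm to 400-600 nm stated above.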

      This reviewer is clearly excited about the potential application of CTFFIND5 in cryoET. We are sorry that we are currently not developing CTFFIND5 in this direction.

      (1.4) Apologies for insisting on cryoET in the previous points. I am just trying to suggest ideas to make CTFFIND5 even more helpful in cryoET. You can consider them now, or for a future version of the software, or just ignore them.

      Thanks for your suggestions. Since there is clearly demand for tools to process tomographic tilt series, we will keep these suggestions in mind for the future development of CTFFIND.

      (2) Tilt estimation

      (2.1) Page 4. Tiles for the initial steps in tilt estimation are of size 128x128. At which point are tiles of larger size (e.g. 512x512) used? Please, define.

      Thank you for pointing out this lack of clarity. For the tilt estimation, we used a tile size of 128 x 128, which is hard-coded in our program, as mentioned in line 68 on page 4. For generating the final power spectrum, we usually use a size of 512 x 512. This tile size can be defined by the user when running the program. We have now clarified this on Page 4, L74-76.

      (2.2) Page 6 and/or page 11: evaluation of tilt estimation with tilt-series.

      Please indicate the acquisition details of the tilt-series used for the evaluation, especially the exposure per image. This information is neither available in this manuscript nor in Elferich et al., 2022.

      Please, add these acquisition details similarly to page 9 in this manuscript (evaluation of sample thickness estimation using tomography): pixel size, exposure per image and total exposure, number of images, tilt range and interval

      The same tilt-series were used to verify tilt estimation and sample thickness. We have revised the Methods section to make this clear on Page 5, L98-105 and Page 10, L202.

      (2.3) Page 10. Section Results. Subsection Tilt estimation.

      The authors use "defocus correction" to refer to their method for scaling the power spectra. "Defocus correction" might perhaps be a misleading term. In contrast, in page 4 the authors use the term "tilt correction". Please, revise and make it consistent throughout the manuscript.

      We agree and now use “tilt correction” throughout the manuscript.

      (2.4) Legend of Figure 2.

      Please add what the red dashed curve represents. Also, please note there might be an error in the estimated stage tilt axis angle: the legend states "171.8" where in the main text it is "178.2" (apparently, the latter is the correct one).

      Thank you for pointing this out. We have modified the legend and changed the number in the legend to 178.2°.

      (3) Thickness estimation

      (3.1) Line 141, page 7. The sentence reads: "The modulation of the CTF due to sample thickness t is described by the function E (current Equation 6), "  I believe that the modulation envelope of the CTF due to sample thickness is not really E (current Equation 6), but the function sinc(E). Please, revise.

      We have revised the manuscript as advised, Page 7, L148.

      (3.2) Line 148, page 7. The sentence reads "an estimate of the frequency g of the first node of the CTF_t function "

      The concept of 'node' was introduced by Tichelaar et al. (2020). The authors should not assume that this concept is familiar to the readership. So, it is suggested that the authors should introduce this concept in this section. For instance, just after Equation 6 they could add a sentence like this: "This sinc modulation envelope increasingly attenuates the amplitude of the Thon rings with increasing spatial frequencies in an oscillatory fashion, with locations where the amplitude is zero known as nodes (Tichelaar et al., 2020)."

      Thank you for this suggestion. We have revised the manuscript accordingly (Page 7, L151-156) and also marked the position of the first node in Fig. 3a.
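      The relationship between sample thickness and the first node can be illustrated numerically. The sketch below is not taken from CTFFIND5; it assumes the thickness envelope has the form sinc(πλg²t) used by Tichelaar et al. (2020), since Equation 6 is not reproduced in this response. The function names, the 300 kV accelerating voltage, and the 2000 Å thickness are hypothetical choices for illustration only.

```python
import math


def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in Angstrom for an accelerating voltage in kV."""
    volts = voltage_kv * 1e3
    return 12.2643 / math.sqrt(volts + 0.97845e-6 * volts ** 2)


def sinc(x):
    """Unnormalised sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def thickness_envelope(g, thickness, wavelength):
    """Assumed envelope sinc(pi * lambda * g^2 * t); g in 1/Angstrom, t in Angstrom."""
    return sinc(math.pi * wavelength * g ** 2 * thickness)


def first_node_frequency(thickness, wavelength):
    """Spatial frequency of the first zero of the assumed envelope."""
    return 1.0 / math.sqrt(wavelength * thickness)


def thickness_from_first_node(g_node, wavelength):
    """Invert the node condition to estimate thickness from the first node."""
    return 1.0 / (wavelength * g_node ** 2)


# Hypothetical example values: 300 kV microscope, 200 nm (2000 Angstrom) sample.
wavelength = electron_wavelength_angstrom(300.0)
thickness = 2000.0
g_node = first_node_frequency(thickness, wavelength)
print(f"wavelength: {wavelength:.5f} A")
print(f"first node at {g_node:.4f} 1/A (about {1.0 / g_node:.2f} A resolution)")
print(f"envelope at node: {thickness_envelope(g_node, thickness, wavelength):.2e}")
```

      Under this assumed form, a thicker sample moves the first node to lower spatial frequency, which is why the node position can serve as a starting estimate for the thickness.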

      (3.3) Line 154, page 8: A citation is lacking: "(corrected for astigmatism, as described in )". Perhaps the authors refer to the EPA (EquiPhase Averaging) method introduced by Zhang, JSB 193:1-12, 2016, 10.1016/j.jsb.2015.11.003.

      Thanks for spotting this omission. We have added the appropriate reference.

      (3.4) Figure 3.

      (3.4.1) Perhaps, the EPA (EquiPhase Averaging) method is used to reduce the 2D CTF to 1D curves, as represented in Figure 3b and 3c. Please, mention this in the legend of the figure or in the main text referring to Figure 3. The same might apply to Figure 1c.

      Thanks for spotting this omission. We have clarified that this is indeed an EPA in the figure legends.

      (3.4.2) Please indicate what the colored curves represent in 3b and 3c: the fitted CTF model (dashed red) and the EPA or astigmatism-corrected radial average of the power spectrum (solid black)?

      Thanks for spotting this omission. We have added descriptions of the colored lines in these plots (red = modeled CTF, blue = goodness of fit).

      (3.4.3) Please note that the power spectrum (solid black curves in Figure 3b and 3c) does not look the same in the top and bottom panels: Without thickness estimation (top panels), the power spectrum is in the range [0,1] in Y, as expected. However, with thickness estimation (bottom panels), the power spectrum seems to have undergone a frequency-dependent transformation (a rescaling or something that makes the power spectrum oscillate around 0.5 in Y). This transformation of the power spectrum resembles the thickness-induced sinc modulation of the CTF and seems to be appropriate to better fit the new thickness-aware CTF_t model in CTFFIND5 to the (transformed) power spectrum. However, this transformation of the power spectrum is not mentioned in the manuscript at all. Instead, according to the main text (page 8), the fitting method is based on the cross-correlation between the new CTF model and the power spectrum, so I was expecting to see the same power spectrum black curve in the top and bottom panels. Please clarify.

      Indeed, CTFFIND5 displays the power spectrum differently after thickness estimation. We have revised the methods to explain this (page 8, L178-181). The reviewer is also correct that the 1D line plots of the Thon ring patterns in Fig. 3b and 3c are not identical. These 1D plots are generated from the 2D plots according to the fitted CTF, which is needed to follow the astigmatic rings and avoid blurring of the oscillations in the radial average. This means that different CTF fits will also result in somewhat different 1D plots. However, these differences only affect the 1D EPA plots shown to the user. The actual fitting is performed against the same 2D spectra.

      (3.4.4) Line 319, Page 14. "A linear fit revealed .." It would be good to add a line with the linear fit in Figure 5.

      Agreed. The revised Fig. 5 now shows a line for the linear fit.

      (3.5) New CTF Model

      It is not clear from the text if the new CTF_t model is used at all times in CTFFIND5 or only when the user requests thickness estimation. Related to this, if the user requests both tilt estimation and thickness estimation, how is the CTF estimation process carried out in CTFFIND5? Are tilt and thickness estimated at the same time, or one after the other (i.e. first the tilt is estimated, followed by thickness estimation)? Please clarify.

      The new CTF_t model is only used when the user requests thickness estimation. When both tilt estimation and thickness estimation are requested, the tilt is estimated first and the corrected power spectrum is then fitted using the CTF_t model. We have revised the Methods section to explain this better, Page 8, L158-159.

      (4) Pages 14-15. Section "CTF estimation and correction assists "

      This section just shows that correction of a highly underfocused image for the CTF with phase flipping or a Wiener filter reduces the CTF-induced fringes. I do not really understand the inclusion of this section in the manuscript. There is no contribution related to CTFFIND5.

      The ability to apply a CTF correction to the input image according to Tegunov & Cramer is a new feature of apply_ctf, a program included with cisTEM. We think that this section fits into the theme of CTFFIND5 because the correction adds valuable information about the samples, such as FIB-milled lamellae.

      If the authors prefer to keep this section, then please take the following points into account:

      (4.1) Figure 6b: This is the only time that the term "EPA" (EquiPhase Averaging, I guess) is used in the manuscript. Please, spell it out somewhere in the manuscript, define what it means and add a proper citation, if convenient. This point is related to point 3.3 above.

      We have added the appropriate reference and defined EPA in the methods section as indicated in the reply to point 3.3.

      (4.2) Figure 6d. The contrast of this image is poor. Please, increase the contrast (to be similar to Figure 6c) so that the details can be better discerned. The image also shows a grainy texture, likely artefacts from the Wiener filter due to excessive amplification. Maybe the 'strength parameter' S of the deconvolution Wiener filter (Tegunov & Cramer, 2019) should be tuned down or the 'fall-off parameter' F tuned up to try to attenuate these artefacts.

      Agreed. The revised figure shows panel d with increased contrast with the custom fall-off parameter set to 1.3 and the custom strength parameter set to 0.7.

      (5) CTFFIND5 runtimes

      Table 2 shows that estimation of tilt increases the runtime up to 39 s in an image of 4070x2892 and to 208 s in one of 2880x2046. There is a significant difference between these two cases (39 s vs. 208 s) and the first image is much larger than the second. Why does CTFFIND5 on the smaller image take so long compared to the larger image?

      During tilt estimation, the images are binned to a pixel size of 5 Å. This causes micrograph 1 to be substantially smaller (in pixels) than micrographs 2 and 3, resulting in the faster runtime.
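      The effect of binning on image size can be sketched as follows. The pixel sizes of the micrographs are not stated in this response, so the values below (0.8 Å and 2.0 Å) are hypothetical, and the helper name is an illustrative assumption rather than part of CTFFIND5. The sketch only shows how an image with more pixels but a finer pixel size can end up smaller once both are resampled to 5 Å per pixel.

```python
def binned_shape(width_px, height_px, pixel_size_A, target_pixel_size_A=5.0):
    """Image dimensions after resampling to the target pixel size."""
    scale = pixel_size_A / target_pixel_size_A
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# Hypothetical pixel sizes; only the image dimensions come from the reviewer's comment.
micrograph_a = binned_shape(4070, 2892, pixel_size_A=0.8)
micrograph_b = binned_shape(2880, 2046, pixel_size_A=2.0)

for name, shape in (("4070x2892", micrograph_a), ("2880x2046", micrograph_b)):
    print(f"{name} -> {shape[0]}x{shape[1]} after binning "
          f"({shape[0] * shape[1]} pixels)")
```

      With these assumed pixel sizes, the micrograph that is larger before binning contains fewer pixels afterwards, which is consistent with the shorter runtime the authors describe.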

      (6) Conclusions

      (6.1) In the Conclusion section, the authors could elaborate a bit the insights about the sample quality provided by CTFFIND5. This is stated in the title of the manuscript, but it was hardly mentioned in the manuscript.

      We have revised the conclusion to make this clearer (Page 16, L389-396). CTFFIND5 helps in estimating sample quality since (1) the sample thickness is an important determinant in the amount of high-resolution signal in a micrograph and (2) the estimated fit-resolution reflects more accurately the amount of signal present in a micrograph after tilt and sample thickness have been taken into account.

      (6.2) The authors nicely identify and describe the applications where thickness-aware CTF determination will be valuable: in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages. Perhaps, CTFFIND5 will also be of great interest for single particle cryoEM of thick specimens (e.g. capsid of large viruses with diameter in the range 120-200 nm such as PBCV-1 or HSV-1).

      Agreed. We have added this case to our Conclusions. (Fig. 3d)

      (7) Typographical errors:

      line 161, page 8. "1.5 time" should be "1.5 times"

      lines 185-191. All exposures are given in 'electrons/Angstrom', not in 'electrons/square Angstrom'

      line 206, page 10. With "slides" the authors seem to mean "slices"

      line 338, page 14: "describeD by Tegunov"

      line 349, page 15. "power spectra"

      lines 366 and 368, page 15: Note that Square Angstrom is written as "A2". Put "2" with superscript.

      Thank you for pointing out these errors. They have been corrected.

      (8) References:

      Reference: Lucas et al., eLife 10 e68946. Year is lacking. Add year: 2021.

      Reference: Yan et al. 2015 cited in line 169, page 8, does not appear in Bibliography. The authors may mean: Yan et al. 2015 JSB 192:287-296, 2015  

      It would be good to cite Bsoft, as it has a procedure similar to tilt-corrected CTF estimation: Heymann, Protein Science, 2021,  

      Thank you for carefully checking the cited references. We have revised the manuscript as suggested.

      Reviewer #2 (Recommendations For The Authors):

      I have only minor suggestions for improvement below:

      L218: "these option"

      Corrected

      L243: "chevron-shape" -> V-shape would be more accessible language for non-native speakers.

      Changed

      L281: "Based on these results we conclude that CTFFIND5 will provide more accurate CTF parameters" -> Given that the maximum resolutions of the fits by the old model and the new model are nearly the same, how big would the actual advantage of the new model be for subsequent sub-tomogram averaging?

      Please see our response above, Reviewer #3 (Public Review), 

      L376: The correct reference for RELION per-particle CTF estimation is Zivanov et al, (2018) [https://elifesciences.org/articles/42166]. Also, the cryoSPARC paper referenced does not describe per-particle CTF estimation and should thus be removed from this context.

      Thanks for pointing out these mistakes, which we have now corrected. We have chosen to keep the citation for CryoSPARC to reference the general software, but have added Zivanov et al. 2020 as suggested by the CryoSPARC website.

      Reviewer #3 (Recommendations For The Authors):

      Minor:

      Figure 1A legend - authors mention boxes but only 1 box is shown.

      Thank you for pointing this out. For visual clarity we decided to only show one box. We have corrected the legend.

      Figure 1B - it would be nice if the boxes that contributed to the power spectra were mapped on Figure 1A

      The shown power spectra are not actual data. Instead, we show power spectra with exaggerated defocus differences for visual clarity. We have revised the figure legends to make this clear. 

      The Y-axis legends in Figure 2 are not aligned vertically

      Corrected

      Figure 3A - CTFFIND4 is missing an "I"

      Corrected

      Figure 3 - Y-axis legends are not aligned vertically

      Corrected

      Page 16, line 376, Relion should be RELION

      We have revised the manuscript as advised.

      Typo in equation 5, sinc versus sin?

      “sinc” is correct here, since this is a thickness-dependent modulation of the CTF.

      Lambert-Beer's, Lambert-Beer are used variably but curious if Beer-Lambert should be used.

      We have revised the manuscript as advised.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study by Zhou, Wang, and colleagues, the authors utilize biventricular electromechanical simulations to illustrate how different degrees of ionic remodeling can contribute to different ECG morphologies that are observed in either acute or chronic post-myocardial infarction (MI) patients. Interestingly, the simulations show that abnormal ECG phenotypes - associated with a higher risk of sudden cardiac death - are predicted to have almost no correspondence with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia.

      Strengths:

      The numerical simulations are state-of-the-art, integrating detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The simulation provides mechanistic interpretation, down to the level of single-cell ionic current remodeling, for different types of ECG morphologies observed in post-MI patients. Collectively, these results demonstrate compelling and significant evidence for the need to incorporate additional risk factors for assessing post-MI patients.

      Weaknesses:

      The study is rigorous and well-performed. However, some aspects of the methodology could be clearer, and the authors could also address some aspects of the robustness of the results. Specifically, does variability in ionic currents inherent in different patients, or the location/size of the infarct and surrounding remodeled tissue impact the presentation of these ECG morphologies?

      We thank the reviewer for their considered evaluation. In response to the reviewer’s comments regarding variability in ionic currents, we have added to the paper simulations using a population of n=17 models with variability in ionic conductances in the baseline ToR-ORd model, to show the effect of such variation on the post-MI ECG presentation in acute and chronic conditions. This is now described in the Methods [lines 140, 158-161, 242-244, 245-246, 261-263] and shown in the Methods Figures 1A and 1B. The ECG results using this population of models are shown in Figure 2C and described in [lines 333-335], and the pressure-volume results using the population of models are shown in Figures 5A and 5B and described in [lines 417-418, 442-444, 448-450]. The population of models showed patterns in both the ECG and LVEF that were consistent with the baseline model; this is discussed in [lines 563-564, 688-690].

      Regarding the effect of scar location and size on the ECG, we refer the reader and reviewer to a related paper where this is explored in depth using a formal sensitivity analysis and deep learning inference (https://pubmed.ncbi.nlm.nih.gov/38373128/). That paper does better justice to this question than we could here without overloading this paper with additional investigations. We include a reference to it in the discussion section [lines 694-695].

      Reviewer #2 (Public Review):

      Summary:

      The authors constructed multi-scale modeling and simulation methods to investigate the electrical and mechanical properties of acute and chronic myocardial infarction (MI). They simulated three acute MI conditions and two chronic MI conditions. They showed that these conditions gave rise to distinct ECG characteristics that have been seen in clinical settings. They showed that the post-MI remodeling reduced ejection fraction up to 10% due to weaker calcium current or SR calcium uptake, but the reduction of ejection fraction is not sensitive to remodeling of the repolarization heterogeneities.

      Strengths:

      The major strength of this study is the construction of computer modeling that simulates both electrical behavior and mechanical behavior for post-MI remodeling. The links of different heterogeneities due to MI remodeling to different ECG characteristics provide some useful information for understanding complex clinical problems.

      Weaknesses:

      The rationale (e.g., physiological or medical bases) for choosing the 3 acute MI and 2 chronic MI settings is not clear. Although the authors presented a huge number of simulation data, in particular in the supplemental materials, it is not clearly stated what novel findings or mechanistic insights this study gained beyond the current understanding of the problem.

      We thank the reviewer for their careful evaluations of our work. The justification for selecting the 3 acute MI and 2 chronic MI states is based on clinical and experimental reports, as summarised in the Methods section [lines 245-247, 252-256, 264-266].  We have also highlighted the key novelty and significance of the study in the Discussion [lines 579-582].

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) This was clarified very late in the Discussion, but for most of the paper, I was unclear if heart geometry was the same for all simulations. Presumably, this includes the size and location of the infarct, BZ, and RZ. It would be helpful to clarify this in the Methods.

      This has been clarified in the first paragraph of the Methods section [lines 142-145].

      (2) On lines 224-226, the Methods refers to implementing several population members from the ToR-ORd model (in addition to the baseline) into the biventricular EM simulations. Is this in reference to the simulations shown in Figures 6 and 7, or different simulations? Please clarify.

      We now randomly select 17 of the 245 cell models in the population to be embedded in ventricular simulations, to produce a ventricular population of models. This allows us to explore the effect that physiological variability in the baseline ionic conductances has on the phenotypic representation of ionic remodellings in the ECG and LVEF. An explanation of this can be found in the Methods section [lines 241-244].

      For Figures 6 and 7, we selected two arrhythmic cell models from the n=245 population of cell models to be embedded into two ventricular simulations to demonstrate the arrhythmic potential of the cellular model at ventricular scale. This has been clarified in Methods [lines 269-271].

      Additionally, for the cases where a population member is used, are all regions of the ventricles "scaled" in the same manner, or were only the properties of the particular region drawn from the population modified relative to baseline (e.g., mid-myocardial cells in Figure 6)?

      The cells were embedded according to transmural heterogeneity in the remote zone for Figures 6 and 7. This has been clarified in the Methods [lines 271-273].

      (3) Interestingly, the study finds the ionic remodeling in different peri-infarct regions to be most critical in the ECG phenotype, which at least strongly suggests that inherent intra-patient variability in ion channel expression could also be critical.

      This is related to the comment on the use of population members. If the authors utilized one of the ventricular myocyte population members as the 'reference' (instead of the baseline ToR-ORd parameters) and applied the same types of remodeling as in Figures 3 and 4, would they expect the same ECG morphologies?

      We have now performed this test and selected 17 cell models from the population to create a ventricular population of models. On top of this ventricular population, we have applied the remodellings, and showed that the simulated ECG morphologies were mostly consistent across these 20 members (Figure 2C).

      (4) Related, do the authors expect that the location and/or size of the infarct and peri-infarct regions would impact the different ECG morphologies?

      Regarding the effect of scar location and size on the ECG, we refer the reader and reviewer to a related paper where this is explored in depth using a formal sensitivity analysis and deep learning inference (https://pubmed.ncbi.nlm.nih.gov/38373128/). We feel this is better able to do justice to this question rather than overloading this paper with additional investigations. We include a reference to this paper in the discussion section [lines 694-695].

      Reviewer #2 (Recommendations For The Authors):

      (1) Although the authors listed the parameters and cited the papers for the origins of the parameter changes in SM4 and table S4, it should be summarized in the methods section what are the major changes or differences for the 5 conditions. Furthermore, it should be stated what is the rationale for choosing these conditions. Are these choices based on clinical classifications or experimental conditions?

      The major differences between the 5 conditions have now been summarised in the Methods [lines 252-256, 264-266]. These remodellings have been collated from a range of experimental measurements in both human and animal data, which are summarised in Table S4. This has been clarified in Methods [lines 245-247].

      (2) Figure 3C and Figure 4C do not add any additional information beyond the conductance changes listed in Table 4, and I'd suggest removing them from the figures. On the other hand, it took me some time to look at Table 4 to figure out the corresponding changes. As commented above, the remodeling changes should be summarized in the main text to help reading.

      Figure 3C and 4C provide a visual explanation of the ionic remodellings in these conditions to echo the added descriptions in the text [lines 252-256, 264-266]. For this reason, we have elected to keep those figures in the manuscript.

      (3) The authors presented a large amount of data in Supplemental Materials; some may be unnecessary and some are difficult to follow. For example: 1) There is a lot of data in Table S6, but only a brief mention in the main text and the Table S6 legend. A summary of the data is needed for the readers to understand the properties of the different conditions, instead of letting the readers figure them out from the table. The same should be done for other tables and figures. There are some format issues for the tables, which mess up some of the numbers and text. 2) The data shown in Figures S25-29 provide almost no new information beyond the well-known effects of ionic currents on EAD genesis, i.e., EADs are promoted by inward currents and suppressed by outward currents. The data for alternans (Figures S18-22) are a little more complex than the cases for EADs, but I think that they can be simplified.

      Thanks for the suggestions. We have now extracted the key information from Tables S6-S9 and summarized it in the captions. We have also fixed the layout of the tables in this revision. The supplementary sections on alternans and EADs are simplified, with the key parameters related to these proarrhythmic phenomena summarized in tables instead of showing all boxplots of parameter distributions (Tables S10 and S11).

      (4) The authors showed two mechanisms of alternans: EAD-driven and Ca-driven alternans in chronic MI. There are several distinct mechanisms of alternans including EAD-induced alternans (see the recent review by Qu and Weiss, Circ Res 132, 127(2023)). Theoretically, calcium alternans can also induce EAD alternans under proper conditions, can you rule out that the EAD alternans are not due to Ca alternans? The results in Fig.7D may say the opposite. There are some chicken-or-egg issues here.

      In Figure 7D, we showed that the epicardial cell type (blue trace) had stable EADs at fast pacing with no calcium alternans, while both the endocardial (red trace) and mid-myocardial (green trace) cell types failed to fully repolarise in every other beat. To explore whether the EAD alternans are driven by calcium alternans, we tested the effects of switching off the alternans-related remodelling, and the APs turned out to be normal. On the other hand, when we turned off the EAD-related remodelling, neither EADs nor alternans occurred. Therefore, the results show that the two types of ionic current remodelling are both necessary for the generation of EAD alternans (lines 656-659 in the discussion and SM9).

      (5) As for the formation of ectopic beats, they can be caused by EADs but can also be caused by a repolarization gradient; the two are not the same and differ in different AP models (Liu et al, CircAE 12, e007571 (2019), Zhang et al, Biophy J 120, 352(2021)). It is not clear here whether the primary cause is the repolarization gradient or EADs. At the tissue level, EADs tend to be suppressed by the repolarization gradient; there is a Goldilocks balance between the EAD amplitude and the repolarization gradient for an ectopic beat to form.

      When isolated cells that showed EADs were embedded in ventricular tissue, we saw ectopic wave propagation. This was because the EADs in the RZ generated conduction block, which enabled a large repolarisation gradient to form between the BZ and RZ, thereby leading to ectopy. This has been clarified in the Results [lines 507-510].

      Additionally, we have clarified the presence of the EADs in the ventricular simulations by labelling where this occurs in the green, purple, and yellow traces in Figure 7C. This was easily missed before due to the stretched proportions of the traces in the x-axis, which is necessary to show clearly the repolarisation gradients that drive ectopy.

      (6) The authors showed many population simulations. I guess that they are all in single cells. If the population simulations were done in the whole heart, it should be stated how many models were simulated. If only one of the population models was selected for the whole heart for each case, the rationale for choosing that one of the many models should be clarified. If populations of cells were modeled in the whole heart, clarify how the models were distributed in the heart.

      We now randomly select 17 of the 245 cell models in the population to be embedded in ventricular simulations, to produce a ventricular population of models. This allows us to explore the effect that physiological variability in the baseline ionic conductances has on the phenotypic representation of ionic remodellings in the ECG and LVEF. An explanation of this can be found in the Methods section [lines 241-244]. Whenever the cell models are embedded in the relevant zones, they are uniformly distributed according to the transmural heterogeneity [lines 271-273].  
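      The subsampling step described above can be sketched as a draw without replacement. This is an illustration only: the response does not say how the random selection was implemented, so the seed, the integer model identifiers, and the helper name below are hypothetical.

```python
import random


def select_ventricular_population(n_cell_models=245, n_selected=17, seed=0):
    """Draw a reproducible subset of cell-model indices without replacement."""
    rng = random.Random(seed)  # fixed seed (hypothetical) so the draw is repeatable
    return sorted(rng.sample(range(n_cell_models), n_selected))


selected = select_ventricular_population()
print(f"{len(selected)} of 245 cell models selected for ventricular simulations")
print(selected)
```

      Fixing the seed makes the chosen subset reproducible, so the same 17 cell models would underlie every remodelling condition compared across the ventricular population.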

      (7) QRS intervals in the simulations are much wider than the real recordings from patients (Figure 2 and Table S8). At least, a QRS of 120 ms for normal control is too wide and probably not normal.

      We have manually measured QRS duration and updated the delineation method to calculate the other biomarkers. The new values now lie within normal ranges and have been updated in SM Table S7 and S8 and in Figure 2, and the new delineation method has been included in SM2.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Madigan et al. assembled an interesting study investigating the role of the MuSK-BMP signaling pathway in maintaining adult mouse muscle stem cell (MuSC) quiescence and muscle function before and after trauma. Using a full body and MuSC-specific genetic knockout system, they demonstrate that MuSK is expressed on MuSCs and that eliminating the BMP binding domain from the MuSK gene (i.e., MuSK-IgG KO) in mice at homeostasis leads to reduced PAX7+ cells, increased myonuclear number, and increased myofiber size, which may be due to a deficit in maintaining quiescence. Additionally, after BaCl2 injury, MuSK-IgG KO mice display accelerated repair after 7 days post-injury (dpi) in males only. Finally, RNA profiling using nCounter technology showed that MuSK-IgG KO MuSCs express genes that may be associated with the activated state.

      Strengths:

      Overall, the biology regulating MuSC quiescence is still relatively unexplored, and thus, this work provides a new mechanism controlling this process. The experiments discussed in the paper are technically sound with great complementary mouse models (full body versus tissue-specific mouse KO) used to validate their hypothesis. Additionally, the paper is well written with all the necessary information in the legends, methods, and figures being reported.

      Weaknesses:

      While the data largely support the authors' conclusions, I do have a few points to consider when reading this paper.

      (1) For Figure 1, while I appreciate the authors confirming MuSK RNA and protein in MuSCs, I do think they should (a) quantify the RNA using qPCR and (b) determine the percentage of MuSCs expressing MuSK protein in their single fiber system in multiple biological replicates. This information will help us understand if MuSK is expressed in 1/10 or 10/10 PAX7-expressing MuSCs. Also, it will help place their phenotypes into the right context, especially when considering how much of the PAX7-pool is expressing MuSK from the beginning.

      The quantification is a reasonable point; however, we don’t believe that this information is necessary for supporting the interpretation of the findings.

      We agree that determining the proportion of SCs that express MuSK is useful information and we will address this question in the Revision.

      (2) Throughout the paper the argument is made that MuSK-IgG KO (full body and MuSC-specific KOs) are more activated and/or break quiescence more readily, but there is no attempt to test this directly. Therefore, the authors should consider measuring the activation dynamics (i.e., break from quiescence) of MuSCs directly (EdU assays or live-cell imaging) in culture and/or in muscle in vivo (EdU assays) using their various genetic mouse models.

      We agree that this point is of interest and we plan to address it in future studies.

      (3) For Figure 2, given that mice are considered adults by 3 months, it is really surprising how just two months later they are starting to see a phenotype (i.e., reduced PAX7-cells, increased number of myonuclei, and increased myofiber size)-which correlates with getting older. Given that aged MuSCs have activation defects (i.e., stuck somewhere in the quiescence cycle), a pending question is whether their phenotype gets stronger in aged mice, like 18-24 months. If yes, the argument that this pathway should be used in a therapeutic sense would be strengthened.

      We agree that the potential role of the MuSK-BMP pathway in aged SCs is of import and could shed new light on SC dynamics in this context. However, we note that the activation observed between 3-5 months results in improved muscle quality (increased myofiber size and grip strength), which is opposite of what is observed with aging. We agree that activating the MuSK-BMP pathway in aged animals has the potential to activate SCs, promote muscle growth and counter sarcopenia.  Pharmacological and genetic approaches to test that question are underway, but given the time frame they are beyond the scope of the current manuscript.

      (4) For Figure 4, the same question as in point (2), the increase in fiber sizes by 7dpi in MuSK-IgG KO males is minimal (going from ~23 to 27 by eye) and no difference at a later time point when compared to WT mice. However, if older mice are used (18-24 months old) - which are known to have repair deficits-will the regenerative phenotype in MuSK-IgG KO mice be more substantial and longer lasting?

      Again, an interesting point that will be addressed in future studies. 

      (5) For Figure 6, this gene set is not glaringly obvious as being markers of MuSC activation (i.e., no MyoD), so it's hard for the readers to know if this gene set is truly an activation signature. Also, the Shcherbina et al. data presented as a column with * being up or down (i.e. differentially expressed) is not helpful, since you don't know whether those mRNAs in that dataset are going up with the activation process. Addressing this point as well as my point (1) will further strengthen the authors' conclusions about the MuSK-IgG KO MuSCs not being able to maintain quiescence as effectively.

      We agree that this Figure should include more information and be formatted in a way that more readily conveys the point. We will provide these changes in the Revision.

      Reviewer #2 (Public review):

      Summary:

      The work by Madigan et al. provides evidence that the signaling of BMPs via the Ig3 domain of MuSK plays a role during muscle postnatal development and regeneration, ultimately resulting in enhanced contractile force generation in the absence of the MuSK Ig3 domain. They demonstrate that MuSK is expressed in satellite cells initially post-isolation of muscle single fibers both in WT and whole-body deletion of the BMP binding domain of MuSK (ΔIg3-MuSK). In developing mice, ΔIg3-MuSK results in increased muscle fiber size, a reduction in Pax7+ cells, and increased muscle contractile force in 5-month-old, but not 3-month-old, mice. These data are complemented by a model in which the kinetics of regeneration appear to be accelerated at early time points. Of note, the authors demonstrate muscle tibialis anterior (TA) weights and fiber feret are increased during development in a Pax7CreERT2;MuSK-Ig3loxp/loxp model in which satellite cells specifically lack the MuSK BMP binding domain. Finally, using Nanostring transcriptional profiling, the authors identified a short list of genes that differ between the WT and ΔIg3-MuSK SCs. These data provide the field with new evidence of signaling pathways that regulate satellite cell activation/quiescence in the context of skeletal muscle development and regeneration.

      On the whole, the findings in this paper are well supported; however, additional validation of key satellite cell markers and data analysis need to be conducted given the current claims.

      (1) The Pax7CreERT2;MuSK-Ig3loxp/loxp model is the appropriate model to conduct studies to assess satellite cell involvement in MuSK/BMP regulation. Validation of changes to muscle force production is currently absent using this model, as is quantification of Pax7+ tdT+ cells in 5-month muscle. Given that MuSK is also expressed on mature myofibers at NMJs, these data would further inform the conclusions proposed in the paper.

      As reported in the manuscript, we observed increased myofiber size, length and TA weight in the conditional mutants at five months of age. We did not assess grip strength in those experiments. 

      We demonstrated highly efficient MuSK Ig3-domain recombination by PCR analysis of FACS-sorted SCs from these conditional mutants (Supplemental Fig. S3). However, while we checked for Pax7+ tdT+ cells in 5-month SCs, we did not quantify this finding.

      (2) All Pax7 quantification in the paper would benefit from high magnification images including staining for laminin demonstrating the cells are under the basal lamina.

      The point is reasonable. We observed that these Pax7+ cells were under the basal lamina, but we did not acquire images at higher magnification.

      (3) The nanostring dataset could be further analyzed and clarified. In Figure 6b, it is not initially apparent what genes are upregulated or downregulated in young and aged SCs and how this compares with your data. Pathway analysis geared toward genes involved in the TGFb superfamily would be informative.

      We agree that further analysis and information regarding the data in this Figure is warranted and we will include it in the Revision.

      (4) Characterizing MuSK expression on perfusion-fixed EDL fibers would be more conclusive to determine if MuSK is expressed in quiescent SCs. Additional characterization using MyoD, MyoG, and Fos staining of SCs on EDL fibers would help inform on their state of activation/quiescence.

      These are all valid points that we intend to address in future experiments.

      (5) Finally, the treatment of fibers in the presence or absence of recombinant BMP proteins would inform the claims of the paper.

      As reported in Jaime et al. (2024), we have extensively characterized the differences in BMP response in both cultured WT and ΔIg3-MuSK myofibers and myoblasts at the level of signaling (pSMAD 1/5/8 nuclear localization and phosphorylation) and gene expression (qRT-PCR).

      Reviewer #3 (Public review):

      Summary:

      Understanding the molecular regulation of muscle stem cell quiescence. The authors evaluated the role of the MuSK-BMP pathway in regulating adult SC quiescence by the deletion of the BMP-binding MuSK Ig3 domain ('ΔIg3-MuSK').

      Strengths:

      A novel mouse model to interrogate muscle stem cell molecular regulators. The authors have developed a nice mouse model to interrogate the role of MuSK signaling in muscle stem cells and myofibers and have unique tools to do this.

      Weaknesses:

      Only minor technical questions remain and there is a need for additional data to support the conclusions.

      (1) The authors claim that dIg3-MuSK satellite cells break quiescence and start fusing, based on the reduction of Pax7+ and increase of nuclei/fiber (Fig 2-3), and maybe the gene expression (Fig6). However, direct evidence is needed to support these findings such as quantifying quiescent (Pax7+Ki67-) or activated (Pax7+Ki67+) satellite cells (and maybe proliferating progenitors Pax7-Ki67+) in the dIg3-MuSK muscle.

      We believe that the data presented strongly support the conclusion that the SCs break quiescence, activate, and fuse into myofibers in uninjured muscle. As noted above, the mechanistic studies suggested are of interest and we will address them in future work.

      (2) It is not clear if the MuSK-BMP pathway is required to maintain satellite cell quiescence, by the end of the regeneration (29dpi), how Pax7+ numbers are comparable to the WT (Fig4d). I would expect to have less Pax7+, as in uninjured muscle. Can the authors evaluate this in more detail?

      The reviewer makes an important point. Our current interpretation of the findings is that quiescence is broken in SCs in uninjured muscle, but that ‘stemness’ is preserved, allowing for efficient muscle regeneration and restoration of the SC pool. Whether such properties reflect SC heterogeneity (as suggested in the comments of the other reviewers) and/or different states along a continuum is of particular interest and will be the focus of future studies. 

      (2) Figure 4 claims that regeneration is accelerated, but to claim this at a minimum they need to look at MYH3+ fibers, in addition to fiber size.

      We did not examine MYH3+ fibers in this study. However, we did observe an increase in Pax7+ cells at 5dpi (male and female) as well as larger myofiber size (Feret diameter) at 7dpi in the male animals. In addition, the panels in Figure 4 b,c (H&E and laminin, respectively) showing accelerated differentiation were selected to be representative of the experimental group.

      (3) The Pax7 specific dIg3-MuSK (Fig5) is very exciting. However, it will be important to quantify the Pax7+ number. Could the authors check the reduction of Pax7+ in this model since it would confirm the importance of MuSK in quiescence?

      In Figure 5c, we assessed the number of Pax7+ cells in the conditional mutant during the course of regeneration (at 3, 5, 7, 14, 22 and 29 dpi). As discussed above, these results confirmed the findings of the constitutive mutant (reduction of Pax7+ cells in uninjured 5-month-old muscle) as well as showing the increased number at 5dpi and return to WT levels at 29 dpi.

      (3) Rescue of the BMP pathway in the model would be further supportive of the authors' findings.

      This point is valid. In a parallel study examining the role of the MuSK-BMP pathway at the NMJ, we have observed that BMP+/- (hypomorphs) recapitulate key phenotypes observed in ΔIg3-MuSK NMJs (Fish et al., bioRxiv, 2023). This point will be included in the Revision.

      (4) Is the stem cell pool maintained long term in the deleted dIg3-MuSK SCs? Or would they be lost with extended treatment since they are reduced at the 5-month experiments? This is an important point and should be considered/discussed relevant to thinking about these data therapeutically.

      We agree that this is an important point for future studies. 

      (5) Without the Pax7-specific targeting, when you target dIg3-MuSK in the entire muscle, what happens to the neuromuscular nuclei?

      A manuscript describing the phenotype of the NMJ in ΔIg3-MuSK constitutive mice is in bioRxiv (Fish et al., 2024) and is in Revision at another journal. We anticipate discussing the findings in the Revised version of the current manuscript.

      (6) Why were differences seen in males and not females? Is XIST downregulation occurring in both sexes? Could the authors explain these findings in more detail?

      The male and female difference in myofiber size is of interest.  The nanostring experiments,  which showed the XIST reduction, were only performed in male mice.

    1. Author response:

      eLife Assessment

      This valuable study reveals extensive binding of eukaryotic translation initiation factor 3 (eIF3) to the 3' untranslated regions (UTRs) of efficiently translated mRNAs in human pluripotent stem cell-derived neuronal progenitor cells. The authors provide solid evidence to support their conclusions, although this study may be enhanced by addressing potential biases of techniques employed to study eIF3:mRNA binding and providing additional mechanistic detail. This work will be of significant interest to researchers exploring post-transcriptional regulation of gene expression, including cellular, molecular, and developmental biologists, as well as biochemists.

      We thank the reviewers for their positive views of the results we present, along with the constructive feedback regarding the strengths and weaknesses of our manuscript, with which we generally agree. We acknowledge our results will require a deeper exploration of the molecular mechanisms behind eIF3 interactions with 3'-UTR termini and experiments to identify the molecular partners involved. Additionally, given that NPC differentiation toward mature neurons is a process that takes around 3 weeks, we recognize the importance of examining eIF3-mRNA interactions in NPCs that have undergone differentiation over longer periods than the 2-hr time point selected in this study. Finally, considering the molecular complexity of the 13-subunit human eIF3, we agree that a direct comparison between Quick-irCLIP and PAR-CLIP will be highly beneficial and will determine whether different UV crosslinking wavelengths report on different eIF3 molecular interactions. Additional comments are given below to the identified weaknesses.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors perform irCLIP of neuronal progenitor cells to profile eIF3-RNA interactions upon short-term neuronal differentiation. The data shows that eIF3 mostly interacts with 3'-UTRs - specifically, the poly-A signal. There appears to be a general correlation between eIF3 binding to 3'-UTRs and ribosome occupancy, which might suggest that eIF3 binding promotes protein synthesis, possibly through inducing mRNA closed-loop formation.

      Strengths:

      The study provides a wealth of new data on eIF3-mRNA interactions and points to the potential new concept that eIF3-mRNA interactions are polyadenylation-dependent and correlate with ribosome occupancy.

      Weaknesses:

      (1) A main limitation is the correlative nature of the study. Whereas the evidence that eIF3 interacts with 3'-UTRs is solid, the biological role of the interactions remains entirely unknown. Similarly, the claim that eIF3 interactions with 3'-UTR termini require polyadenylation but are independent of poly(A) binding proteins lacks support as it solely relies on the absence of observable eIF3 binding to poly-A (-) histone mRNAs and a seeming failure to detect PABP binding to eIF3 by co-immunoprecipitation and Western blotting. In contrast, LC-MS data in Supplementary File 1 show ready co-purification of eIF3 with PABP.

      We agree the molecular mechanisms underlying the crosslinking between eIF3 and the end of mRNA 3’-UTRs remain to be determined. We also agree that the lack of interaction seen between eIF3 and PABP in Westerns, even from HEK293T cells, is a puzzle. The low sequence coverage in the LC-MS data gave us pause about making a strong statement that these represent direct eIF3 interactions, given the similar background levels of some ribosomal proteins.

      (2) Another question concerns the relevance of the cellular model studied. irCLIP is performed on neuronal progenitor cells subjected to neuronal induction for 2 hours. This short-term induction leads to a very modest - perhaps 10% - and very transient 1-hour-long increase in translation, although this is not carefully quantified. The cellular phenotype also does not appear to change and calling the cells treated with differentiation media for 2 hours "differentiated NPCs" seems a bit misleading. Perhaps unsurprisingly, the minor "burst" of translation coincides with minor effects on eIF3-mRNA interactions most of which seem to be driven by mRNA levels. Based on the ~15-fold increase in ID2 mRNA coinciding with a ~5-fold increase in ribosome occupancy (RPF), ID2 TE actually goes down upon neuronal induction.

      We agree that it will be interesting to look at eIF3-mRNA interactions at longer time points after induction of NPC differentiation. However, the pattern of eIF3 crosslinking to the end of 3’-UTRs occurs in both time points reported here, which is likely to be the more general finding in what we present.
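
      The translation-efficiency argument in point (2) can be checked with a short calculation. The sketch below assumes the common definition of translation efficiency as the ratio of ribosome occupancy (RPF) to mRNA abundance; the function name is hypothetical, and the fold changes are the approximate values quoted by the reviewer, not numbers recomputed from the data.

```python
def te_fold_change(rpf_fold: float, mrna_fold: float) -> float:
    """Fold change in translation efficiency, assuming TE = RPF / mRNA."""
    if mrna_fold <= 0:
        raise ValueError("mRNA fold change must be positive")
    return rpf_fold / mrna_fold


# Approximate ID2 values quoted in the review: ~5-fold RPF, ~15-fold mRNA.
id2_te = te_fold_change(rpf_fold=5.0, mrna_fold=15.0)
print(f"ID2 TE fold change: {id2_te:.2f}")
```

      Under that definition the ratio is about one third, so TE falls even though both RPF and mRNA rise, which is the reviewer's point.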

      (3) The overlap in eIF3-mRNA interactions identified here and in the authors' previous reports is minimal. Some of the discrepancies may be related to the not well-justified approach for filtering data prior to assessing overlap. Still, the fundamentally different binding patterns - eIF3 mostly interacting with 5'-UTRs in the authors' previous report and other studies versus the strong preference for 3'-UTRs shown here - are striking. In the Discussion, it is speculated that the different methods used - PAR-CLIP versus irCLIP - lead to these fundamental differences. Unfortunately, this is not supported by any data, even though it would be very important for the translation field to learn whether different CLIP methodologies assess very different aspects of eIF3-mRNA interactions.

      We agree the more interesting aspect of what we observe is the difference in location of eIF3 crosslinking, i.e. the end of 3’-UTRs rather than 5’-UTRs or the pan-mRNA pattern we observed in T cells. The reviewer is right that it will be important in the future to compare PAR-CLIP and Quick-irCLIP side-by-side to begin to unravel the differences we observe with the two approaches.

      Reviewer #2 (Public review):

      Summary:

      The paper documents the role of eIF3 in translational control during neural progenitor cell (NPC) differentiation. eIF3 predominantly binds to the 3' UTR termini of mRNAs during NPC differentiation, adjacent to the poly(A) tails, and is associated with efficiently translated mRNAs, indicating a role for eIF3 in promoting translation.

      Strengths:

      The manuscript is strong in addressing molecular mechanisms by using a combination of next-generation sequencing and crosslinking techniques, thus providing a comprehensive dataset that supports the authors' claims. The manuscript is methodologically sound, with clear experimental designs.

      Weaknesses:

      (1) The study could benefit from further exploration into the molecular mechanisms by which eIF3 interacts with 3' UTR termini. While the correlation between eIF3 binding and high translation levels is established, the functionality of these interactions needs validation. The authors should consider including experiments that test whether eIF3 binding sites are necessary for increased translation efficiency using reporter constructs.

      We agree with the reviewer that the molecular mechanism by which eIF3 interacts with the 3’-UTR termini remains unclear, along with its biological significance, i.e. how it contributes to translation levels. We think it could be useful to try reporters in, perhaps, HEK293T cells in the future to probe the mechanism in more detail.

      (2) The authors mention that the eIF3 3' UTR termini crosslinking pattern observed in their study was not reported in previous PAR-CLIP studies performed in HEK293T cells (Lee et al., 2015) and Jurkat cells (De Silva et al., 2021). They attribute this difference to the different UV wavelengths used in Quick-irCLIP (254 nm) and PAR-CLIP (365 nm with 4-thiouridine). While the explanation is plausible, it remains a caveat that different UV crosslinking methods may capture different eIF3 modules or binding sites, depending on the chemical propensities of the amino acid-nucleotide crosslinks at each wavelength. Without addressing this caveat in more detail, the authors cannot generalize their findings, and thus, the title of the paper, which suggests a broad role for eIF3, may be misleading. Previous studies have pointed to an enrichment of eIF3 binding at the 5' UTRs, and the divergence in results between studies needs to be more explicitly acknowledged.

      We agree with the reviewer that the two methods of crosslinking will require a more detailed head-to-head comparison in the future. However, we do think the title is justified by the fact that we see crosslinking to the termini of 3’-UTRs across thousands of transcripts in each condition. Furthermore, the 3’-UTR crosslinking is enriched on mRNAs with higher ribosome protected fragment counts (RPF) in differentiated cells, Figure 3F.

      (3) While the manuscript concludes that eIF3's interaction with 3' UTR termini is independent of poly(A)-binding proteins, transient or indirect interactions should be tested using assays such as PLA (Proximity Ligation Assay), which could provide more insights.

      This is a good idea, but would require a substantial effort better suited to a future publication. We think our observations are interesting enough to the field to stimulate future experimentation that we may or may not be most capable of doing in our lab.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript by Mestre-Fos and colleagues, authors have analyzed the involvement of eIF3 binding to mRNA during differentiation of neural progenitor cells (NPC). The authors bring a lot of interesting observations leading to a novel function for eIF3 at the 3'UTR.

      During the translational burst that occurs during NPC differentiation, analysis of eIF3-associated mRNA by Quick-irCLIP reveals the unexpected binding of this initiation factor at the 3'UTR of most mRNA. Further analysis of alternative polyadenylation by APAseq highlights the close proximity of the eIF3-crosslinking position and the poly(A) tail. Furthermore, this interaction is not detected in Poly(A)-less transcripts. Using Riboseq, the authors then attempted to correlate eIF3 binding with the translation efficacy of mRNA, which would suggest a common mechanism of translational control in these cells. These observations indicate that eIF3-binding at the 3'UTR of mRNA, near the poly(A) tail, may participate in the closed-loop model of mRNA translation, bridging 5' and 3', and allowing ribosome recycling. However, the authors failed to detect interactions of eIF3 with either PABP or Paip1 or 40S subunit proteins, which is quite unexpected.

      Strength:

      The well-written manuscript presents an attractive concept regarding the mechanism of eIF3 function at the 3'UTR. Most mRNA in NPC seems to have eIF3 binding at the 3'UTR and only a few at the 5'end where it's commonly thought to bind. In a previous study from the Cate lab, eIF3 was reported to bind to a small region of the 3'UTR of the TCRA and TCRB mRNA, which was responsible for their specific translational stimulation, during T cell activation. Surprisingly in this study, the eIF3 association with mRNA occurs near polyadenylation signals in NPC, independently of cell differentiation status. This compelling evidence suggests a general mechanism of translation control by eIF3 in NPC. This observation brings back the old concept of mRNA circularization with new arguments, independent of PABP and eIF4G interaction. Finally, the discussion adequately describes the potential technical limitations of the present study compared to previous ones by the same group, due to the use of Quick-irCLIP as opposed to the PAR-CLIP/thiouridine.

      Weaknesses:

      (1) These data were obtained from an unusual cell type, limiting the generalizability of the model.

      We agree that unraveling the mechanism employed by eIF3 at the mRNA 3’-UTR termini might be better studied in a stable cell line rather than in primary cells.

      (2) This study lacks a clear explanation for the increased translation associated with NPC differentiation, as eIF3 binding is observed in both differentiated and undifferentiated NPC. For example, I find a kind of inconsistency between changes in Riboseq density (Figure 3B) and changes in protein synthesis (Figure 1D). Thus, the title overstates a modest correlation between eIF3 binding and important changes in protein synthesis.

      We thank the reviewer for this question. Riboseq data and RNASeq data are not on absolute scales when comparing across cell conditions. They are normalized internally, so increases in, for example, RPF in Figure 3B are relative to the bulk RPF in a given condition. By contrast, the changes in protein synthesis measured in Figure 1D are closer to an absolute measure of protein synthesis.
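
      The distinction between relative and absolute scales can be illustrated with a toy calculation. This is a minimal sketch with invented counts and hypothetical gene names; it assumes a counts-per-million style of internal normalization, which is not necessarily the scheme used in the manuscript.

```python
def counts_per_million(counts: dict[str, float]) -> dict[str, float]:
    """Scale each gene's count to its share of the library total (per million)."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


# Hypothetical RPF counts in the undifferentiated condition.
undifferentiated = {"geneA": 100.0, "geneB": 300.0, "geneC": 600.0}
# Hypothetical global increase: every gene doubles in absolute terms.
differentiated = {gene: 2.0 * n for gene, n in undifferentiated.items()}

cpm_before = counts_per_million(undifferentiated)
cpm_after = counts_per_million(differentiated)

for gene in undifferentiated:
    print(gene, cpm_before[gene], cpm_after[gene])
```

      Each gene's normalized value is the same before and after, although absolute output doubled. A bulk change in protein synthesis is therefore invisible in internally normalized data, which then reports only redistribution among transcripts.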

      (3) This is illustrated by the candidate selection that supports this demonstration. Looking at Figure 3B, ID2, and SNAT2 mRNA are not part of the High TE transcripts (in red). In contrast, the increase in mRNA abundance could explain a proportionally increased association with eIF3 as well as with ribosomes. The example of increased protein abundance of these best candidates is overall weak and uncertain.

      We agree that using TE as the criterion for defining increased eIF3 association would not be correct. By “highly translated” we only mean to convey the extent of protein synthesis, i.e. increases in ribosome protected fragments (RPF), rather than the translational efficiency.

      (4) Despite several attempts (chemical and UV cross-linking) to identify eIF3 partners in NPC such as PABP, PAIP1, or proteins from the 40S, the authors could not provide any evidence for such a mechanism consistent with the closed-loop model. Overall, this rather descriptive study lacks mechanistic insight (eIF3 binding partners).

      We agree that it will be important to identify the molecular mechanism used by eIF3 to engage the termini of mRNA 3’-UTRs. Nevertheless, the identification of eIF3 crosslinking to that location in mRNAs is new, and we think will stimulate new experiments in the field.

      (5) Finally, the authors suspect a potential impact of technical improvement provided by Quick-irCLIP, that could have been addressed rather than discussed.

      We agree a side-by-side comparison of eIF3 crosslinks captured by PAR-CLIP versus Quick-irCLIP will be an important experiment to do. However, NPCs or other primary cells may not be the best system for the comparison. We think using an established cell line might be more informative, to control for effects such as 4-thiouridine toxicity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      Very interesting, conceptually speaking, approach that is by all means worth pursuing. An overall proper approach to the posited aim.

      We want to thank the reviewer for taking the time to review our manuscript and for providing positive feedback regarding our research question.  

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. This precludes it in its current state from drawing unequivocal conclusions.

      Thank you for this essential and valuable comment. We fully accept that the small sample size of the Tercko/ko mice is a major limitation of our study and transparently discuss this in our manuscript. However, because of the strong burden of disease, Animal Welfare regulations permitted only a reduced number of mice. Consequently, only three non-infected and five infected mice were available to us. Owing to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to increase the dataset.

      The animal studies are an important aspect of our study; however, our hypothesis was also investigated at multiple levels, including in an in vitro co-culture model (Figure 5), to ensure comprehensive analysis. Thus, we clearly demonstrated that S. aureus pneumonia in Tercko/ko mice leads to a more severe phenotype, orchestrated by the dysregulation of both innate and adaptive immune response.

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T-cell dysfunction play a major role in these phenomena.

      Strengths:

      The strengths of the work include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      We would like to thank the reviewer for the positive feedback regarding our aim to investigate the impact of Terc deletion on the pulmonary immune response to S. aureus.  

      Weaknesses:

      The weaknesses outweigh the strengths, dominantly because conclusions are plagued by flaws in experimental design, by lack of rigorous controls, and by incomplete and inadequate approaches to testing immune function. These weaknesses are as follows:

      (1) Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers or use irradiation chimera or crosses that would be informative.

      We thank the reviewer for bringing up this important point. The aim of our study, however, was to investigate the impact of Terc deletion in the lung and on the response to bacterial pneumonia, rather than to provide a comprehensive characterization of the Tercko/ko model itself. This characterization of different tissues and cell types has already been conducted by previous studies. These include studies that characterize the general phenotype of the model (Herrera et al., 1999; Lee et al., 1998; Rudolph et al., 1999) as well as investigations that shed light on the impact of Terc deletion on specific cell types such as microglia (Khan et al., 2015) or T cells (Matthe et al., 2022). The impact of Terc deletion on T cells is also discussed in our manuscript in lines 89 to 105. Furthermore, a section about the general phenotype of the Terc deletion model is included in the introduction in lines 126 to 138. Thus, we discussed the relevant literature regarding Tercko/ko mice in our manuscript and attempted to provide a more in-depth characterization of the lung by investigating the inflammatory response to infection as well as changes in gene expression (Figure 2-4).

      (2) Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them, their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      Thank you for mentioning this relevant point. We want to apologize for the confusion regarding this matter. While Tercko/ko mice are a well-established model for premature aging, these effects become more apparent with increasing generations (G) and thus, G5 and 6 mice are the most affected by Terc deletion (Lee et al., 1998; Wong et al., 2008).

      Thus, while Tercko/ko mice are a common model for premature aging, this accelerated aging phenotype is predominantly apparent in later-generation Tercko/ko (G5 and 6) or aged Tercko/ko mice (Lee et al., 1998; Wong et al., 2008). Since the aim of this study was to analyze the impact of Terc deletion on the lung and its immune response to bacterial infections rather than the impact of telomere shortening and telomerase dysfunction, young G3 Tercko/ko mice (8 weeks) were used in this study. This is also mentioned in lines 131-134. In this study, Tercko/ko mice were used not as a model of aging, but rather as a model specifically for Terc deletion. The old WT mice function as a control cohort to observe possible common but also deviating effects between aging and Terc deletion. In our sequencing data, we observe that uninfected young WT mice are very similar to uninfected Tercko/ko mice. Other studies have also reported this lack of major differences between uninfected WT and Tercko/ko mice in the G3 knockout mice (Kang et al., 2018). Conversely, uninfected young WT and Tercko/ko mice exhibited great differences, for instance, regarding the numbers of differentially expressed genes (Supplemental Figure 1H). Thus, differences between naturally aged mice and young G3 Tercko/ko mice are not surprising. To clarify this aspect, we restructured the paragraph discussing the Tercko/ko mice (lines 126-134). Additionally, we added a paragraph explaining the purpose of the naturally aged mice in lines 134 to 138:

      “As control cohort age-matched young WT mice were utilized. To investigate whether Terc deletion, beyond critical telomere shortening, impacts the pulmonary immune response, we used young Tercko/ko mice. Additionally, naturally aged mice (2 years old) were infected to explore the potential link to a fully developed aging phenotype.”

      (3) Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Figures 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Figures 1C, D).

      We thank the reviewer for this essential comment. As mentioned above, the Tercko/ko mice in this study were not selected to model natural aging. To model telomerase dysfunction and accelerated aging, later-generation or aged Tercko/ko mice would have been more suitable.

      The lack of statistical significance in some figures is likely due to the heterogeneity of disease phenotype of S. aureus infection in mice, which is a limitation of our study that we discuss in our discussion section in lines 576-582. The phenotype of S. aureus infection can vary greatly within a mouse population, highlighting the limitations of mice as a model for S. aureus infections. To account for this heterogeneity we divided the infected Tercko/ko mice cohort into different degrees of severity based on the clinical score and the presence of bacteria in organs other than the lung (mice with systemic infection). 

      Despite the heterogeneity especially within the Tercko/ko mice cohort the differences between the knockout and young as well as old WT mice were striking. Including the fatal infections, 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course (Figure 1A, B and Supplemental Figure 1A, B). This hints towards a clear role of Terc in the response to S. aureus infection in mice. Thus while in some figures the differences are not significant, strong trends towards a more severe phenotype of S. aureus infection in the Tercko/ko mice regarding bacterial load, score and inflammatory response could be observed in our study. 

      Another example of inadequate group design is when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Supplementary Figures 1G; Figure 2; lines 374-376 and 389-391). This gives them significant differences in several figures, but because they did not clearly indicate where they applied this stratification in the figure legends, the data are somewhat confusing. Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild-type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      We sincerely appreciate the significant time and effort you have invested in reviewing our manuscript. However, with all due respect, we must point out that the definition of sepsis you have referenced is considered outdated. According to the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3), sepsis is defined as "a life-threatening organ dysfunction caused by a dysregulated host response to infection" (Singer et al., 2016, JAMA). Given this fundamental misunderstanding of our findings, we find the comment regarding the inadequacy of our groups to be both dismissive and lacking in scientific merit. We would like to emphasize that the group size used in our study is consistent with accepted standards in infection research. We strongly reject any insinuations of inadequacy that have been repeatedly mentioned throughout the review.

      In order to provide a nuanced investigation of disease severity in Tercko/ko mice, we added the term “systemic infection” to the figures whenever the mice were divided into groups of mice with and without systemic infection. This is the case for Figure 2A and Supplemental Figure 1C-E. The division into mice with and without systemic infection is also mentioned in the figure legend of Figure 2A in lines 932 to 935 and for Supplemental Figure 1 in lines 1052-1053. We agree that Supplemental Figure 1G is somewhat confusing as the mice with systemic infection are highlighted in this graph but not included as a separate group within our sequencing analysis. We added a sentence to the figure legend clarifying this (lines 1042-1044):

      “Nevertheless, the infected Tercko/ko mice were considered one group for the expression analysis and not split into separate groups for the subsequent analysis.”

      Additionally, we revised the section regarding this grouping in different degrees of severity in our Material and Methods section to clarify that this division was only performed for specific analysis (line 191):

      “…for the indicated analysis.”

      Furthermore, the mice which were classified as systemically infected mice were not septic mice, as mentioned above. Those mice were classified by us as systemically infected based on their clinical score and the presence of bacteria in other organs than the lung as stated in the lines 188-191 and 377-381. Bacteremia is a symptom of very severe cases of hospital-acquired pneumonia with a very high mortality (De la Calle et al., 2016).

      Therefore, the systemically infected mice or rather mice with bacteremia display an especially severe pneumonia phenotype, which is distinct from sepsis. The presence of this symptom in our Tercko/ko mice further highlights the clinical relevance of our study. This aspect was added to the manuscript in the lines 568-570.

      “The detection of bacteria in extra pulmonary organs is of particular interest, as bacteremia is a symptom of severe pneumonia and is associated with high mortality (De la Calle et al., 2016).”

      (4) The authors conclude that disregulated inflammation and T-cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear.

      Two points are important here. First, there is no natural counterpart to a Terc-KO, which is a complete loss of a key non-enzymatic component of the telomerase complex starting in utero. 

      Second, the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune responses. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR, or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Figure 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity in many T-cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      We thank the reviewer for highlighting these important aspects. Regarding the first point, indeed there is no naturally occurring deletion of Terc in humans. However, studies reported reduced expression of Terc and Tert in the tissues of aged mice and rats (Tarry-Adkins et al., 2021; Zhang et al., 2018). Terc itself has been found to have several important immunomodulatory functions such as the activation of the NFκB or PI3-kinase pathway (Liu et al., 2019; Wu et al., 2022). As those aforementioned pathways are relevant for the immune response to S. aureus infections, the authors were interested in exploring the impact of Terc deletion on the pulmonary immune response. The potential immunomodulatory functions of Terc are discussed in lines 106-121. To further clarify our rationale we added a sentence to the introduction in lines 121-125.

      “Interestingly, downregulation of Terc and Tert expression in tissues of aged mice and rats has been found (Tarry-Adkins, Aiken, Dearden, Fernandez-Twinn, & Ozanne, 2021; Zhang et al., 2018). Therefore, as a potential immunomodulatory factor, reduced Terc expression could be connected to age-related pathologies.”

      Regarding the second point, as we focused on the effect of Terc deletion in the lung and its role in S. aureus infection, we investigated inflammatory and immune response parameters relevant to this setting. For instance, inflammation parameters in the lungs of all three mice cohorts were measured to investigate differences in the inflammatory response in the non-infected and infected mice (Figure 2A). Those measurements showed no baseline difference in key inflammatory parameters between young WT and Tercko/ko mice, which is consistent with previous findings (Kang et al., 2018). The inflammatory response to infection with S. aureus in the Tercko/ko mice cohort differed significantly from the other cohorts (Figure 2A), hinting towards a dysregulated inflammatory response due to Terc deletion. Furthermore, we investigated general immune cell frequencies such as dendritic cells, macrophages, and B cells in the spleen of all three mice cohorts to gather a baseline understanding of the general immune cell populations. In our manuscript only total T cell frequencies were included due to its relevance for our data regarding T cells (Figure 4B). This data could show that there was no difference of total amount of T cells in the spleen of all three mice cohorts. For a more detailed insight into our analysis we added the frequencies of the other immune cell populations analyzed in the spleen as a Supplemental Figure 3B-F. Additionally, a figure legend for the graphs was added to lines 1075-1094.

      Therefore, while we did not analyze baseline frequencies of specific populations of T cells, we analyzed and characterized the inflammatory and immune response of our model in a way relevant to our research question. 

      The differences observed in T cell marker and TCR gene expression were also partly present between the uninfected and infected Tercko/ko mice, such as the complete absence of CD247 expression in infected Tercko/ko mice, although it is expressed in uninfected mice of this cohort (Figure 4A, C and D). Thus, this effect cannot be solely attributed to an inadequate mobilization of T cells to the lung after infectious challenge. However, we agree that a more detailed insight into the immune cells recruited to the lung or the frequencies of different T cell populations could contribute to a better understanding of the proposed mechanism and would be an interesting experiment to conduct in further studies. We accept this as a limitation of our study and included it in our discussion section in lines 719-723:

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      (5) Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences, not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of phenomena the authors are trying to conclude about. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest

      We thank the reviewer for highlighting this important and relevant point. In our study, we aimed to investigate the role of Terc expression in modulating inflammation and the immune response to S. aureus infection in the lung. To address this, we examined the overall impact of age, genotype, and infection on lung inflammation and gene expression. Therefore, sequencing of total lung tissue was essential for addressing the research question posed. Our findings demonstrate that Tercko/ko mice exhibit a more severe phenotype following S. aureus infection, characterized by an increased bacterial load and heightened lung inflammation (Figures 1 and 2). Furthermore, our data suggest that Terc plays a role in regulating inflammation through activation of the NLRP3 inflammasome, along with the dysregulation of several T cell marker genes (Figures 2, 4, and 5). However, this study lacks a detailed analysis of distinct T cell populations, including antigen-specific T cells, as noted earlier. Investigating these aspects in future studies would be valuable to validate and expand upon our findings. We have incorporated these suggestions into the discussion section (lines 719-723):

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      Nevertheless, our study provides first evidence of a potential connection between T cell functionality and Terc expression.  

      Third, the authors co-incubate AM and T cells with S. aureus. There is no information here about the phenotype of T cells used. Were they naïve, and how many S. aureus-specific T cells did they contain? Or were they a mix of different cell types, which we know will change with aging (fewer naïve and many more memory cells of different flavors), and maybe even with a Terc-KO? Naïve T cells do not interact with AM; only effector and memory cells would be able to do so, once they have been primed by contact with dendritic cells bringing antigen into the lymphoid tissues, so it is unclear what the authors are modeling here. Mature primed effector T cells would go to the lung and would interact with AM, but it is almost certain that the authors did not generate these cells for their experiment (or at least nothing like that was described in the methods or the text).

      Thank you for bringing up this important question. For the co-cultivation experiment of T cells and alveolar macrophages, total CD4+ T cells of both young WT and Tercko/ko were used. We did not select for a specific population of T cells. Our sequencing data indicated the complete downregulation of CD247 expression, which is an important part of the T cell receptor, in the lungs of infected Tercko/ko mice (Figure 4A, C and D). Given that this factor is downregulated under chronic inflammatory conditions, we investigated the impact of the inflammatory response in alveolar macrophages on the expression of various T cell-derived cytokines, as well as CD247 expression (Figure 5D, E) (Dexiu et al., 2022). This aspect is also highlighted in the discussion in lines 622-636. Therefore, a co-cultivation model of T cells and alveolar macrophages was established and confronted with heat-killed S. aureus to elicit an inflammatory response of the macrophages. To emphasize this purpose, we have revised our statement about the model setup in lines 516-518 of the manuscript: 

      “An overactive inflammatory response could be a potential explanation for the dysregulated TCR signaling.”

      The authors hope this will clarify the intent behind the model setup.

      (6) Overall, the authors began to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity, or aging remains unclear at present.

      We thank the reviewer for the helpful and relevant comments. The authors accept the limitations of the presented study such as the reduced number of Tercko/ko mice and the limitations of murine models for S. aureus infection itself and discuss those in the discussion section in the lines 558-560; 576-582; 688-690 and 719-725. However, we hope that our responses have provided sufficient evidence to convince the reviewer that our data supports a clear role for Terc expression in regulating the immune response to bacterial infections, particularly with respect to inflammation and its potential connection to T cell functionality.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The good element first:

      I read this paper with genuine interest and applaud the authors for investigating the posited question. I consider it by all means scientifically relevant in the context of physiological/pathophysiological aging and reaction to a disease (here pneumonia). The Terc deletion model looks very appropriate for the question and the methodology is very advanced/in-depth. The data flow/selection of endpoints and assays is very logical to me. Moreover, I like the breakdown of pneumonia into varying levels of severity.

      We thank the reviewer for their time and effort taken to revise our manuscript. Additionally, we are grateful to receive your positive feedback regarding our study design and research question.

      The weaknesses:

      (1) I cannot help but notice that the study is heavily underpowered. As such, it is inadmissible. The key reason is that it is the first of its kind and seminal findings must be strongly propped by the evidence. It is apparent to me that the data scatter presented in the figures tends to be abnormally distributed (e.g. obvious bimodal distribution in some groups). Therefore, the presented comparisons (even if stat. sign) can be heavily misleading in terms of: i) the true magnitude of the observed effects and ii) possible type 2 error in some cases of p value >0.05. Solution: repeat the study to ensure reasonable power/reliability. This will also make it stronger as it will immediately demonstrate its reproducibility (or lack of it).

      Thank you for bringing up this extremely relevant point. We acknowledge the issue of the small sample size of Tercko/ko mice as a major limitation of our study. This limitation is also included in our discussion section in the lines 558-560. Thus we fully agree with this limitation and transparently discuss this in our manuscript. However, due to the strict German animal welfare regulations it is not possible to obtain more Tercko/ko mice, as mentioned above. Furthermore, since fatal infections occurred in the Tercko/ko mice cohort we had a reduced number of mice available. 

      However, the differences between the Tercko/ko and WT mice were striking. Including the fatal infections 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course. This hints towards a clear role of Terc in the response to S. aureus infection in mice.  

      (2) In the stat analysis section of M&Ms, the authors feature only 1 sentence. This cannot be. A detailed stats workup needs to be included there. This is very much related to the above weakness; e.g. it is impossible to test for normality (to choose an appropriate post-hoc test) with n=3. Back to square one: study underpowered.

      We thank the reviewer for highlighting this important aspect. We carefully revised the method section in lines 357-360 to include all relevant information: 

      “Data are presented as mean ± SD, or as median with interquartile range for violin and box plots, with up to four levels of statistical significance indicated. P-values were calculated using Kruskal-Wallis test. Individual replicates are represented as single data points.”
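      For readers unfamiliar with the test named in the quoted methods text, the following is a minimal sketch of how a Kruskal-Wallis H statistic is computed for three cohorts. It is not the authors' analysis code; the software they used is not stated here. The function names and the example values are hypothetical and chosen only for illustration. The p-value helper relies on the chi-square approximation with 2 degrees of freedom, which applies to exactly three groups and is only a rough guide when groups are very small.

```python
import math
from itertools import chain


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    Hypothetical helper for illustration; raises ZeroDivisionError
    if every observation has the same value.
    """
    values = sorted(chain.from_iterable(groups))
    n = len(values)

    # Assign each distinct value the average of the ranks it occupies.
    ranks = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        t = j - i                               # size of this tie block
        ranks[values[i]] = (i + 1 + j) / 2      # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    # H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1)
    rank_sum_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    # Standard correction for ties.
    return h / (1 - tie_term / (n**3 - n))


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (three groups)."""
    return math.exp(-h / 2)


# Made-up measurements, NOT data from the study.
young_wt = [1, 2, 3, 4, 5]
old_wt = [6, 7, 8, 9, 10]
terc_ko = [11, 12, 13]

h_stat = kruskal_wallis_h(young_wt, old_wt, terc_ko)
print(f"H = {h_stat:.3f}, approximate p = {p_value_three_groups(h_stat):.4f}")
```

      The test works on ranks rather than raw values, so it does not assume normally distributed data.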

      (3) Pneumonia severity. While I noted that as a strength, I also note it as weakness here. It looks to me like the authors stopped halfway with this. I totally support testing a biological effect(s) such as the one investigated here across a spectrum of a given disease severity. The authors mention that they had various severity phenotypes produced in their model but this is not visible in the data figs. I strongly suggest including that as well; i.e., to study the posited question in the severe and mild pneumonia phenotype. This is a very smart path and previous preclinical research clearly demonstrated that this severe/mild distinction is very relevant in the context of the observed responses (their presence/absence, longevity, dynamics, etc). I realize this is challenging, thus, I would probably use this approach in the Terc k/o model as sort of a calibrator to see whether the exacerbation observed in the current setup (severe?) will be also present in a mild pneumonia phenotype. S. aureus can be effectively titrated to produce pneumonia of varying severity.

      We thank the reviewer for bringing up this relevant point. 

      In our study, we could observe heterogeneity within the infected Tercko/ko cohort. Therefore as pointed out by the reviewer we assigned different degrees of severity to those groups based on clinical scores, the fatal outcome of the disease (fatal subgroup), and the presence of bacteria in organs other than the lungs (systemic infection subgroup) as stated in our materials and methods part in the lines 188-191 (Supplemental Figure 1A and B). Moreover, we highlighted this difference in a number of our figures. For example, when categorizing the mice into groups with and without systemic infection, we noticed that the mice with systemic infection demonstrated a higher bacterial load, significant body weight loss, and increased lung weight (see Supplemental Figure 1C-E). Interestingly, the two mice with systemic infection clustered separately from the other mice, indicating that the mice with systemic infection are transcriptomically distinct from the other mice cohorts (Supplemental Figure 1G). Additionally, the inflammatory response was exclusively elevated in the lungs of mice with systemic infection (Figure 2C). Thus, we included this distinction in several figures and attempted to study the differences between those subgroups but also their similarities. For instance, we could observe that some changes in the transcriptome were present in all three infected Tercko/ko mice such as the complete absence of CD247 expression at 24 hpi (Figure 4D). This distinction therefore provided a more detailed insight into the underlying mechanisms of disease severity in Tercko/ko mice and is lacking in other studies. We agree with the reviewer, that a study investigating mild and severe pneumonia phenotypes would be clinically relevant. 
      However, as noted above, due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to carry out the proposed experiment.

      (4) Please read ARRIVE guidelines and note the relevant info in M&Ms as ARRIVE guidelines point out.

      Thank you for emphasizing this crucial aspect. We revised our materials and methods section according to the ARRIVE guidelines (lines 179-206).

      “Tercko/ko mice, aged 8 weeks, were used for infection studies (n = 8; non-infected = 3; infected = 5). Female young WT (age 8 weeks) and old WT (age 24 months) C57Bl/6 mice (both n = 10; non-infected = 5; infected = 5) were purchased from Janvier Labs (Le Genest-Saint-Isle, France). All infected mouse cohorts were compared to their respective non-infected controls, as well as to the infected groups from other cohorts. Additionally, comparisons were made between the non-infected cohorts across all groups.

      All mice were anesthetized with 2% isoflurane before intranasal infection with S. aureus USA300 (1×10⁸ CFU/20 µl per mouse). After 24 hours, the mice were weighed and scored as previously described (Hornung et al., 2023). Infected Tercko/ko mice were grouped into different degrees of severity based on their clinical score, fatal outcome of the disease (fatal) and the presence of bacteria in organs other than the lung (systemic infection) for the indicated analysis. Mice with fatal infections were excluded from subsequent analyses, with only their final scores being reported. The mice were sacrificed via injection of an overdose of xylazine/ketamine and bleeding of the axillary artery at 24 hpi. BAL was collected by instillation and subsequent retrieval of PBS into the lungs. Serum and organs were collected. Bacterial load in the BAL, kidney and liver was determined by plating of serially diluted samples as described above. For this, organs were first homogenized in the appropriate volume of PBS. Gene expression was analyzed in the right superior lung lobe. Lobes were therefore homogenized in the appropriate amount of TriZol LS reagent (Thermo Fisher Scientific, Waltham, MA, US) prior to RNA extraction. The left lung lobe was embedded into Tissue Tek O.C.T. (Science Services, Munich, Germany) and stored at -80°C until further processing for histological analysis. Cytokine measurements were performed using the right inferior lung lobe. Lobes were first homogenized in the appropriate volume of PBS. Remaining organs were stored at -80°C until further usage. Mouse studies were conducted without the use of randomization or blinding.”
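      As an illustration of the arithmetic behind the dose and the plating step described in the quoted methods, the sketch below converts the stated inoculum (1×10⁸ CFU in 20 µl) into a concentration and back-calculates a bacterial load from a plate count. The function name, the colony count, the dilution factor, and the plated volume are all hypothetical; they are not taken from the study.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/ml of the undiluted sample from one plate.

    colonies          -- colonies counted on the plate
    dilution_factor   -- total fold dilution of the plated sample (e.g. 10**4)
    plated_volume_ml  -- volume spread on the plate, in ml
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


# Inoculum stated in the methods: 1x10^8 CFU delivered in 20 µl (0.020 ml).
inoculum_cfu = 1e8
inoculum_volume_ml = 0.020
inoculum_concentration = inoculum_cfu / inoculum_volume_ml  # CFU per ml

# Hypothetical plate: 42 colonies from 0.1 ml of a 10^4-fold dilution.
example_load = cfu_per_ml(colonies=42, dilution_factor=10**4, plated_volume_ml=0.1)

print(f"Inoculum concentration: {inoculum_concentration:.2e} CFU/ml")
print(f"Example bacterial load: {example_load:.2e} CFU/ml")
```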

      (5) There are also some other descriptive deficits but they are of a much smaller caliber so I do not list them.

      We thank the reviewer for their valuable and insightful suggestions for improving our manuscript. We hope that our responses and the corresponding revisions address these suggestions satisfactorily.

      Concluding: the investigative idea is great/interesting and the methodological flow is adequate but the low power makes this study of low reliability in its current form. I strongly urge the authors to walk the extra mile with this work to make it comprehensive and reliable. Best of luck!

      Reviewer #2 (Recommendations For The Authors):

      (1) Many legends are uninformative and do not contain critical information about the experiments. For example, Figure 2A with cytokine measurements (in lung homogenates?) is likely showing data from an ELISA or Luminex test, but there is no mention of that in the legend. It stands next to Figure 2B, which is a gene expression map, again, likely from the lung (prepared how, normalized how, etc?) lacking even the most basic information. Further, Figure 2D has no information on the meaning/effect size of gene ratios on the x-axis. Figures 3 and 4 are presumably the subsets of their transcriptome data set (whole lung, harvested on d ?? post-infection), but that is just a guess on my part. Even in the main text, the timing and the controls for the transcriptomic study are not stated (ln. 398 and onwards). The authors really need to revise the figure legends and provide all the details that an average reader would need to be able to interpret the data.

      We thank the reviewer for bringing up this important point. The figure legends of all figures including supplemental figures were revised to ensure they include all relevant data necessary for accurate interpretation of the graphs. Additionally, we clarified the sequenced samples in lines 427-429:

      “We performed mRNA sequencing of the murine lung tissue of infected and non-infected mice at 24 hpi to elucidate potential differentially expressed genes that contribute to the more severe illness of Tercko/ko mice.”

      (2) Telomere shortening affects differentially different cells and its role in aging is nuanced - different in mesenchymal cells with no telomerase induction, in non-replicating cells, and in hematopoietic cells that can readily induce telomerase. The authors should be mindful of that in setting up their introduction and discussion.

      Thank you for mentioning this essential aspect. We revised our introduction and discussion to reflect the nuanced role of telomere shortening in different tissues (lines 83-92 and 690-695):

      “Telomerase activity is restricted to specific tissues and cell types, largely dependent on the expression of Tert. While Tert is highly expressed in stem cells, progenitor cells, and germline cells, its expression is minimal in most differentiated cells (Chakravarti, LaBella, & DePinho, 2021). Consequently, the impact of telomerase dysfunction on tissues varies according to their self-renewal rate (Chakravarti et al., 2021). One important aspect of telomere dysfunction is the impact of telomere shortening on the immune system as well as the hematopoietic system. Tissues or organ systems that are highly replicative, such as the skin or the hematopoietic system, are affected first by telomere shortening (Chakravarti et al., 2021).”

      “It is important to note that telomere shortening has a significant impact on the immune system. Although young Tercko/ko mice were used in this study, telomere shortening is still likely to be a contributing factor. Therefore, further experiments investigating the role of T cell senescence in this model should be conducted.”

      (3) Syntax and formulations need to be improved and made more scientifically precise in several spots. Specifically, in 62-63, the authors say that the aged immune system "is also discussed to be more irritable", please change to reflect the common notion that the reaction to infection is dysregulated; in many cases inflammation itself is initially blunted, misdirected, and of different type (e.g. for viruses, the key IFN-I responses are not increased but decreased). In lines 114-117, presumably, the two sentences were supposed to be connected by a comma, although some editing for clarity is probably needed regardless. Line 252, please change "unspecific" to "non-specific". Line 264, please capitalize German.

      We thank the reviewer for bringing these important points to our attention. We revised our introduction regarding the aged immune response in lines 61-69:

      “Age-related dysregulation of the immune response is also characterized by inflammaging, defined as the presence of elevated levels of pro-inflammatory cytokines in the absence of an obvious inflammatory trigger (Franceschi et al., 2000; Mogilenko, Shchukina, & Artyomov, 2022). Additionally, immune cells, such as macrophages, exhibit an activated state that alters their response to infection (Canan et al., 2014). In contrast, the immune response of macrophages to infectious challenges has been shown to be initially impaired in aged mice (Boe, Boule, & Kovacs, 2017). Thus aging is a relevant factor impacting the pulmonary immune response.”

      Sentences were edited to provide more clarity in lines 131-134:

      “Although G3 Tercko/ko mice with shortened telomeres were used in this study, they were infected at a young age (8 weeks). This approach allowed for the investigation of Terc deletion effects rather than telomere dysfunction.”

      “Unspecific” was changed to “non-specific” in line 282 and “German” was capitalized in lines 293 and 558.

      We appreciate and thank you for your time spent processing this manuscript and look forward to your response.

      References

      De la Calle, C., Morata, L., Cobos-Trigueros, N., Martinez, J. A., Cardozo, C., Mensa, J., & Soriano, A. (2016). Staphylococcus aureus bacteremic pneumonia. European Journal of Clinical Microbiology & Infectious Diseases, 35(3), 497-502. https://doi.org/10.1007/s10096-015-2566-8  

      Dexiu, C., Xianying, L., Yingchun, H., & Jiafu, L. (2022). Advances in CD247. Scandinavian Journal of Immunology, 96(1), e13170. https://doi.org/10.1111/sji.13170  

      Herrera, E., Samper, E., Martín-Caballero, J., Flores, J. M., Lee, H. W., & Blasco, M. A. (1999). Disease states associated with telomerase deficiency appear earlier in mice with short telomeres. The EMBO Journal, 18(11), 2950-2960. https://doi.org/10.1093/emboj/18.11.2950  

      Hornung, F., Schulz, L., Köse-Vogel, N., Häder, A., Grießhammer, J., Wittschieber, D., Autsch, A., Ehrhardt, C., Mall, G., Löffler, B., & Deinhardt-Emmer, S. (2023). Thoracic adipose tissue contributes to severe virus infection of the lung. International Journal of Obesity, 47(11), 1088-1099. https://doi.org/10.1038/s41366-023-01362-w  

      Kang, Y., Zhang, H., Zhao, Y., Wang, Y., Wang, W., He, Y., Zhang, W., Zhang, W., Zhu, X., Zhou, Y., Zhang, L., Ju, Z., & Shi, L. (2018). Telomere Dysfunction Disturbs Macrophage Mitochondrial Metabolism and the NLRP3 Inflammasome through the PGC-1α/TNFAIP3 Axis. Cell Reports, 22(13), 3493-3506. https://doi.org/10.1016/j.celrep.2018.02.071  

      Khan, A. M., Babcock, A. A., Saeed, H., Myhre, C. L., Kassem, M., & Finsen, B. (2015). Telomere dysfunction reduces microglial numbers without fully inducing an aging phenotype. Neurobiology of Aging, 36(6), 2164-2175. https://doi.org/10.1016/j.neurobiolaging.2015.03.008  

      Lee, H.-W., Blasco, M. A., Gottlieb, G. J., Horner, J. W., Greider, C. W., & DePinho, R. A. (1998). Essential role of mouse telomerase in highly proliferative organs. Nature, 392(6676), 569-574. https://doi.org/10.1038/33345  

      Liu, H., Yang, Y., Ge, Y., Liu, J., & Zhao, Y. (2019). TERC promotes cellular inflammatory response independent of telomerase. Nucleic Acids Research, 47(15), 8084-8095. https://doi.org/10.1093/nar/gkz584  

      Matthe, D. M., Thoma, O. M., Sperka, T., Neurath, M. F., & Waldner, M. J. (2022). Telomerase deficiency reflects age-associated changes in CD4+ T cells. Immunity & Ageing, 19(1), 16. https://doi.org/10.1186/s12979-022-00273-0  

      Rudolph, K. L., Chang, S., Lee, H. W., Blasco, M., Gottlieb, G. J., Greider, C., & DePinho, R. A. (1999). Longevity, stress response, and cancer in aging telomerase-deficient mice. Cell, 96(5), 701-712. https://doi.org/10.1016/s0092-8674(00)80580-2  

      Tarry-Adkins, J. L., Aiken, C. E., Dearden, L., Fernandez-Twinn, D. S., & Ozanne, S. (2021). Exploring Telomere Dynamics in Aging Male Rat Tissues: Can Tissue-Specific Differences Contribute to Age-Associated Pathologies? Gerontology, 67(2), 233-242. https://doi.org/10.1159/000511608  

      Wong, L. S. M., Oeseburg, H., de Boer, R. A., van Gilst, W. H., van Veldhuisen, D. J., & van der Harst, P. (2008). Telomere biology in cardiovascular disease: the TERC−/− mouse as a model for heart failure and ageing. Cardiovascular Research, 81(2), 244-252. https://doi.org/10.1093/cvr/cvn337  

      Wu, S., Ge, Y., Lin, K., Liu, Q., Zhou, H., Hu, Q., Zhao, Y., He, W., & Ju, Z. (2022). Telomerase RNA TERC and the PI3K-AKT pathway form a positive feedback loop to regulate cell proliferation independent of telomerase activity. Nucleic Acids Research, 50(7), 3764-3776. https://doi.org/10.1093/nar/gkac179  

      Zhang, M. W., Zhao, P., Yung, W. H., Sheng, Y., Ke, Y., & Qian, Z. M. (2018). Tissue iron is negatively correlated with TERC or TERT mRNA expression: A heterochronic parabiosis study in mice. Aging (Albany NY), 10(12), 3834-3850. https://doi.org/10.18632/aging.101676