10,000 Matching Annotations
  1. Last 7 days
    1. Handling type-occurrence relationships

      This already exists in IFC, where a property can be overridden by the instance, while instances inherit properties from the type.

    2. Simplified geometry

      How can IFC support distributed authoring and collaboration, if it doesn't support the wide range of geometry representations used in the industry?

    3. Entity Component System (ECS)

      This is what IFC as STEP already IS! The entity itself is merely an ID, name and description. Everything else is referenced directly (geometry, placement) or indirectly (all the relationships).
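      The entity-component reading of IFC described here can be illustrated with a minimal sketch (not an actual IFC API; all names and the GUID are illustrative): the entity carries only identity, while geometry, placement and relationships live in separate stores keyed by the entity's ID.

```python
from dataclasses import dataclass

@dataclass
class Entity:
    """The 'entity' itself: only an ID plus name and description."""
    id: str
    name: str = ""
    description: str = ""

# Components and relationships are separate stores that reference the entity ID
placements: dict[str, tuple[float, float, float]] = {}
geometries: dict[str, str] = {}
contained_in: dict[str, str] = {}  # relationship: element -> spatial container

wall = Entity("wall-001-guid", name="Wall-001")
placements[wall.id] = (0.0, 0.0, 0.0)
geometries[wall.id] = "ExtrudedAreaSolid"
contained_in[wall.id] = "GroundFloor"
```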

    4. restricted adoption

      For sure, there was a REASON for the restricted adoption. It is usually better practice to follow successful and widely adopted standards, rather than those proved not usable or useful.

    5. The lack of flexibility in the current schema hinders efficiency and can lead to errors in the design and construction process.

      A fixed schema is crucial to keep the data exchange reliable. A flexible schema means a low level of implementation confidence.

    6. distributed authorship in construction projects

      This is the case for the native authoring tools. Is this the case for IFC? Do we envisage the existence of a master IFC model? IFC5 also aims to simplify the geometry definitions. Will this distributed system just work with meshed geometries?

    7. Incorporating new concepts or relationships often requires significant changes

      As part of the standard, adding new entity attributes at the end doesn't break the schema's backwards compatibility. In the other direction, making significant changes to any contract or any modelling language will cause breaking changes for those using the contract.

    8. The monolithic structure and advanced modeling techniques used make the schema difficult to manage and extend.

      This is caused by the way buildingSMART has been using STEP. STEP has native support for modules; buildingSMART made a deliberate choice not to use them in the past.

    1. People who estimate too high often feel they don’t have enough time. They may have time anxiety and often feel frustrated

      So, basically, people who assume (and possibly use it as an excuse) that they never have time?

    1. Until we gain the ballot and place proper public officials in office this condition will continue to exist. In communities where we confront difficulties in gaining the ballot, we must use all legal and moral means to remove these difficulties.

      We still see a struggle for this today in the form of gerrymandering. Even though minorities have the right to vote, districts are still rigged to put people who do not represent their population in office. Texas is particularly bad about this. We could also connect this to challenges against the voting systems this past election cycle as well as efforts to cut down on mail-in ballots. The laws might have changed, but efforts to keep people from the ballot are still going strong.

    2. Now we are faced with the challenge of making it spiritually one. Through our scientific genius we have made of the world a neighborhood; now through our moral and spiritual genius we must make of it a brotherhood. We are all involved in the single process. Whatever affects one directly affects all indirectly. We are all links in the great chain of humanity.

      King's call to transform a "neighborhood" into a "brotherhood" is a call to close the gap between what we can do and what we should do. It means valuing connection not just as a function of convenience, but as a commitment to mutual respect and shared responsibility.

      Without that, we risk building a world that’s highly advanced, but deeply broken. A world where we can send messages across the planet, but still fail to hear each other.

    3. The new world is a world of geographical togetherness. This means that no individual or nation can live alone.

      This is even more true today with the introduction of social media. Whereas before you would have to wait for the news or a letter from your friend to know what is going on in another part of the world, now we have instant connection to people in many different countries. While this is great for raising awareness, it can also be a bit overwhelming to have access to so much information.

    4. It was in this year that the Supreme Court of this nation, through the Plessy v. Ferguson Decision, established the doctrine of separate-but-equal as the law of the land. Through this decision segregation gained legal and moral sanction. The end results of the Plessy Doctrine was that it lead to a strict enforcement of the “separate,” with hardly the slightest attempt to abide by the “equal.” So the Plessy Doctrine ended up making for tragic inequalities and ungodly exploitation.

      By creating the doctrine of "separate but equal," the Supreme Court gave both legal and moral backing to racial segregation. In practice, however, this ruling focused entirely on keeping races separate, with little to no attempt to guarantee actual equality. As a result, African Americans faced harsh unequal conditions in education, public facilities, and nearly all aspects of daily life. The promise of equality was abandoned, allowing a system of deep injustice, exploitation, and moral wrongdoing to take hold.

    5. Let nobody fool you, all of the loud noises that you hear today from the legislative halls of the South in terms of “interposition” and “nullification,” and of outlawing the NAACP, are merely the death groans from a dying system. The old order is passing away, and the new order is coming into being. We are witnessing in our day the birth of a new age, with a new structure of freedom and justice.

      I appreciate how optimistic King is throughout this address, but, thinking of how our current administration is operating, it's difficult to interpret those actions as the death of the old order. It seems more like the reintroduction.

    6. True peace is not merely the absence of some negative force—tension, confusion, or war; it is the presence of some positive force—justice, goodwill and brotherhood. And so the peace which presently existed between the races was a negative peace devoid of any positive and lasting quality.

      I do like this interpretation of peace being an active, positive force rather than a neutral lack. I feel that most times "keeping the peace" means to not advocate for oneself or, in the case of police response to protests, is actively violent.

    1. ’Tis unmanly grief. It shows a will most incorrect to heaven, A heart unfortified, ⟨a⟩ mind impatient, An understanding simple and unschooled. For what we know must be and is as common As any the most vulgar thing to sense, Why should we in our peevish opposition Take it to heart? Fie, ’tis a fault to heaven, A fault against the dead, a fault to nature, To reason most absurd, whose common theme Is death of fathers, and who still hath cried, From the first corse till he that died today, “This must be so.” We pray you, throw to earth This unprevailing woe and think of us As of a father; for let the world take note, You are the most immediate to our throne, And with no less nobility of love Than that which dearest father bears his son Do I impart toward you

      This passage was interesting because of the king's attitude towards his nephew. It is very clear that the king has no empathy. Yet even though their relationship is quite strained at this moment of the story, with no open conflict just yet, the king still gave advice. Sure, untimely, out-of-touch advice, but advice nonetheless. He could simply disengage or brush Hamlet off, but didn't. It makes me wonder whether there was just a little part of him, deep inside, that still cared for Hamlet.

    1. The two billion-dollar movies Terry Gilliam can’t stand: “It’s utter bullshit”

      "Black Panther" and "Harry Potter and the Philosopher’s Stone" are the films Gilliam can't stand. The former because it looks silly and inauthentic, the latter because he should've been the director for it, not Chris Columbus.

    1. The capacities to think critically, communicate clearly, and solve complex problems have always been markers of college-educated people; all of these depend on skills of reflection. The results of a 2021 national survey of employers by the Association of American Colleges and Universities found that critical thinking and “encouraging students to think for themselves

      These are all skills developed through liberal education

    2. Liberal education begins with curiosity. When we start asking “why?” and “how does this matter in real life?” we are beginning the process of liberal education.

      These are emerging questions

    1. eLife Assessment

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The revised manuscript presents convincing evidence that the location of synapses on dendritic branches, as well as synaptic plasticity of excitatory and inhibitory synapses, influences the ability of a neuron to discriminate combinations of sensory stimuli. The ideas in this work are very interesting, presenting an important direction in the computational neuroscience field about how to harness the computational power of "active dendrites" for solving learning tasks.

    2. Reviewer #1 (Public review):

      Summary:

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." In the absence of inhibitory plasticity, the proposed mechanisms result in good performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Interestingly, adding inhibitory plasticity improves classification performance even when input features are randomly distributed.

      Strengths:

      The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.

      Weaknesses:

      The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation.

    3. Reviewer #2 (Public review):

      Summary:

      The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure stability for synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanical insights into complex computations at the single-neuron level.

      Strengths:

      The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary:

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." In the absence of inhibitory plasticity, the proposed mechanisms result in good performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Interestingly, adding inhibitory plasticity improves classification performance even when input features are randomly distributed.

      Strengths:

      The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.

      Weaknesses:

      The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation.

      We greatly appreciate your recognition of the study’s integrative scope and the challenges of linking detailed biophysics to high-level computation. We acknowledge that the model’s complexity can obscure the contribution of individual components. However, as stated in the introduction, these principles have already been shown in simplified theoretical models, for instance in Tran-Van-Minh et al. 2015. Our aim here was to extend those ideas into a more biologically detailed setting to test whether the same principles still hold under realistic constraints. While simplification can aid intuition, we believe that demonstrating these effects in a biophysically grounded model strengthens the overall conclusion. We agree that further comparisons with reduced models would be valuable for isolating the contribution of specific components and plan to explore that in future work.

      Reviewer #2 (Public review):

      Summary:

      The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure stability for synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanical insights into complex computations at the single-neuron level.

      Strengths:

      The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.

      Thank you for highlighting the biological plausibility of our calcium- and dopamine-dependent learning rule and its ability to exploit dendritic nonlinearities. Your positive assessment reinforces our commitment to refining the rule and exploring its implications in larger, more diverse settings.

      Reviewer #1 (Recommendations for the authors):

      Major recommendations:

      P9: When introducing the excitatory learning rule, the reader is referred to the Methods. I suggest moving Figure 7A-D, "Excitatory plasticity" to be more prominently presented in the main body of the paper where the reader needs to understand it. There are errors in the current Figure 7, and wrong/confusing acronyms. The abbreviations "LTP-K" and "MP-K" are not intuitive. In A, I would spell out "LTP kernel" and "Theta_LTP adaptation".  In B, I would spell out "LTD kernel" and "Theta_LTD adaptation".

      We have clarified the terminology in Figure 7 by replacing “LTP-K” with “LTP kernel” and “MP-K” with “metaplasticity kernel”.  While we kept Figure 7 in the Methods section to maintain the flow of the main text, we agree that an earlier introduction of the learning rule improves clarity. To that end, we added a simplified schematic to Figure 3 in the Results section, which provides readers with an accessible overview of the excitatory plasticity mechanism at the point where it is first introduced.

      In C, for simplicity and clarity, I would only show the initial and updated LTP kernel and Calcium and remove the Theta_LTP adaptation curve, it's too busy and not necessary. Similarly in D, I would show only the initial and updated LTD kernel and Calcium and remove the Theta_LTD adaptation curve. In the current version of the Figure, panel B, right incorrectly labels "Theta_LTD" as "Theta_LTP". Panel D incorrectly labels "LTD kernel" as "LTP/MP-K" in the subheading and "MP/LTP-K" in the graph.

      To avoid confusion and better illustrate the interactions between calcium signals, kernels, and thresholds, we have added a movie showing how these components evolve during learning. The figure panels remain as originally designed, since the LTP kernel governs both potentiation and depression through metaplastic threshold adaptation, while the LTD kernel remains fixed.

      P17: Again, instead of pointing the reader to the Methods, I would move Figure 7E, "Inhibitory plasticity" to the main body of the paper where the reader needs to understand it. For clarity, I would label "C_TL" and "Theta_Inh,low" and "C_TH" as "Theta_Inh,high". The right panel could be better labeled "Inhibitory plasticity kernel". The left panel could be better labeled "Theta_Inh adaptation", with again replacing the acronyms "C_TL" and "C_TH". The same applies to Fig. 5D on P19.

      We have updated the labeling in Figures 5D and 7E for clarity, including replacing "C_TL" and "C_TH" with "Theta_Inh,low" and "Theta_Inh,high". In addition, we added a simplified schematic of the inhibitory plasticity rule to Figure 5 to assist the reader’s understanding when presenting the results. Figure 7E remains in the Methods section to preserve the flow of the main text.

      P12: I would suggest simplifying Fig. 3 panels and acronyms as well. Remove "MP-K" from C and D. Relabel "LTP-K" as "LTP kernel". The same applies to Fig. 5E on P19 and Fig. 3 - supplement 1 on P46 and Fig 6 - supplement 1 on P49.

      We have simplified the labeling across all relevant figures by replacing “MP-K” with “metaplasticity kernel” and “LTP-K” with “LTP kernel.” To maintain clarity, we retained these terms in only one panel as a reference.

      Minor recommendations:

      P4: "Although not discussed much in more theoretical work, our study demonstrates the necessity of metaplasticity for achieving stable and physiologically realistic synaptic weights." This sentence is jarring. BCM and metaplasticity has been discussed in hundreds of theory papers! Cite some. This sentence would more accurately read, "Our study corroborates prior theory work (citations) demonstrating that metaplasticity helps to achieve stable and physiologically realistic synaptic weights."

      We have followed the reviewer's suggestion and updated the sentence to: "Previous theoretical studies (Bienenstock et al., 1982; Fusi et al., 2005; Clopath et al., 2010; Benna & Fusi, 2016; Zenke & Gerstner, 2017) demonstrate the essential role of metaplasticity in maintaining stability in synaptic weight distributions." (page 2, lines 49-51; page 3, line 1)

      P9: Grammar. "The neuron model was during training activated..." should read "During training, the neuron model was activated..."

      Corrected

      P17: Lovett-Barron et al., 2012 is appropriately cited here. Milstein et al., Neuron, 2015 also showed dendritic inhibition regulates plateau potentials in CA1 pyramidal cells in vitro, and Grienberger et al., Nat. Neurosci., 2017 showed it in vivo.

      P19 vs P16 vs P21. Fig. 4B, Fig. 5B, and Fig. 6B choose different strategies to show variance across seeds. Please choose one strategy and apply to all comparable plots.

      We thank the reviewer for these helpful points.

      We have added the suggested citations (Milstein et al., 2015; Grienberger et al., 2017) alongside Lovett-Barron et al., 2012. 

      Variance across seeds is now displayed uniformly (mean as a solid line, STD as a shaded area) in Figures 4B, 5B, and 6B.

      Reviewer #2 (Recommendations for the authors):

      Major Points:

      (1)  Quality of Scientific Writing:

      i. Mathematical and Implementation Details:

      I appreciate the authors' efforts in clarifying the mathematical details and providing pseudocode for the learning rule, significantly improving readability and reproducibility. The reference to existing models via GitHub and ModelDB repositories is acceptable. However, I suggest enhancing the presentation quality of equations within the Methods section-currently, they are low-resolution images. Please consider rewriting these equations using LaTeX or replacing them with high-resolution images to further improve clarity.

      We appreciate the reviewer’s comment regarding clarity and reproducibility. In response, we have rewritten all equations in LaTeX to improve their readability and presentation quality in the Methods section.

      ii. Figure quality.

      I acknowledge the authors' effort to improve figure clarity and consistency throughout the manuscript. However, I notice that the x-axis label "[Ca]_v (μm)" in Fig. 7E still appears compressed and unclear. Additionally, given the complexity and abundance of hyperparameters or artificial settings involved in your experimental design and learning rule (such as kernel parameters, metaplasticity kernels, and unspecific features), the current arrangement of subfigures (particularly Fig. 3C, D and Fig. 5D, E) still poses readability challenges. I recommend reordering subfigures to present primary results (e.g., performance outcomes) prominently upfront, while relegating visualizations of detailed hyperparameter manipulations or feature weight variations to later sections or the discussion, thus enhancing clarity for readers.

      We thank the reviewer for pointing out the readability issue and have corrected the x-axis label in Figure 7D. We hope the new layout, with a simplified rule in Fig. 3 and Fig. 5, presents the key findings upfront while retaining full mechanistic detail, making the model behavior easier to understand.

      iii. Writing clarity.

      The authors have streamlined the "Metaplasticity" section and reduced references to dopamine, which is a positive step. However, the broader issue remains: the manuscript still appears overly detailed and more like a technical report of a novel learning rule, rather than a clearly structured scientific paper. I strongly recommend that the authors further distill the manuscript by clearly focusing on one or two central scientific questions or hypotheses-for instance, emphasizing core insights such as "inhibitory inputs facilitate nonlinear dendritic computations" or "distal dendritic inputs significantly contribute to nonlinear integration." Clarifying and highlighting these primary scientific questions early and consistently throughout the manuscript would substantially enhance readability and impact.

      We appreciate the reviewer’s guidance on improving the manuscript’s clarity and focus. In response, we now highlight two central questions at the end of the Introduction and have retitled the main Results subsections to follow this thread, thereby sharpening the manuscript’s focus while retaining necessary technical detail (page 3, lines 20-28). We have also removed redundant passages and simplified technical details to improve overall readability.

      Minor:

      (1) The [Ca]NMDA in Figure 2A and 2C can have large values even when very few synapses are activated. Why is that? Is this setting biologically realistic?

      The authors acknowledge that their simulated [Ca²⁺] levels exceed typical biological measurements but claim that the learning rule remains robust across variations in calcium concentrations. However, robustness to calcium variations was not explicitly demonstrated in the main figures. To convincingly address this concern, I recommend the authors explicitly test and present whether adopting biologically realistic calcium concentrations (~1 μM) impacts the learning outcomes or synaptic weight dynamics. Clarifying this point with a supplemental analysis or an additional figure panel would significantly strengthen their argument regarding the model's biological plausibility and robustness.

      We thank the reviewer for the comment. The elevated [Ca²⁺]NMDA values reflect localized transients in spine heads with narrow necks and high NMDA conductance. These values are not problematic for our model, as the plasticity rule depends on relative calcium differences rather than absolute levels, and the metaplasticity kernel will adjust. In future versions of our detailed neuron model, we will likely decrease the axial resistance of the spine neck.

    1. point sampling is unique because it allows you to match information collection effort with the desired level of inference. Under point sampling, the minimum data collection effort is called a continuous tally, which means a count of measurement trees is kept across the n sampling locations (no additional information is recorded—not even how many measurement trees were observed at each sampling location). At the end of a continuous tally cruise, you have the total number of measurement trees m, which is used to compute the mean basal area per unit area estimate as

      In fact there is no difference to other plot designs; you can do exactly the same with fixed-area or nested plots... If you calculate the expansion factor per tree and expand e.g. tree basal area to one hectare, then you can sum this over all of your trees (from multiple plots) and divide by n. Same result! Sum(y_i) (the plot aggregate) is here equal to Sum(y_ij) (the sum over trees). Only that you need no expansion factor here, since you are already counting on a per-ha basis.
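      The equivalence the comment claims can be sketched numerically. The counts and BAF below are hypothetical; the point is that the plot-aggregate route and the per-tree route give the same basal-area-per-hectare estimate, BAF × m / n.

```python
# Counts of 'in' trees at each of n point-sampling locations (hypothetical data)
counts = [4, 7, 0, 5, 3]
BAF = 2.0  # basal area factor: each counted tree represents BAF m²/ha

n = len(counts)
m = sum(counts)  # total continuous tally over all locations

# Plot-level route: expand each location's tally to m²/ha, then average
per_location = [c * BAF for c in counts]
estimate_plot = sum(per_location) / n

# Tree-level route: every counted tree contributes BAF m²/ha,
# summed over all trees from all locations and divided by n
estimate_tree = m * BAF / n  # identical result: BAF * m / n
```

      Both routes yield 7.6 m²/ha here, illustrating that the continuous tally is just the per-tree expansion written compactly.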

    2. the constant k in feet is

      I am confused: is this only for counting factor k=1? For measurement in meters and cm, the ratios for k=1, 2, 4 are 1:50, 1:34.5 and 1:25 respectively (both in same units). That means, using k=1 (every counted tree is 1 m²/ha), a tree with a dbh of 35 cm (0.35 m) has a maximum distance of 0.35*50 = 17.5 meters. SORRY, just realized that k is not the counting factor (as used here in Germany). Our "k" is your BAF, and what you notate as k is what we call c...
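      The metric relationship behind the comment's 0.35 × 50 = 17.5 m example can be sketched as follows. This is the standard metric form, where the distance factor is 50/√BAF; the function name is illustrative.

```python
import math

def limiting_distance_m(dbh_cm: float, baf: float) -> float:
    """Maximum horizontal distance (m) at which a tree of the given DBH (cm)
    is counted 'in' under a metric basal area factor BAF (m²/ha).
    Distance factor = 50 / sqrt(BAF), so BAF 1 and 4 give factors 50 and 25."""
    return (dbh_cm / 100.0) * 50.0 / math.sqrt(baf)

# A 35 cm tree with BAF 1 is 'in' up to 0.35 * 50 = 17.5 m
d1 = limiting_distance_m(35.0, 1.0)  # 17.5
d4 = limiting_distance_m(35.0, 4.0)  # 8.75
```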

    3. so it’s worth the extra time to conduct the limiting distance calculation

      Very right! It is usually doubtless for many trees that are definitely IN; for those close to the border it is difficult to guess, and a distance measurement is indicated...

    4. its probability of selection is proportional to its DBH

      No, it's proportional to its basal area (it is the radius of the inclusion circle that is proportional to dbh, so the circle area is proportional to dbh² or basal area).

    5. 3⋅TF

      If you leave out the 3⋅TF it becomes more general: trees per acre is just the sum of the tree expansion factors. Then it can also be used for unequal-probability designs in which expansion factors might vary with tree size.
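      The "trees per acre as a sum of tree expansion factors" idea can be sketched as below, assuming the standard English-unit basal area formula BA = 0.005454·DBH² (ft², DBH in inches); the BAF and the DBH list are hypothetical.

```python
BAF = 20.0  # English basal area factor, ft²/acre per counted tree (hypothetical)

def tree_factor(dbh_in: float) -> float:
    """Expansion factor: trees per acre represented by one counted tree.
    Under point sampling, TF = BAF / BA, with BA = 0.005454 * DBH² (ft²)."""
    return BAF / (0.005454 * dbh_in ** 2)

# DBHs (inches) of the counted trees at one sample point (hypothetical)
dbhs = [10.0, 14.0, 8.0]

# Trees per acre is just the sum of the per-tree expansion factors,
# which also covers designs where the factor varies with tree size
trees_per_acre = sum(tree_factor(d) for d in dbhs)
```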

    6. Under the plot sampling rule, all trees in the population have equal probability of being measured.

      Only for fixed-area plots (rather uncommon). For nested plots or Bitterlich sampling this is not the case.

    7. for each location

      "For each location" would only hold for fixed-area plots. As soon as an unequal-probability design is used (nested plots, Bitterlich), we would expand at the tree level and aggregate to the plot level afterwards.

    1. i'm not sure if this is blackpill or whitepill, but there are a heap of new papers along with my own experiences that are showing "best of N is all you need" for most problems as long as:

       - sufficient core knowledge was included in the training data
       - the model is sufficiently large / you use more than 1 model to promote reasonable idea diversity
       - N is sufficiently large for the complexity of the problem at hand
       - you have some reasonable discrimination process at the end to determine / approximate the "best" result

       we really haven't come close to leveraging the full potential of existing models, and the antiquated sampling process / approaches are the single biggest culprit

      what if we include some sampling distribution on the output side?
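      The best-of-N recipe the note describes can be sketched generically. The generator and discriminator below are toy stand-ins (a real setup would call a model and a verifier); all names are illustrative.

```python
import random

def best_of_n(generate, score, n: int):
    """Draw n candidate solutions and keep the one the discriminator
    ranks highest — the 'best of N' recipe from the note above."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins: the 'model' proposes numbers, the discriminator
# prefers values close to a target of 42
random.seed(0)
generate = lambda: random.uniform(0.0, 100.0)
score = lambda x: -abs(x - 42.0)
best = best_of_n(generate, score, n=64)
```

      With a large enough N and a reliable `score`, the selected candidate approaches the best the generator can produce, which is the core of the argument that sampling strategy, not model capacity, is the bottleneck.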

    1. In this preprint, the authors describe and test a method for calculating which articles were covered by transformative agreements. Using a test case of publications likely covered by transformative agreements in the Netherlands, the authors find that their method works well, correctly identifying 89% of the articles in the sample. The preprint was reviewed by two metaresearchers deeply involved in studying transformative agreements. Both commend the paper for its clarity, methodological transparency, and timely contribution to ongoing discussions about transformative agreements. They highlight the paper’s practical value, particularly the Dutch case study’s validation using national research information, and its epistemically modest discussion of the method’s capabilities and limitations. Key suggestions for improvement include clarifying aspects of the methodology, presenting key results earlier in the text, and enhancing the clarity and interpretability of the figures.

    2. This article presents an open method for tracking journal articles under transformative agreements using open metadata. The authors apply their approach to the Dutch context as a case study, including validation against national research information. The well-written paper usefully highlights, in an accessible and easy-to-understand manner, how both researchers and practitioners evaluating this open access licensing model can navigate data gaps. By demonstrating that estimating publications under transformative agreements requires combining multiple data sources, the authors offer practical methodological insights for those interested in this prevalent licensing model but uncertain about data sources and their limitations. They also highlight the progress made in increasing the transparency of transformative agreements, which has often been lacking in previous subscription agreements between libraries and publishers.

      In my view, the key contribution of this study is the validation using Dutch research information. This allows the authors to show that while many articles could be matched, there are shortcomings that do not reflect weaknesses in the open method itself, but rather the current state of the data infrastructure for transparency around transformative agreements and open access. Particularly noteworthy is the finding that there are challenges in using corresponding author information as a proxy to delineate open access funding. While it is already known that open metadata (here OpenAlex) on corresponding authors is not as complete as in proprietary databases, the validation also reveals that even when corresponding author data are available, issues can arise, particularly with multiple affiliations. The availability of funding information faces similar limitations.

      This leads me to wonder whether the complex structure of transformative agreements should warrant a broader discussion of monitoring based on the findings of this work. Of course, full disclosure of open access invoicing through a community-owned open data service would help assessment, but it nevertheless makes comparisons between publishers and countries difficult. Examples that can hardly be controlled through open metadata about publications, but would instead require a thorough analysis of the contracts themselves, have been extensively explored by the authors in their qualitative analysis: authors can decide whether or not to publish open access, agreements can be capped, not all article types are eligible, there is a time lag between submission and publication, and the set of participating institutions can change. Funding contexts add to this complexity.

      Given this complexity, I wonder whether a focus on ESAC to disclose articles enabled by transformative agreements, which is a community effort run solely by the Max Planck Digital Library, is sufficient. Perhaps the authors can speculate on the role of existing national infrastructures and workflows around subscription-based publishing in libraries (serial cataloguing and license management)? Can they be transformed to increase the transparency and thus the accountability of this licensing model, or are these infrastructure services no longer needed in favour of international open metadata initiatives that have been set up together with transformative agreements? Another consideration might be the role of discovery services such as Unpaywall/OpenAlex or OpenAIRE. I think the paper provides a very good overview of these different actors from a data perspective, but the case study would benefit from a discussion of how the different actors involved, particularly in the Dutch context, could work together to achieve more streamlined monitoring through the combination of data services and standardised agreements, as much of the data seems to already exist internally.

      Apart from this, I have two other considerations:

      I suggest that the results section could benefit from earlier mention of the number and proportion of articles that could not be matched. Although these are effectively summarised in the conclusion (last paragraph, page 12), incorporating this information earlier would improve the presentation of findings. I consider the identification of publications missed by the open method due to limitations in the availability of corresponding author data and funding information to be an essential outcome of this research.

      Regarding methodology, I had some difficulty understanding where the disambiguation of ISSN variants took place. The text indicates that this information was obtained from the JCT ("The data from the Journal Checker tool is exposed through a publicly available API. It used ISSN (more precisely ISSN-L) to identify journals and RoR-IDs to identify institutions", page 6). However, to my knowledge, ISSN-L retrieval is not supported by the JCT API? Upon examination of the code, it appears that ISSN linking to ISSN-L may have been established using Unpaywall data, while Figure 2 refers to Crossref in this context.

      In summary, I would like to congratulate the authors on this important contribution and recommend that all those concerned with open access business models, and those involved in improving the evidence base for transformative agreements, read this important work and adopt the open method presented.

    3. Through a case study analyzing research publications supported by funding from the Dutch Research Council (NWO), this manuscript provides a concise description and assessment of a methodology based on utilizing open data sources for identifying which funded publications have been made available within so-called transformative open access agreements. The research gap that is addressed is a relevant and interesting one, as exact and measurable aspects of transformative agreements are still scarce despite the massive financial investments made into them and the breadth of research outputs that are impacted by them.

      The introduction and literature review are good and appropriate at framing the context of the study and provide a thorough positioning of this study in relation to previous work in this area. I especially appreciate that the authors have included and given credit to the recently published study by Jahn (2025), and provide a clear argument for how the two studies are within the same topic area but come with different contributions.

      The methods section provides a transparent description of the workflow utilized for the study, the data collection and analysis requires a few different steps and working with data on different levels (agreement, journal, article) but the provided narrative provides sufficient detail for the reader to follow the process.

      The results section is the only area where I see some room for improvement in terms of presentation; the results themselves are valid and in my view interpreted correctly. Figure 3 is a central visualization of the results of the project, but it is hard to interpret independently from the text, and even after reading the text some aspects remain unclear. I would suggest considering the following changes: 1) make the Venn areas proportional, as the areas are currently not sized in relation to the data they represent (use e.g. https://www.deepvenn.com/, https://www.meta-chart.com/venn or a similar tool), 2) insert data labels and legends so that one can get a grip of what the different areas and colors of the visualization mean without reading the text, 3) revise the text that refers to the figure so that each part of the figure is mentioned in consecutive order, following a predictable structure. Currently only some of the lettered areas are mentioned, and those which are not mentioned require quite a lot of effort from the reader to figure out.

      The discussion and conclusions are good and I like that the authors are not over-selling the contribution in any way, rather being very realistic in what this method can and cannot achieve.

      Overall I think this is a strong paper that provides a valuable and focused contribution to the intersecting area of bibliometric research and science policy, which still contains many unanswered questions due to the lack of comprehensive data; this study adds one more piece to the puzzle.

    4. We would like to thank Najko Jahn and Mikael Laakso for their very positive and thoughtful reviews, which significantly improved our article. In response to the reviewers' specific comments, we have corrected all identified errors and made a number of improvements, in particular to the presentation of results (Figure 3) and through a more comprehensive discussion of the role of national and international infrastructure providers in unlocking article-level metadata on transformative agreements. These are the changes we made in response to the reviewers’ comments:

      Response to Reviewer #1: Najko Jahn

      We agree that a broader discussion of transformative agreement complexity would be valuable, and this paper provides important insights—particularly regarding author opt-out possibilities, contract caps, and exclusion of non-research articles. However, we believe such discussion extends beyond the scope of this paper which we have meant primarily as a study to validate a method for tracking / analysing transformative agreements using open data.

      Following the reviewer's advice, we have updated the concluding section to include a discussion of how various national and international infrastructures could facilitate more open availability of article-level transformative agreement data.

      We have implemented the reviewer's suggestion to present main results earlier in the paper. The key findings regarding matched and unmatched articles are now introduced on pages 3-4.

      The reviewer correctly identified our imprecise description of journal identification methodology. We have corrected both the figure and text to accurately reflect our use of ISSN (obtained from Crossref) rather than ISSN-L.

      Response to Reviewer #2: Mikael Laakso

      We thank the reviewer for their insightful feedback. We have implemented all suggested improvements to our results presentation, enhancing the Venn diagram in Figure 3 by: 1) making areas proportional to data, 2) improving label clarity, 3) providing clearer caption explanations, and 4) revising the accompanying text to follow the figure's label sequence.

      The revised version of the article is available here.

    1. the film we’re talking about is a much-forgotten and even less loved Stephen King adaptation named Sleepwalkers from 1992

      Sleepwalkers (1992) is the film Mark Hamill forgot he appeared in. It was a tiny cameo.

    1. “Amrish is my favourite villain,” Spielberg once gushed about Amrish Puri, who played the infamous heart-ripping voodoo priest Mola Ram in Indiana Jones and the Temple of Doom. “The best the world has ever produced and ever will.”

      Amrish Puri is Steven Spielberg’s favorite bad guy actor.

    1. eLife Assessment

      This important computational study investigates homeostatic plasticity mechanisms that neurons may employ to achieve and maintain stable target activity patterns. The work extends previous analyses of calcium-dependent homeostatic mechanisms based on ion channel density by considering activity-dependent shifts in channel activation and inactivation properties that operate on faster and potentially variable timescales. The model simulations convincingly demonstrate the potential functional importance of these mechanisms.

    2. Reviewer #1 (Public review):

      This revision of the computational study by Mondal et al addresses several issues that I raised in the previous round of reviews and, as such, is greatly improved. The manuscript is more readable, its findings are more clearly described, and both the introduction and the discussion sections are tighter and more to the point. And thank you for addressing the three timescales of half activation/inactivation parameters. It makes the mechanism clearer.

      Some issues remain that I bring up below.

      Comment:

      I still have a bone to pick with the claim that "activity-dependent changes in channel voltage-dependence alone are insufficient to attain bursting". As I mentioned in my previous comment, this is also the case for the gmax values (channel density). If you choose the gmax's to be in a reasonable range, then the statement above simply cannot be true. And if, in contrast, you choose the activation/inactivation parameters to be unreasonable, then no set of gmax's can produce proper activity. So I remain baffled as to what exactly the point is that the authors are trying to make.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Mondal and co-authors present the development of a computational model of homeostatic plasticity incorporating activity-dependent regulation of gating properties (activation, inactivation) of ion channels. The authors show that, similar to what has been observed for activity-dependent regulation of ion channel conductances, implementing activity-dependent regulation of voltage sensitivity participates in the achievement of a target phenotype (bursting or spiking). The results, however, suggest that activity-dependent regulation of voltage sensitivity is not sufficient on its own and needs to be associated with the regulation of ion channel conductances in order to reliably reach the target phenotype. Although the implementation of this biologically relevant phenomenon is undeniably valuable, a few important questions are left unanswered.

      Strengths:

      (1) Implementing activity-dependent regulation of gating properties of ion channels is biologically relevant.

      (2) The modeling work appears to be well performed and provides results that are consistent with previous work performed by the same group.

      Weaknesses:

      (1) The main question not addressed in the paper is the relative efficiency and/or participation of voltage-dependence regulation compared to channel conductance in achieving the expected pattern of activity. Is voltage-dependence participating at 50% or 10%? Although this is a difficult question to answer (and it might even be difficult to provide a number), it is important to determine whether channel conductance regulation remains the main parameter allowing the achievement of a precise pattern of activity (or its recovery after perturbation).

      (2) Another related question is whether the speed of recovery is significantly modified by implementing voltage-dependence regulation (it seems to be the case looking at Figure 3). More generally, I believe it would be important to give insights into the overall benefit of implementing voltage-dependence regulation, beyond its rather obvious biological relevance.

      (3) Along the same line, the conclusion about how voltage-dependence regulation and channel conductance regulation interact to provide the neuron with the expected activity pattern (summarized and illustrated in Figure 6) is rather qualitative. Consistent with my previous comments, one would expect some quantitative answers to this question, rather than an illustration that approximately places a solution in parameter space.

    4. Reviewer #3 (Public review):

      Mondal et al. use computational modeling to investigate how activity-dependent shifts in voltage-dependent (in)activation curves can complement changes in ion channel conductance to support homeostatic plasticity. While it is well established that the voltage-dependent properties of ion channels influence neuronal excitability, their potential role in homeostatic regulation, alongside conductance changes, has remained largely unexplored. The results presented here demonstrate that activity-dependent regulation of voltage dependence can interact with conductance plasticity to enable neurons to attain and maintain target activity patterns, in this case, intrinsic bursting. Notably, the timescale of these voltage-dependent shifts influences the final steady-state configuration of the model, shaping both channel parameters and activity features such as burst period and duration. A major conclusion of the study is that altering this timescale can seamlessly modulate a neuron's intrinsic properties, which the authors suggest may be a mechanism for adaptation to perturbations.

      While this conclusion is largely well-supported, additional analyses could help clarify its scope. For instance, the effects of timescale alterations are clearly demonstrated when the model transitions from an initial state that does not meet the target activity pattern to a new stable state. However, Fig. 6 and the accompanying discussion appear to suggest that changing the timescale alone is sufficient to shift neuronal activity more generally. It would be helpful to clarify that this effect primarily applies during periods of adaptation, such as neurodevelopment or in response to perturbations, and not necessarily once the system has reached a stable, steady state. As currently presented, the simulations do not test whether modifying the timescale can influence activity after the model has stabilized. In such conditions, changes in timescale are unlikely to affect network dynamics unless they somehow alter the stability of the solution, which is not shown here. That said, it seems plausible that real neurons experience ongoing small perturbations which, in conjunction with changes in timescale, could allow gradual shifts toward new solutions. This possibility is not discussed but could be a fruitful direction for future work.

      Editor's note: The authors have adequately addressed the concerns raised in the public reviews above, as well as the previous recommendations, and revised the manuscript where necessary.

    5. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      I still have a bone to pick with the claim that "activity-dependent changes in channel voltage-dependence alone are insufficient to attain bursting". As I mentioned in my previous comment, this is also the case for the gmax values (channel density). If you choose the gmax's to be in a reasonable range, then the statement above simply cannot be true. And if, in contrast, you choose the activation/inactivation parameters to be unreasonable, then no set of gmax's can produce proper activity. So I remain baffled as to what exactly the point is that the authors are trying to make.

      We thank the reviewer for this clarification. We did not intend to imply that voltage-dependence modulation is universally incapable of supporting bursting or that conductance changes alone are universally sufficient. To avoid any overstatement, we now write:

      “…activity-dependent changes in channel voltage-dependence alone did not assemble bursting from these low-conductance initial states (cf. Figure 1B)”.

      Reviewer #2 (Public review):

      (1) The main question not addressed in the paper is the relative efficiency and/or participation of voltage-dependence regulation compared to channel conductance in achieving the expected pattern of activity. Is voltage-dependence participating at 50% or 10%? Although this is a difficult question to answer (and it might even be difficult to provide a number), it is important to determine whether channel conductance regulation remains the main parameter allowing the achievement of a precise pattern of activity (or its recovery after perturbation).

      We appreciate the reviewer’s interest in a quantitative partitioning of the contributions from voltage-dependence regulation versus conductance regulation. We agree that this would be an important analysis in principle. In practice, obtaining this would be difficult.

      Our goal here was to establish the principle: that half-(in)activation shifts can meaningfully influence recovery. This is not an obvious result, given that these two processes can act on vastly different timescales.

      That said, our current dataset does provide partial quantitative insight. Eight of the twenty models required some form of voltage-dependence modulation to recover; among these, two only recovered under fast modulation and two only under slow modulation. This demonstrates that voltage-dependence regulation is essential for recovery in some neurons, and its timescale critically shapes the outcome.

      (2) Another related question is whether the speed of recovery is significantly modified by implementing voltage-dependence regulation (it seems to be the case looking at Figure 3). More generally, I believe it would be important to give insights into the overall benefit of implementing voltage-dependence regulation, beyond its rather obvious biological relevance.

      Our current results suggest that voltage-dependence regulation can indeed accelerate recovery, as illustrated in Figure 3 and supported by additional simulations (not shown). However, a fully quantitative comparison (e.g., time-to-recovery distributions or survival analysis) would require a much larger ensemble of degenerate models to achieve sufficient statistical power across all four conditions. Generating and simulating this expanded model set is computationally intensive, requiring stochastic searches in a high-dimensional parameter space, full time-course simulations, and a subsequent selection process that may succeed or fail.

      The principal aim of the present study is conceptual: to demonstrate that this multi-timescale homeostatic model—built here for the first time—can capture interactions between conductance regulation and voltage-dependence modulation during assembly (“neurodevelopment”) and perturbation. Establishing the conceptual framework and exploring its qualitative behavior were the necessary first steps before pursuing a large-scale quantitative study.

      (3) Along the same line, the conclusion about how voltage-dependence regulation and channel conductance regulation interact to provide the neuron with the expected activity pattern (summarized and illustrated in Figure 6) is rather qualitative. Consistent with my previous comments, one would expect some quantitative answers to this question, rather than an illustration that approximately places a solution in parameter space.

      We appreciate the reviewer’s interest in a more quantitative characterization of the interaction between voltage-dependence and conductance regulation (Fig. 6). As noted in our responses to Comments 1 and 2, some of the facets of this interaction—such as the ability to recover from perturbations and the speed of assembly—can be measured.

      However, fully quantifying the landscape sketched in Figure 6 would require systematically mapping the regions of high-dimensional parameter space where stable solutions exist. In our model, this space spans 18 dimensions (maximal conductances and half‑(in)activations). Even a coarse grid with three samples per dimension would entail over 100 million simulations, which is computationally prohibitive and would still collapse to a schematic representation for visualization.

      For this reason, we chose to present Figure 6 as a conceptual summary, illustrating the qualitative organization of solutions and the role of multi-timescale regulation, rather than attempting an exhaustive mapping. We view this figure as a necessary first step toward guiding future, more quantitative analyses.

      Reviewer #3 (Public review):

      Mondal et al. use computational modeling to investigate how activity-dependent shifts in voltage-dependent (in)activation curves can complement changes in ion channel conductance to support homeostatic plasticity. While it is well established that the voltage-dependent properties of ion channels influence neuronal excitability, their potential role in homeostatic regulation, alongside conductance changes, has remained largely unexplored. The results presented here demonstrate that activity-dependent regulation of voltage dependence can interact with conductance plasticity to enable neurons to attain and maintain target activity patterns, in this case, intrinsic bursting. Notably, the timescale of these voltage-dependent shifts influences the final steady-state configuration of the model, shaping both channel parameters and activity features such as burst period and duration. A major conclusion of the study is that altering this timescale can seamlessly modulate a neuron's intrinsic properties, which the authors suggest may be a mechanism for adaptation to perturbations.

      While this conclusion is largely well-supported, additional analyses could help clarify its scope. For instance, the effects of timescale alterations are clearly demonstrated when the model transitions from an initial state that does not meet the target activity pattern to a new stable state. However, Fig. 6 and the accompanying discussion appear to suggest that changing the timescale alone is sufficient to shift neuronal activity more generally. It would be helpful to clarify that this effect primarily applies during periods of adaptation, such as neurodevelopment or in response to perturbations, and not necessarily once the system has reached a stable, steady state. As currently presented, the simulations do not test whether modifying the timescale can influence activity after the model has stabilized. In such conditions, changes in timescale are unlikely to affect network dynamics unless they somehow alter the stability of the solution, which is not shown here. That said, it seems plausible that real neurons experience ongoing small perturbations which, in conjunction with changes in timescale, could allow gradual shifts toward new solutions. This possibility is not discussed but could be a fruitful direction for future work.

      We thank the reviewer for this thoughtful comment and for highlighting an important point about the scope of our conclusions regarding timescale effects. The reviewer is correct that our simulations demonstrate the influence of voltage-dependence timescale primarily during periods of adaptation—when the neuron is moving from an initial, target-mismatched state toward a final target-satisfying state. Once the system has reached a stable solution, simply changing the timescale of voltage-dependent modulation does not by itself shift the neuron’s activity, unless a new perturbation occurs that re-engages the homeostatic mechanism. We have clarified this point in the revised Discussion.

      The confusion likely arose from imprecise phrasing in the original text describing Figure 6. Previously, we wrote:

      “When channel gating properties are altered quickly in response to deviations from the target activity, the resulting electrical patterns are shown in Figure 6 as the orange bubble labeled 𝝉<sub>𝒉𝒂𝒍𝒇</sub> = 6 s”. 

      We have revised this sentence to emphasize that the orange bubble represents the eventual stable state, rather than implying that timescale changes alone drive activity shifts:

      “When channel gating properties are altered quickly in response to deviations from the target activity, the neuron ultimately settles into a stable activity pattern. The resulting electrical patterns are shown in Figure 6 as the orange bubble labeled 𝝉<sub>𝒉𝒂𝒍𝒇</sub> = 6 s”.

      Reviewer #1 (Recommendations for the authors):

      Unless I am missing something, Figure 2 should be a supplement to Figure 1. I would prefer to see panel B in Figure 1 to indicate that the findings of that figure are general. Panel A really is not showing anything useful to the reader.

      We appreciate the suggestion to combine Figure 2 with Figure 1, but we believe keeping Figure 2 separate better preserves the manuscript’s flow. Figure 1 illustrates the mechanism in a single model, while Figure 2 presents the population-level summary that generalizes the phenomenon across all models.

      Also, I find Figure 6 unnecessary and its description in the Discussion more detracting than useful. Even with the descriptions, I find nothing in the figure itself that clarifies the concept.

      We appreciate the reviewer’s feedback on Figure 6. The purpose of this figure is to conceptually illustrate that multiple degenerate solutions can satisfy the calcium target and that the timescale of voltage‑dependence modulation can influence which region of this solution space is accessed during the acquisition of the activity target. Reviewer 3 noted some confusion about this point. We made a small clarifying edit.

      At the risk of being really picky, I also don't see the purpose of Figure 7. And I find it strange to plot -Vm just because that's the argument of findpeaks.

      We appreciate the reviewer’s comment on Figure 7. The purpose of this figure is to illustrate exactly what the findpeaks function is detecting, as indicated by the red arrows on the traces. For readers unfamiliar with findpeaks, it may not be obvious how the algorithm interprets the waveform. Showing the peaks directly ensures that the measurements used in our analysis align with what one would intuitively expect.
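      The point about plotting -Vm can be made concrete with a toy peak finder (a hypothetical pure-Python stand-in for MATLAB's findpeaks, which reports local maxima): finding the troughs of a voltage trace amounts to finding the peaks of its negation.

```python
def find_peaks(x):
    """Indices of strict local maxima (toy stand-in for findpeaks)."""
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]

vm = [-50, -40, -55, -60, -45, -65, -50]  # toy membrane potential trace (mV)
spike_peaks = find_peaks(vm)              # local maxima of Vm
troughs = find_peaks([-v for v in vm])    # maxima of -Vm, i.e. minima of Vm
```
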

      Reviewer #2 (Recommendations for the authors):

      The writing of the article has been much improved since the last version. It is much clearer, and the discussion has been improved and better addresses the biological foundations and relevance of the study. However, conclusions are rather qualitative, while one would expect some quantitative answers to be provided by the modeling approach.

      We appreciate the reviewer’s concern regarding quantification and share this perspective. As noted above, our study is primarily conceptual. Many aspects of the model, such as calcium handling and channel regulation, are parameterized based on incomplete biological data. These uncertainties make robust quantitative predictions difficult, so we focus on qualitative outcomes that are likely to hold independently of specific parameter choices.

    1. the rise of reality television as a core genre and the pervasive spread of serial narrative across a wide range of fictional formats.

      This part stood out to me because it shows how TV in the 2000s wasn’t just about typical scripted shows. Reality tv became very popular, and a lot of shows started having ongoing storylines that kept viewers interested and coming back.

    2. the ways that DVDs and their popularity allow television to be consumed and collected has drastically changed the place of the television series in the cultural landscape as well as altering the narrative possibilities available to creators.

      This makes sense because DVDs let people binge shows before streaming existed. I think that it definitely gave more freedom to TV creators in how they told their stories since they could plan longer storylines knowing people could watch it all at their own speed.

    3. In the 21st century, we should not abandon the sustained values of scheduled flow and serialized spectatorship in a full embrace of boxed aesthetics. Hopefully the two models of viewing and storytelling can coexist for years to come, serving distinct but equally valid cultural functions and values.

      Even though in the last paragraph, the author hopes for the two models of viewing and storytelling to survive, in my opinion, as more ways of watching videos, TV shows, or movies continue to come up, this won't be possible, as viewers now have the option to control the flow of what, where, and when to watch their favorite shows or movies.

    1. The sample variance is, for the most part, out of our control—it’s a characteristic of the population

      Sorry, but this is completely wrong! We have full control over the population variance (except for practical constraints). If the population consists of plots, then we are the ones who define the character of this population! Imagine you make each of your n=100 plots as large as the area of interest (meaning: 100 full censuses), then the population variance among these plots is zero! Any plot size smaller than the area of interest will introduce variability among the observations. And in your case of subdividing the area into cells: if you make them larger, the variance among them will get smaller...
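      This point is easy to demonstrate with a toy simulation (made-up per-unit values; `plot_variance` is a hypothetical helper, not from the original text): the same "forest" yields a different population variance depending on the cell size the designer chooses.

```python
import random
import statistics

rng = random.Random(42)
trees = [rng.random() for _ in range(10_000)]  # toy per-unit tree volumes

def plot_variance(cell_size):
    """Variance of plot means when the area is cut into cells of a given size."""
    plots = [trees[i:i + cell_size] for i in range(0, len(trees), cell_size)]
    means = [sum(p) / len(p) for p in plots]
    return statistics.variance(means)

# Same underlying forest, different plot (cell) sizes: larger plots give a
# smaller variance among plots. The variance is set by the sampling design,
# not fixed by the population of trees.
```
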

    2. which is an astonishingly large number!

      And now assume you would allow such a quadratic plot at any location; then the number of possible plots becomes infinite. BUT: contrary to a finite population, in which we expect a different response in each cell, this does not hold in an infinite population. Many sampling locations (also infinitely many) would lead to the same included trees. See https://youtu.be/-8CVcXOKRxM?si=MpWdojUe3FbcU-Mn

    3. Finite means there’s a limited, i.e., countable, number of units in the population

      Ok, somehow you like to stick to this finite definition of the population. In the next sentence, however, you say (correctly) that each possible sample must have a non-zero probability. In real life, you could select a sample plot at any point in the total forest area. In a finite population view this is not the case (and somehow you are limited to quadratic cells as plots, which is not in line with what you explain later). Why is it so important to define the population as finite here? Imagine we conduct SRS by random selection of x,y coordinates in the area (and install plots there). Then we are outside of your population concept (the population of possible plots is infinite).

    4. for finite populations

      Which means you need to consider the finite population correction in many estimators. Why not infinite populations (which is what we are doing in reality)? The concepts you introduce later (SRS, stratified, double sampling, ...) would hold in exactly the same way, only that you can ignore the fpc.
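      The fpc point is easy to see numerically. A minimal sketch (the plot values, N, and n are invented for illustration, not taken from the annotated text):

      ```python
      # Sketch: SRS variance of the mean with and without the finite
      # population correction (fpc); the plot values are hypothetical.
      import random
      import statistics

      random.seed(1)
      N = 400                                      # a finite population of 400 plot values
      population = [random.gauss(100.0, 15.0) for _ in range(N)]

      n = 25
      sample = random.sample(population, n)        # SRS without replacement

      s2 = statistics.variance(sample)             # sample variance s^2
      var_infinite = s2 / n                        # infinite-population view: no fpc
      var_finite = (1 - n / N) * s2 / n            # finite-population view: fpc = 1 - n/N
      ```

      With n/N small the two estimates nearly coincide, which is why the infinite-population view usually loses nothing in practice.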

    1. First, this is another Chromium-based browser. Second, Chromium/Chrome-based browsers are awful at portablization. Passwords are locked to a single PC, extensions routinely get lost, it doesn't fully work from Unicode paths, etc. They're only just barely kinda portable and only held together with duct tape. And this is entirely due to the Chrome/Chromium code underneath. Firefox, for instance, is *wonderfully* portable and, in terms of portability, is a Ferrari compared to Chrome/Chromium's Yugo. Third, Brave's speed improvement claims are vs a Chromium browser without adblocking. If you use Chrome with uBlock or AdBlockPlus, you'll get the same performance as Brave which negates its only current advantage since the publisher-based micropayment system isn't a thing yet.
      • !!!
    1. It is another Chromium browser. Please search. Brave has been discussed many times before. It is another Chromium-based browser. We won't make any more Chromium-based browsers portable unless one of them makes their browser significantly more portable-friendly. Brave is actually worse for portability than most of the others because of some of the Chrome features they've limited and/or removed.
      • !!!
    1. eLife Assessment

      This study presents a valuable investigation into cell-specific microstructural development in the neonatal rat brain using diffusion-weighted magnetic resonance spectroscopy. The evidence supporting the core claims is solid, with innovative in vivo data acquisition and modeling, noting residual caveats with regard to the limitations of diffusion-weighted magnetic resonance spectroscopy for strict validation of cell-type-specific metabolite compartmentation. In addition, the study provides community resources that will benefit researchers in this field. The work will be of interest to researchers studying brain development and biophysical imaging methods.

    2. Reviewer #1 (Public review):

      In this work, Ligneul and coauthors implemented diffusion-weighted MRS in young rats to follow longitudinally and in vivo the microstructural changes occurring during brain development. Diffusion-weighted MRS is here instrumental in assessing microstructure in a cell-specific manner, as opposed to the claimed gold standard (manganese-enhanced MRI), which can only probe changes in brain volume. Differential microstructure and complexification of the cerebellum and the thalamus during rat brain development were observed non-invasively. In particular, lower metabolite ADCs with increasing age were measured in both brain regions, reflecting increasing cellular restriction with brain maturation. A higher sphere fraction (spheres representing cell bodies) for neuronal metabolites (total NAA, glutamate) and for total creatine and taurine was estimated in the cerebellum compared to the thalamus, reflecting the unique structure of the cerebellar granular layer with its high density of cell bodies. A decreasing sphere fraction with age was observed in the cerebellum, reflecting the development of the dendritic tree of Purkinje cells and Bergmann glia. From morphometric analyses, the authors could probe non-monotonic branching evolution in the cerebellum, matching 3D representations of Purkinje cell expansion and complexification with age. Finally, the authors highlighted taurine as a potential new marker of cerebellar development.

      From a technical standpoint, this work clearly demonstrates the potential of diffusion-weighted MRS for probing microstructural changes of the developing brain non-invasively, paving the way for its application in pathological cases. Ligneul and coauthors also show that diffusion-weighted MRS acquisitions in neonates are feasible, despite the known technical challenges of such measurements even in adult rats. They also provide all necessary resources to reproduce and build upon their work, which is highly valuable for the community.

      From a biological standpoint, claims are well supported by the microstructure parameters derived from advanced biophysical modelling of the diffusion MRS data.

      Specific strengths:

      (1) The interpretation of dMRS data in terms of cell-specific microstructure through advanced biophysical modelling (e.g. the sphere fraction, modelling the fraction of cell bodies versus neuronal or astrocytic processes) is a strong asset of the study, going beyond the more commonly used signal representation metrics such as the apparent diffusion coefficient, which lacks specificity to biological phenomena.

      (2) The fairly good data quality despite the complexity of the experimental framework should be praised: diffusion-weighted MRS was acquired in two brain regions (although not in the same animals) and longitudinally, in neonates, including data at high b-values and multiple diffusion times, which altogether constitutes a large-scale dataset of high value for the diffusion-weighted MRS community.

      (3) The authors have shared publicly data and codes used for processing and fitting, which will allow one to reproduce or extend the scope of this work to disease populations, and which goes in line with the current effort of the MR(S) community for data sharing.

      Specific weaknesses:

      Ligneul and coauthors have convincingly addressed and included my comments from the first and second round in their revised manuscript.

      I believe the following conceptual concerns, which are inherent to the nature of the study and do not require further adjustments of the manuscript, remain:

      (1) Metabolite compartmentation in one cell type or the other has often been challenged and is currently impossible to validate in vivo. Here, Ligneul and coauthors did not use this assumption a priori and supported their claims also with non-MR literature (e.g. for taurine), but the interpretation of results in that direction should be made with care.

      (2) Longitudinal MR studies of the developing brain make it difficult to extract parameters with an "absolute" meaning. Indirect assumptions used to derive such parameters may change with age and become confounding factors (brain structure, cell distribution, concentrations normalizing metabolites (here macromolecules), relaxation times...). While findings of the manuscript are convincing and supported with literature, the true underlying nature of such changes might be difficult to access.

      (3) Diffusion MRI in addition to diffusion MRS would have been complementary and beneficial to validate some of the signal contributions, but was unfeasible in the time constraints of experiments on young animals.

    3. Author response:

      The following is the authors’ response to the previous reviews

      We thank the reviewers once again for their careful evaluation of the revised manuscript and for their constructive suggestions. In response to the remaining recommendations, we have made minor amendments to the manuscript. The main changes are as follows:

      • Metabolite Concentrations: we now report them more conventionally, i.e. normalised by water content. The original normalisation by the absolute MM content has been retained in the supplementary information, as MMs are an endogenous tissue probe (i.e., not dependent on cerebrospinal fluid).  The fact that both water and MM normalisation provide similar trends supports the robustness of our conclusions. We have also updated Figure S2 to include the absolute MM concentrations, raw water content, and the MM-to-water ratios for each time point.

      • Taurine Interpretation: We have revised the wording related to the interpretation of taurine findings to clarify that we present a set of converging observations suggesting taurine may serve as a marker of early cerebellar neurodevelopment, rather than asserting it as a definitive conclusion.

      Comments to the editor & reviewers:

      We sincerely thank the reviewers and the editor for their valuable feedback, which has significantly improved the manuscript since its initial submission.

      Please note a correction in Figure S2 (added during the previous revision round): the reported evolution of metabolite/water concentrations has changed due to an earlier error in calculating the water peak integral, which has now been corrected.

      While we recognise that a study and manuscript can always be improved, we prefer not to make further changes at this stage. We cannot conduct new experiments, and redesigning the model falls outside the scope of this work. Additionally, we believe that further altering the manuscript’s structure could lead to unnecessary confusion rather than clarity.

    1. After all, the first true use of the open-ended series format would seem to be the news bulletin, endlessly updating events and never synthesising them

      This is interesting because it shows how TV and storytelling started with the news. News bulletins keep updating events and don’t have a real ending. I can see how this helped TV later create shows with storylines that continue on over time.

    2. Broadcast TV on the other hand carries large amounts of non-fiction: news, documentaries, announcements, weather forecasts, various kinds of segments that are purely televisual in their characteristic forms

      TV is really different from movies because it shows a lot of real life content, such as news and documentaries. I think this is why TV can feel like it’s more diverse, since it mixes real life content with fictional stories.

    3. Commercial entertainment cinema is overwhelmingly a narrative fiction medium

      This makes sense because most movies we watch are stories with characters and plots. TV, on the other hand, can have documentaries or news, while movies usually just have fictional stories. I wonder if this is why movies feel more complete while TV can feel like a mix of different kinds of content.

    1. was our friend, and had neither bowes nor arrowes; what did wee doubt? I told him it was the custome of our Country, not doubting of his kindnes any waies: wherewith though hee seemed satisfied, yet Captaine Nuport caused all our men to retire to the water side, which was some thirtie score* from thence.

      The native King asked John Smith why they were carrying weapons. This shows differences in culture and customs, and they sort things out in a civilized manner.

    2. Within five or sixe dayes after the arrivall of the Ship, by a mischaunce our Fort was burned, and the most of our apparell, lodging and private provision.

      The fort burned down shortly after the supply ship arrived, with some serious consequences.

    3. Thus surprised, I resolved to trie their mercies: my armes I caste from me, till which none durst approch me.

      It seemed like he found it weird and almost uncomfortable to surrender his weapons. This proved to be helpful to him, considering he met with the King of the tribe.

    4. Supposing them surprised, and that the Indians had betraid us, presently I seazed him and bound his arme fast to my hand in a garter, with my pistoll ready bent to be revenged on him: he advised me to fly, and seemed ignorant of what was done.

      It seemed like the native was trying to help John in this situation when they got ambushed. John Smith still doesn't seem to trust the natives.

    5. In my returne to Paspahegh, I traded with that churlish and trecherous nation: having loaded 10 or 12 bushels of corne, they offred to take our pieces and swords, yet by stelth, but [we] seeming to dislike it, they were ready to assault us: yet standing upon our guard,

      Lots of tension here between the settlers and the natives.

    6. The Indians thinking us neare famished, with carelesse kindnes, offred us little pieces of bread and small handfulls of beanes or wheat, for a hatchet or a piece of copper

      "carelesse kindnes" I what stood out to me in this passage, because he's describing the natives.

    7. Indians the day before had assalted the fort, and supprised it, had not God (beyond all their expectations) by meanes of the shippes, at whom they shot with their Ordinances and Muskets, caused them to retire, they had entred the fort with our own men, which were then busied in setting Corne,

      They experienced attacks from the natives.

    8. where the people shewed us the manner of their diving for Mussels, in which they finde Pearles.

      The natives seemed to welcome the settlers by showing their techniques in diving for mussels.


    1. Think about it this way: if all the students in your class were told to explain a complex concept, none of them would do it in the same way.

      The diversity in writing styles is something that should be embraced, and everyone should strive to have their own writing style.

    2. After selecting an audience and a purpose, you must choose what information will make it to the page.

      Not only do you have to consider the tone of your writing depending on the audience, but you must also consider the topics and content discussed in a piece of literature.

    1. eLife Assessment

      This valuable work explores how synaptic activity encodes information during memory tasks. All reviewers agree that the work is of very high quality and that the methodological approach is praiseworthy. Although the experimental data support the possibility that phospholipase diacylglycerol signaling and synaptotagmin 7 (Syt7) dynamically regulate the vesicle pool required for presynaptic release, a concern remains that the central finding of paired-pulse depression at very short intervals could be due to a mechanism that does not depend on exocytosis, such as Ca²⁺ channel inactivation, rather than vesicle pool depletion. Overall, this is a solid study although the results still warrant consideration of alternative interpretations.

    2. Reviewer #3 (Public review):

      To summarize: The authors' overfilling hypothesis depends crucially on the premise that the very-quickly reverting paired-pulse depression seen after unusually short rest intervals of << 50 ms is caused by depletion of release sites whereas Dobrunz and Stevens (1997) concluded that the cause was some other mechanism that does not involve depletion. The authors now include experiments where switching extracellular Ca2+ from 1.2 to 2.5 mM increases synaptic strength on average, but not by as much as at other synapse types. They contend that the result supports the depletion hypothesis. I didn't agree because the model used to generate the hypothesis had no room for any increase at all, and because a more granular analysis revealed a mixed population with a subset where: (a) synaptic strength increased by as much as at standard synapses; and yet (b) the quickly reverting depression for the subset was the same as the overall population.

      The authors raise the possibility of additional experiments, and I do think this could clarify things if they pre-treat with EGTA as I recommended initially. They've already shown they can do this routinely, and it would allow them to elegantly distinguish between pv and pocc explanations for both the increases in synaptic strength and the decreases in the paired pulse ratio upon switching Ca2+ to 2.5 mM. Plus/minus EGTA pre-treatment trials could be interleaved and done blind with minimal additional effort.

      Showing reversibility would be a great addition too, because, in our experience, this does not always happen in whole-cell recordings in ex-vivo tissue even when electrical properties do not change. If the goal is to show that L2/3 synapses are less sensitive to changes in Ca2+ compared to other synapse types - which is interesting but a bit off point - then I would additionally include a positive control, done by the same person with the same equipment, at one of those other synapse types using the same kind of presynaptic stimulation (i.e. ChRs).

      Specific points (quotations are from the Authors' rebuttal)

      (1) Regarding the Author response image 1, I was instead suggesting a plot of PPR in 1.2 mM Ca2+ versus the relative increase in synaptic strength in 2.5 versus in 1.2 mM. This continues to seem relevant.

      (2) "Could you explain in detail why two-fold increase implies pv < 0.2?"

      a. start with power((2.5/(1 + (2.5/K1) + 1/2.97)),4) = 2*power((1.3/(1 + (1.3/K1) + 1/2.97)),4);

      b. solve for K1 (this turns out to be 0.48);

      c. then implement the premise that pv -> 1.0 when Ca2+ is high by calculating Max = power((C/(1 + (C/K1) + 1/2.97)),4) where C is [Ca] -> infinity.

      d. pv when [Ca] = 1.3. mM must then be power((1.3/(1 + (1.3/K1) + 1/2.97)),4)/Max, which is <0.2.

      Note that modern updates of Dodge and Rahamimoff typically include a parameter that prevents pv from approaching 1.0; this is the gamma parameter in the versions from Neher group.
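      For what it's worth, steps (a)–(d) can be checked numerically. A minimal sketch (the bisection solver and its tolerances are mine, not from the rebuttal):

      ```python
      # Numerical check of steps (a)-(d): solve the fourth-power
      # Dodge & Rahamimoff relation for K1, then back out pv at 1.3 mM Ca2+.

      def release(C, K1):
          # Fourth-power form used in step (a); 1/2.97 is the fixed
          # competing term from the reviewer's formula.
          return (C / (1 + C / K1 + 1 / 2.97)) ** 4

      def solve_K1(lo=0.01, hi=10.0, tol=1e-9):
          # Bisection on: release(2.5, K1) = 2 * release(1.3, K1)   (step (a))
          f = lambda K1: release(2.5, K1) - 2.0 * release(1.3, K1)
          while hi - lo > tol:
              mid = 0.5 * (lo + hi)
              if f(lo) * f(mid) <= 0:
                  hi = mid
              else:
                  lo = mid
          return 0.5 * (lo + hi)

      K1 = solve_K1()              # step (b): K1 comes out near 0.48
      Max = K1 ** 4                # step (c): limit of release(C, K1) as C -> infinity
      pv = release(1.3, K1) / Max  # step (d): pv at 1.3 mM lands just under 0.2
      ```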

      (3) "If so, we can not understand why depletion-dependent PPD should lead to PPF."

      When PPD is caused by depletion and pv < 0.2, the number of occupied release sites should not be decreased by more than one-fifth at the second stimulus so, without facilitation, PPR should be > 0.8. The EGTA results then indicate there should be strong facilitation, driving PPR to something like 1.2 with conservative assumptions. And yet, a value of < 0.4 is measured, which is a large miss.

      (4) Despite the authors' suggestion to the contrary, I continue to think there is a substantial chance that Ca2+-channel inactivation is the mechanism underlying the very quickly reverting paired-pulse depression. However, this is only one example of a non-depletion mechanism among many, with the main point being that any non-depletion mechanism would undercut the reasoning for overfilling. And, this is what Dobrunz and Stevens claimed to show; that the mechanism - whatever it is - does not involve depletion. The most effective way to address this would be affirmative experiments showing that the quickly reverting depression is caused by depletion after all. Attempting to prove that Ca2+-channel inactivation does not occur does not seem like a worthwhile strategy because it would not address the many other possibilities.

      (5) True that Kusick et al. observed morphological re-docking, but then vesicles would have to re-prime and Mahfooz et al. (2016) showed that re-priming would have to be slower than 110 ms (at least during heavy use at calyx of Held).

    3. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #3 (Public review):

      The central issue for evaluating the overfilling hypothesis is the identity of the mechanism that causes the very potent (>80% when the inter-pulse interval is 20 ms), but very quickly reverting (< 50 ms) paired pulse depression (Fig 1G, I). To summarize: the logic for overfilling at local cortical L2/3 synapses depends critically on the premise that probability of release (pv) for docked and fully primed vesicles is already close to 100%. If so, the reasoning goes, the only way to account for the potent short-term enhancement seen when stimulation is extended beyond 2 pulses would be by concluding that the readily releasable pool overfills. However, the conclusion that pv is close to 100% depends on the premise that the quickly reverting depression is caused by exocytosis-dependent depletion of release sites, and the evidence for this is not strong in my opinion. Caution is especially reasonable given that similarly quickly reverting depression at Schaffer collateral synapses, which are morphologically similar, was previously shown to NOT depend on exocytosis (Dobrunz and Stevens 1997). Note that the authors of the 1997 study speculated that Ca2+-channel inactivation might be the cause, but did not rule out a wide variety of other types of mechanisms that have been discovered since, including the transient vesicle undocking/re-docking (and subsequent re-priming) reported by Kusick et al (2020), which seems to have the correct timing.

      Thank you for your comments on an alternative possibility besides Ca<sup>2+</sup> channel inactivation. Kusick et al. (2020) showed that the transient destabilization of the docked vesicle pool is recovered within 14 ms after stimulation. This rapid recovery implies that post-stimulation undocking events should be largely resolved before the 20 ms inter-stimulus interval (ISI) used in our paired-pulse ratio (PPR) experiments, arguing against the possibility that post-AP undocking/re-docking events significantly influence the PPR measured at a 20 ms ISI. Furthermore, Vevea et al. (2021) showed that post-stimulus undocking is facilitated in synaptotagmin-7 (Syt7) knockout synapses. In our study, Syt7 knockdown did not affect the PPR at a 20 ms ISI, suggesting that the undocking process described in Kusick et al. may not be a major contributor to the paired-pulse depression observed at this interval. Taken together, the undocking/redocking dynamics reported by Kusick et al. are too rapid to affect the PPR at a 20 ms ISI, and our Syt7 knockdown data further argue against a significant role of this process in the PPD observed at that interval.

      In an earlier round of review, I suggested raising extracellular Ca<sup>2+</sup>, to see if this would increase synaptic strength. This is a strong test of the authors' model because there is essentially no room for an increase in synaptic strength. The authors have now done experiments along these lines, but the result is not clear cut. On one hand, the new results suggest an increase in synaptic strength that is not compatible with the authors' model; technically the increase does not reach statistical significance, but, likely, this is only because the data set is small and the variation between experiments is large. Moreover, a more granular analysis of the individual experiments seems to raise more serious problems, even supporting the depletion-independent counter hypothesis to some extent. On the other hand, the increase in synaptic strength that is seen in the newly added experiments does seem to be less at local L2/3 cortical synapses compared to other types of synapses, measured by other groups, which goes in the general direction of supporting the critical premise that pv is unusually high at L2/3 cortical synapses. Overall, I am left wishing that the new data set were larger, and that reversal experiments had been included as explained in the specific points below.

      Specific Points:

      (1) One of the standard methods for distinguishing between depletion-dependent and depletion-independent depression mechanisms is by analyzing failures during paired pulses of minimal stimulation. The current study includes experiments along these lines showing that pv would have to be extremely close to 1 when Ca<sup>2+</sup> is 1.25 mM to preserve the authors' model (Section "High double failure rate ..."). Lower values for pv are not compatible with their model because the k<sub>1</sub> parameter already had to be pushed a bit beyond boundaries established by other types of experiments.

      It should be noted that we did not arbitrarily push the k<sub>1</sub> parameter beyond its boundaries, but estimated the range of k<sub>1</sub> based on the fast time constant for recovery from paired pulse depression, as shown in Fig. 3-S2-Ab.

      The authors now report a mean increase in synaptic strength of 23% after raising Ca to 2.5 mM. The mean increase is not quite statistically significant, but this is likely because of the small sample size. I extracted a 95% confidence interval of [-4%, +60%] from their numbers, with a 92% probability that the mean value of the increase in the full population is > 5%. I used the 5% value as the greatest increase that the model could bear because 5% implies pv < 0.9 using the equation from Dodge and Rahamimoff referenced in the rebuttal. My conclusion from this is that the mean result, rather than supporting the model, actually undermines it to some extent. It would have likely taken 1 or 2 more experiments to get above the 95% confidence threshold for statistical significance, but this is ultimately an arbitrary cut off.

      Our key claim in Fig. 3-S3 is not the statistical non-significance of EPSC changes, but the small magnitude of the change (1.23-fold). This small increase is far less than the 3.24-fold increase predicted by the fourth-power relationship (D&R equation, Dodge & Rahamimoff, 1967), which would be valid under the condition that the fusion probability of docked vesicles (p<sub>v</sub>) is not saturated. We do not believe that additional experiments would increase the magnitude of the EPSC change to the level the Dodge & Rahamimoff equation predicts, even if a larger n yielded statistical significance. In other words, even a small but statistically significant EPSC change would still contradict what we expect from low-p<sub>v</sub> synapses. It should be noted that our main point is the extent of the EPSC increase induced by high external [Ca<sup>2+</sup>], not a p-value. In this regard, it is hard for us to accept the Reviewer’s request for a larger sample size in expectation of a lower p-value.

      Although we agree with the Reviewer’s assertion that our data may indicate a 92% probability that high Ca<sup>2+</sup> increases EPSCs by more than 5%, we do not agree with the Reviewer’s interpretation that the EPSC increase necessarily implies an increase in p<sub>v</sub>. We are sorry that we could not clearly understand the Reviewer’s inference that the 5% increase of EPSCs implies p<sub>v</sub> < 0.9. Please note that release probability (p<sub>r</sub>) is the product of p<sub>v</sub> and the occupancy of docked vesicles in an active zone (p<sub>occ</sub>). We imagine that this inference might rest on the premise that p<sub>occ</sub> is constant irrespective of external [Ca<sup>2+</sup>]. Contrary to the Reviewer’s premise, Figure 2c in Kusick et al. (2020) showed that the number of docked SVs increased by ca. 20% upon increasing external [Ca<sup>2+</sup>] to 2 mM. Moreover, Figure 7F in Lin et al. (2025) demonstrated that the number of TS vesicles, equivalent to p<sub>occ</sub>, increased by 23% at high external [Ca<sup>2+</sup>]. These increases in p<sub>occ</sub> are similar in magnitude to the high external Ca<sup>2+</sup>-induced increase in EPSC we observed (1.23-fold). Of course, it is possible that increases in both p<sub>occ</sub> and p<sub>v</sub> contributed to the high [Ca<sup>2+</sup>]<sub>o</sub>-induced increase in EPSC. The low PPR and the failure rate analysis, however, suggest that p<sub>v</sub> is already saturated in baseline conditions of 1.3 mM [Ca<sup>2+</sup>]<sub>o</sub>, and thus it is more likely that an increase in p<sub>occ</sub> is primarily responsible for the 1.23-fold increase. Moreover, the 1.23-fold increase does not match the prediction of the D&R equation, which would be valid at synapses with low p<sub>v</sub>.
Therefore, interpreting our observation (1.23-fold increase) as a slight increase in p<sub>occ</sub> is rather consistent with recent papers (Kusick et al.,2020; Lin et al., 2025) as well as our other results supporting the baseline saturation of p<sub>v</sub> as shown in Figure 2 and associated supplement figures (Fig. 2-S1 and Fig. 2-S2).

      (2) The variation between experiments seems to be even more problematic, at least as currently reported. The plot in Figure 3-figure supplement 3 (left) suggests that the variation reflects true variation between synapses, not measurement error.

      Note that there was substantial variance in the number of docked or TS vesicles at baseline, and in its fold change under high external Ca<sup>2+</sup> conditions, in previous studies too (Lin et al., 2025; Kusick et al., 2020). Our study did not focus on this heterogeneity but on the mean dynamics of short-term plasticity at L2/3 recurrent synapses. Acknowledging this, the short-term plasticity of these synapses could be best explained by assuming that vesicular fusion probability (p<sub>v</sub>) is near unity, and that release probability is regulated by p<sub>occ</sub>. In other words, even though p<sub>v</sub> is near unity, synaptic strength can increase upon high external [Ca<sup>2+</sup>] if the baseline occupancy of release sites (p<sub>occ</sub>) is low and p<sub>occ</sub> is increased by high [Ca<sup>2+</sup>]. Lin et al. (2025) showed that high external [Ca<sup>2+</sup>] induces an increase in the number of TS vesicles (equivalent to p<sub>occ</sub>) by 23% at the calyx synapses. Unlike our synapses, the baseline p<sub>v</sub> (denoted as p<sub>fusion</sub> in Lin et al., 2025) of the calyx synapse is not saturated (= 0.22) at 1.5 mM external [Ca<sup>2+</sup>], and thus the calyx synapses displayed a 2.36-fold increase of EPSC at 2 mM external [Ca<sup>2+</sup>], to which increases in p<sub>occ</sub> as well as in p<sub>v</sub> (from 0.22 to 0.42) contributed. Therefore, the small increase in EPSC (= 23%) supports that p<sub>v</sub> is already saturated at L2/3 recurrent synapses.

      And yet, synaptic strength increased almost 2-fold in 2 of the 8 experiments, which back extrapolates to pv < 0.2.

      We are sorry, but we could not understand the first comment in this paragraph. Could you explain in detail why a two-fold increase implies pv < 0.2?

      If all of the depression is caused by depletion as assumed, these individuals would exhibit paired pulse facilitation, not depression. And yet, from what I can tell, the individuals depressed, possibly as much as the synapses with low sensitivity to Ca<sup>2+</sup>, arguing against the critical premise that depression equals depletion, and even arguing - to some extent - for the counter hypothesis that a component of the depression is caused by a mechanism that is independent of depletion.

      For the first statement in this paragraph, we imagine that ‘the depression’ means paired pulse depression (PPD). If so, we cannot understand why depletion-dependent PPD should lead to PPF. If the paired pulse interval is too short for docked vesicles to be replenished, the first pulse-induced vesicle depletion would result in PPD. We are very sorry, but we could not understand the Reviewer’s subsequent inference, because we could not understand the first statement.

      I would strongly recommend adding an additional plot that documents the relationship between the amount of increase in synaptic strength after increasing extracellular Ca<sup>2+</sup> and the paired pulse ratio as this seems central.

      We found no clear correlation of EPSC<sub>1</sub> with PPR changes (ΔPPR) as shown in the figure below.

      Author response image 1.

Plot of PPR changes as a function of EPSC<sub>1</sub>.<br />

      (3) Decrease in PPR. The authors recognize that the decrease in the paired-pulse ratio after increasing Ca<sup>2+</sup> seems problematic for the overfilling hypothesis by stating: "Although a reduction in PPR is often interpreted as an increase in pv, under conditions where pv is already high, it more likely reflects a slight increase in p<sub>occ</sub> or in the number of TS vesicles, consistent with the previous estimates (Lin et al., 2025)."

We admit that there is a logical jump in the statement you mention here. We appreciate your comment. We rewrote that part in the revised manuscript (line 285) as follows:

“Recent morphological and functional studies revealed that elevation of [Ca<sup>2+</sup>]<sub>o</sub> induces an increase in the number of TS or docked vesicles to a similar extent as our observation (Kusick et al., 2020; Lin et al., 2025), raising the possibility that an increase in p<sub>occ</sub> is responsible for the 1.23-fold increase in EPSC at high [Ca<sup>2+</sup>]<sub>o</sub>. A slight but significant reduction in PPR was observed under high [Ca<sup>2+</sup>]<sub>o</sub> too. An increase in p<sub>occ</sub> is thought to be associated with an increase in the baseline vesicle refilling rate. While PPR is always reduced by an increase in p<sub>v</sub>, the effect of the refilling rate on PPR is complicated. For example, PPR can be reduced by both a decrease (Figure 2—figure supplement 1) and an increase (Lin et al., 2025) in the refilling rate, induced by EGTA-AM and PDBu, respectively. Thus, the slight reduction in PPR does not contradict a possible contribution of p<sub>occ</sub> to the high [Ca<sup>2+</sup>]<sub>o</sub> effects.”

      I looked quickly, but did not immediately find an explanation in Lin et al 2025 involving an increase in pocc or number of TS vesicles, much less a reason to prefer this over the standard explanation that reduced PPR indicates an increase in pv.

Fig. 7F of Lin et al. (2025) shows a 1.23-fold increase in the number of TS vesicles at high external [Ca<sup>2+</sup>]. The same study (Fig. 7E) also shows a two-fold increase of p<sub>fusion</sub> (equivalent to p<sub>v</sub> in our study) at high external [Ca<sup>2+</sup>] (from 0.22 to 0.42). Because p<sub>occ</sub> is the occupancy of TS vesicles in a limited number of slots in an active zone, the fold change in the number of TS vesicles should be similar to that of p<sub>occ</sub>.

      The authors should explain why the most straightforward interpretation is not the correct one in this particular case to avoid the appearance of cherry picking explanations to fit the hypothesis.

The results of Lin et al. (2025) indicate that high external [Ca<sup>2+</sup>] induces a milder increase in p<sub>occ</sub> (23%) than in p<sub>v</sub> (190%) at the calyx synapses. Because the increase in p<sub>occ</sub> is much smaller than that in p<sub>v</sub>, and multiple lines of evidence in our study support that the baseline p<sub>v</sub> is already saturated, we raised the possibility that an increase in p<sub>occ</sub> primarily contributes to the unexpectedly small increase of EPSC at 2.5 mM [Ca<sup>2+</sup>]<sub>o</sub>. As mentioned above, our interpretation is also consistent with the EM study of Kusick et al. (2020). Nevertheless, the reduction of PPR at 2.5 mM Ca<sup>2+</sup> seems to support an increase in p<sub>v</sub>, arguing against this possibility. On the other hand, because p<sub>occ</sub> = k<sub>1</sub>/(k<sub>1</sub>+b<sub>1</sub>) under the simple vesicle refilling model (Fig. 3-S2Aa), a change in p<sub>occ</sub> should be associated with changes in k<sub>1</sub> and/or b<sub>1</sub>. While PPR is always reduced by an increase in p<sub>v</sub>, the effect of the refilling rate on PPR is complicated. For example, although EGTA-AM would not increase p<sub>v</sub>, it reduced PPR, probably by reducing the refilling rate (Fig. 2-S1). In contrast, PDBu is thought to increase k<sub>1</sub> because it induces a two-fold increase of p<sub>occ</sub> (Fig. 7L of Lin et al., 2025). Such a marked increase of p<sub>occ</sub>, rather than p<sub>v</sub>, seems to be responsible for the PDBu-induced marked reduction of PPR (Fig. 7I of Lin et al., 2025), because PDBu induced only a slight increase in p<sub>v</sub> (Fig. 7K of Lin et al., 2025). Therefore, the slight reduction of PPR does not contradict our interpretation that an increase in p<sub>occ</sub> might be responsible for the slight increase in EPSC induced by high [Ca<sup>2+</sup>]<sub>o</sub>.

      (4) The authors concede in the rebuttal that mean pv must be < 0.7, but I couldn't find any mention of this within the manuscript itself, nor any explanation for how the new estimate could be compatible with the value of > 0.99 in the section about failures.

We have never stated in the rebuttal or elsewhere that the mean p<sub>v</sub> must be < 0.7. On the contrary, both our manuscript and our previous rebuttals consistently argued that the baseline p<sub>v</sub> is already saturated, based on our observations, including the low PPR, tight coupling, high double-failure rate and the minimal effect of external Ca<sup>2+</sup> elevation.

      (5) Although not the main point, comparisons to synapses in other brain regions reported in other studies might not be accurate without directly matching experiments.

Please understand that it is not trivial to establish optimal experimental settings for studying other synapses with the same methods employed in this study; we think this should be done in a separate study. Furthermore, we have already shown in the manuscript that action potentials (APs) evoked by oChIEF activation occur in a physiologically natural manner, and that the STP induced by these oChIEF-evoked APs is indistinguishable from the STP elicited by APs evoked by dual-patch electrical stimulation. Therefore, we believe that our use of optogenetic stimulation did not introduce any artificial bias into the STP measurements.

As it is, 2 of 8 synapses got weaker instead of stronger, hinting at possible rundown, but this cannot be assessed because reversibility was not evaluated. In addition, comparing axons with and without channelrhodopsins might be problematic because the channelrhodopsins might widen action potentials.

We continuously monitored series resistance and baseline EPSC amplitude throughout the experiments. The figure below shows the mean time course of EPSCs at the two different [Ca<sup>2+</sup>]<sub>o</sub>. As it shows, we observed no tendency for run-down of EPSCs during the experiments; any recordings showing run-down were discarded from analysis. In addition, please understand that there is substantial variance in the number of docked vesicles at both baseline and high external Ca<sup>2+</sup> (Lin et al., 2025; Kusick et al., 2020), as well as in the short-term dynamics of EPSCs at our synapses.

      Author response image 2.

      Time course of normalized amplitudes of the first EPSCs during paired-pulse stimulation at 20 ms ISI in control and in the elevated external Ca<sup>2+</sup> (n = 8).<br />

      (6) Perhaps authors could double check with Schotten et al about whether PDBu does/does not decrease the latency between osmotic shock and transmitter release. This might be an interesting discrepancy, but my understanding is that Schotten et al didn't acquire information about latency because of how the experiments were designed.

Schotten et al. (2015) directly compared experimental and simulation data for hypertonicity-induced vesicle release. They showed a pronounced acceleration of the latency as the tonicity increases (Fig. 2-S2), but this tonicity-dependent acceleration was not reproduced by reducing the activation energy barrier for fusion (ΔEa) in their simulations (Fig. 2-S1). Thus, the authors suggested that an unknown compensatory mechanism counteracting the osmotic perturbation might be responsible for the tonicity-dependent changes in the latency. Importantly, their modeling demonstrated that reducing ΔEa, which would correspond to increasing p<sub>v</sub>, results in larger peak amplitudes and shorter times-to-peak but does not accelerate the latency. Therefore, there is currently no direct support for the notion that PDBu or similar manipulations shorten latency via an increase in p<sub>v</sub>.

      (7) The authors state: "These data are difficult to reconcile with a model in which facilitation is mediated by Ca2+-dependent increases in pv." However, I believe that discarding the premise that depression is always caused by depletion would open up wide range of viable possibilities.

We hope that the Reviewer understands the reasons why we concluded that the baseline p<sub>v</sub> is saturated at our synapses. First, the strong paired-pulse depression (PPD) cannot be attributed to Ca<sup>2+</sup> channel inactivation, because Ca<sup>2+</sup> influx at the axon terminal remained constant during 40 Hz train stimulation (Fig. 2-S2). Moreover, even if Ca<sup>2+</sup> channel inactivation were responsible for the strong PPD, this view cannot explain the delayed facilitation that emerges at subsequent pulses (third EPSC and beyond) during 40 Hz train stimulation (Fig. 1-4), because Ca<sup>2+</sup> channel inactivation gradually accumulates during train stimulation, as directly shown by Wykes et al. (2007) in chromaffin cells. Second, the strong PPD and the very fast recovery from PPD indicate a very fast refilling rate constant (k<sub>1</sub>). Given this high k<sub>1</sub>, the failure rates were best explained by a p<sub>v</sub> close to unity. Third, the increase in EPSC induced by high external Ca<sup>2+</sup> was much smaller than at other synapses, such as calyx synapses, at which p<sub>v</sub> is not saturated (Lin et al., 2025), and was instead similar to the increases in p<sub>occ</sub> estimated at calyx synapses or in the EM study (Kusick et al., 2020; Lin et al., 2025).

      Reference

Wykes et al. (2007). Differential regulation of endogenous N- and P/Q-type Ca<sup>2+</sup> channel inactivation by Ca<sup>2+</sup>/calmodulin impacts on their ability to support exocytosis in chromaffin cells. Journal of Neuroscience, 27(19), 5236-5248.

      Reviewer #3 (Recommendations for the authors):

      I continue to think that measuring changes in synaptic strength when raising extracellular Ca<sup>2+</sup> is a good experiment for evaluating the overfilling hypothesis. Future experiments would be better if the authors would include reversibility criteria to rule out rundown, etc. Also, comparisons to other types of synapses would be stronger if the same experimenter did the experiments at both types of synapses.

We observed no systematic tendency for run-down of EPSCs during these experiments (Author response image 2). Furthermore, the observed variability is well within the expected range of variance in the number of docked vesicles at both baseline and high external Ca²⁺ (Lin et al., 2025; Kusick et al., 2020) and reflects biological variability rather than experimental artifact. Therefore, we believe that additional reversibility experiments are not warranted. However, we are open to further discussion if the Reviewer has specific methodological concerns not resolved by our present data.

For the second issue, as mentioned above, we think that studying other synapse types should be done in a separate study.

    1. For example, we might divide the 50-acre property into 250 1/5-th acre non-overlapping plots

Which would be a very untypical strategy, right? We see such examples in old textbooks (like Shiver and Borders; my former boss Akca used it too). Such a subdivision of the area results in a finite population of cells: no other, overlapping cell in between would have a positive probability of being selected. If you want to avoid this, you can simply say that a sample of plots is selected from the total forest area.

    2. In some cases we’re able to observe (meaning measure) all units in the population

If the population we are looking at allows it, yes. We can, e.g., measure all trees if we are interested in tree characteristics. It becomes impossible if the population consists of plots (i.e., all possible sample plots in an area).

    1. dplyr

It is not at all a bad idea to introduce dplyr here. data.table would be an alternative, and is often more efficient on large datasets, but the code is not as readable.

    1. indicates the log rule

If these are standard, fine. Since this is just for the students to learn, you could also use simpler formulations like Smalian's.

    2. # Assign the species-specific regression coefficients.

For a simple example this is OK, but later you will want a left join here, reading the model coefficients for the different species from a separate, related table:

```r
# Separate table: model coefficients by species
# (species names and coefficient values are placeholders)
coef_df <- data.frame(
  species = c("species_1", "species_2", "species_3"),
  a = c(0.5, 0.7, 0.6),
  b = c(1.2, 1.5, 1.3)
)

# Join the coefficients onto the tree table
library(dplyr)
trees_with_coef <- trees_df %>%
  left_join(coef_df, by = "species")

# Base R equivalent:
# trees_with_coef <- merge(trees_df, coef_df, by = "species", all.x = TRUE)
```

But dplyr was not yet introduced at this point, and I understand that they should first do it by hand...

    3. AGB=exp(β0+β1ln(DBH)),

This is the log-transformed linear form that we often use for regression analysis (due to heteroskedasticity in the metric data). As you see, the coefficients here are βs and should be estimated by a linear regression. However, when it comes to applying an allometric model, we usually use the form AGB = a·DBH^b, where b = β1 and a = exp(β0). I would recommend substituting β0 and β1 by a and b and using that formulation: easier to digest for the students and more common in practice.
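The algebra behind this substitution is one line, which might be worth showing the students:

```latex
\mathrm{AGB} = \exp\!\big(\beta_0 + \beta_1 \ln(\mathrm{DBH})\big)
             = \exp(\beta_0)\,\mathrm{DBH}^{\beta_1}
             = a\,\mathrm{DBH}^{b},
\qquad a = \exp(\beta_0),\quad b = \beta_1 .
```

So the fitted log-linear regression and the power-law form are the same model, just written differently.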

4. basal area is c·DBH<sup>2</sup>, where c is often referred to as the “foresters constant” and, depending on your measurement system, is either 0.005454 or 0.00007854

I would not recommend introducing any "forester's constants"; it is just simple geometry if we want to calculate the area of a circle from its diameter. Students tend to learn such things by heart and forget the fundamentals... Here the factor 1/10,000 comes in because DBH is given in centimetres (conversion from cm² to m²), and the π/4 comes from writing the circle area in terms of the diameter, π·(DBH/2)². I prefer to explain to my students how to calculate the area of a circle, and remind them that we want it in square metres, instead of confusing them.
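Written out, the two "constants" are just the circle-area formula plus a unit conversion:

```latex
\mathrm{BA} = \pi\Big(\frac{\mathrm{DBH}}{2}\Big)^{2}\cdot\frac{1}{10\,000}
            = \frac{\pi}{40\,000}\,\mathrm{DBH}^{2}
            \approx 0.00007854\,\mathrm{DBH}^{2}
\qquad \text{(DBH in cm, BA in m}^2\text{)}
```

```latex
\mathrm{BA} = \pi\Big(\frac{\mathrm{DBH}}{2}\Big)^{2}\cdot\frac{1}{144}
            = \frac{\pi}{576}\,\mathrm{DBH}^{2}
            \approx 0.005454\,\mathrm{DBH}^{2}
\qquad \text{(DBH in inches, BA in ft}^2\text{)}
```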

    5. to some degree

You can have the same basal area with a single big tree or with hundreds of small trees; there is not necessarily a relation between BA and density.

    1. With the VerifiableCredentials

      Peter and Ana hold, their groups can scale more easily, and nobody has to call up the council to find out if Peter or Ana is in good standing.


      Trust is hard.

Knowing who to trust is harder. — Maria V. Snyder |Verifiable Community| tells me who is going to kick you’re a** if you do something wrong. (DAO Expert)

    2. chants about "trustless systems"

      Despite a decade of chants about "trustless systems", Web3 is constantly working on trust... how do we avoid rugs, thugs, drugs and mugs

Yes, stick it to the Web3 crowd, 80% of which is fatally misdirected. Invert the drive for autonomy to build blocks of chains for global enslavement

      We don't need no Decentralized ledgers that cannot possibly be decent

      we need Distributed Hash Tables

      yes we need systems built for trust from trust

1. if one focused on actual changes in spouse selection, the 484 marriages represent three stages of the trend toward free-choice matches, but these stages do not always correspond to changes in national policy

More reasons will be discussed later in the book to showcase the relationship between changes in spouse selection and the development of the society. Changes in national policy are not the sole explanation, and I wonder whether other elements could take priority over the influence of national policy.

2. as a middle-aged man lamented, "Times have changed. The younger generation has different ideas about everything, and some of them don't even care whether a bride is a virgin."

The author lets the middle-aged man (a representative of convention) voice the change in what people are thinking and the change in Chinese society.

3. In the early 1960s, wife beating was still regarded as nothing more than a common way by which a husband could make things right at home.

      I wonder if the 1950 Marriage Law and the other legal reforms addressed wife beating. It seems that patriarchal norms persisted despite all of these reforms.

4. A good friend told me that before his parents asked the matchmaker to make a proposal he had already developed a strong emotional bond with his future wife.

      Who were these matchmakers? Were they relatives of the groom or bride? Was it a common practice to have matchmakers in these villages in China?

5. Pressured by financial difficulties, however, this father was interested in the potential profits from his daughter's marriage, and this was obvious to everyone, including the two lovers.

The financial hardship of these families could explain why arranged marriages were so prevalent in this village. The profit motive is present in almost all of the author's stories of arranged marriages in this village.

    1. or a complete enumeration of all units (e.g., trees) in a population

Mhhh, here you may introduce some confusion for the reader. It is correct that in this specific experimental plot, and from this specific ecological perspective, the interest is in trees. And yes, if the interest is in tree characteristics, the total population is all the trees. But this is very different from forest inventories (observational studies), which aim at describing the forest area. There, the population is the forest area and the sampling units are subsets of this area (plots). Maybe substitute: "... is a full census or complete enumeration of all trees, which in this case represent the population"

2. The HTML version of Figure 1.5 shows the PEF LiDAR canopy height surface, forest inventory plot locations, and MU boundaries. Clicking on a plot shows the MU in which the plot resides, its identification number, and current basal area (ft<sup>2</sup>/ac). Clicking in a MU polygon (i.e., between plots) brings up a figure of the MU’s basal area change over time. The printed version of Figure 1.5 shows the PEF MU boundaries and plot locations colored by most current basal area (ft<sup>2</sup>/ac).

      Somehow difficult to consider printed and online at the same time. Some of it might be shifted to the caption.

    3. high spatial resolution wall-to-wall data products such as gridded maps

Well, this is what everybody says, but if we look at what is actually used for decision making in the end, I am not sure. Maps are great for communication, but calculations are done on tabular data. If we provide forest managers with high-resolution wall-to-wall maps, they usually ask for simpler products showing useful classes. My experience is that we often aggregate our high resolution back to something simpler. In the carbon business, emission factors are not provided at the pixel level but, e.g., for a certain forest type.

    4. as well as extensive campaigns to collect field-based calibration data

In the meantime, LiDAR data acquisition has become a standard in many countries. The wording "calibration data" reflects a purely model-based perspective (common in remote sensing), while we usually look at it as ancillary data.

    5. no significant change

Just a comment: here we look at a typical experimental study that allows hypothesis testing and significance (in contrast to the observational studies mentioned above, in which we can perhaps look at relationships but never at cause and effect). Such studies do not aim at describing a forest area, but are designed to research ecological or other relationships. Good to have such different examples!

    6. point

I stop commenting on "point" versus "plot" here, since you consistently use "point". In the end it is a question of personal style, and "point" is also fine as long as there is a mention that trees around such a point are included based on a certain rule (and I assume you come to plot designs later). Speaking of points may help to explain the infinite character of our population later.

    7. If you’re reading the HTML version of this book, mousing-over a point gives the point number and a single click gives a list of AGS trees used to calculate the basal area color reflected in the figure legend.

      Considering a later printed book publication you might want to shift this into the figure caption

    8. placement

or "selection of measurement units". The typical target variables cannot be observed at a dimensionless point (as you say); usually our sampling elements are small cut-outs of the forest area.

    9. are best fit lines

See, we are not looking at strict allometry here (which would be a process-model perspective) but looking for the best fit. In this case it is dangerous to extrapolate beyond the range of the data. Look at Betula lenta, where the best fit to the current data would suggest that biomass decreases with increasing DBH. In regard to model choice, a data analyst has both in mind: a good fit to the current data and biological plausibility.

    10. allometric equation

"Allometry", or allometric relations, usually refers to a specific kind of relation (the relation between two relative growth rates in one individual). We tend to call many models "allometric" that are in fact other kinds of relationships. Taking the character of allometric relations seriously would mean applying a power model, but in fact we often use others. Anyway, I would not change the text here because it is in line with the general understanding.

    11. to identify features for consideration in the subsequent wrangling and analysis

The target variables we are typically interested in are rarely directly "measurable" (volume, biomass, biodiversity, ...). It is another important task of the data analyst to identify and calculate the essential target variables, or the requested information, from the data (features) at hand. In most cases this requires the application of models.

    12. For example, field crews collect forest inventory data to answer questions about the amount and location of timber or non-timber resources. Monitoring data are collected to understand change in forest characteristics. Highly detailed individual tree data are collected to understand allometry, which is the growth and size relationship between different parts of an organism. Experimental manipulations of trees, stands, or forests are used to better understand how environmental change and disturbance events impact individual growth rates and trends in population demographics.

It may be helpful for the reader to distinguish between typical observational studies (forest inventories, which aim at estimating status and change) and experimental studies (which investigate treatment effects or research dynamics).

    1. present

Try doing a binary search in the natural way in the tree to recover the classical approximation \(22/7\) for \(\pi\). Here you actually need to descend to more than depth \(10\) in the tree!

      The famous approximation \(355/113\) is above depth \(25\).
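A minimal Python sketch of this binary search (the function name and the convention that the root \(1/1\) sits at depth 1 are my own choices): descend from the bounds \(0/1\) and \(1/0\), always stepping to the mediant of the current bounds.

```python
import math
from fractions import Fraction

def stern_brocot_mediants(x, max_depth):
    """Binary search for x in the Stern-Brocot tree; return the mediants visited."""
    ln, ld = 0, 1   # left bound  0/1
    rn, rd = 1, 0   # right bound 1/0 ("infinity")
    visited = []
    for _ in range(max_depth):
        mn, md = ln + rn, ld + rd          # mediant of the current bounds
        visited.append((mn, md))
        if x < Fraction(mn, md):
            rn, rd = mn, md                # x is smaller: descend left
        else:
            ln, ld = mn, md                # x is larger: descend right
    return visited

mediants = stern_brocot_mediants(Fraction(math.pi), 30)
print(mediants.index((22, 7)) + 1)     # depth of 22/7   -> 10
print(mediants.index((355, 113)) + 1)  # depth of 355/113 -> 26
```

With the root counted as depth 1, \(22/7\) turns up at depth 10 and \(355/113\) at depth 26, matching the depths quoted above up to the counting convention.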

    1. eLife Assessment

This study tested the specific hypothesis that age-related changes to hearing involve a partial loss of synaptic connections between sensory cells in the ear and the nerve fibers that carry information about sounds to the brain, and that this interferes with the ability to discriminate rapid temporal fluctuations in sounds. Physiological, behavioral, and histological analyses provide a powerful combination for testing this hypothesis in gerbils. Contrary to previous suggestions, chemically induced isolated synaptopathy (at levels similar to those observed in aged gerbils) did not result in worse performance on a behavioral task measuring sensitivity to temporal fine structure, nor did it degrade auditory-nerve fiber encoding of fine structure. Aged gerbils showed degraded behavior and stronger-than-normal envelope responses, but temporal fine-structure coding was not affected, which the authors interpret as suggesting that central processing contributes to aging effects on discrimination. These findings are important for advancing our knowledge of the mechanistic bases of age-related changes to hearing, and the evidence provided is solid, with the results largely supporting the claims made and minor limitations related to possible confounds discussed in reasonable depth.

    2. Reviewer #1 (Public review):

      Summary:

The authors investigate the effects of aging on auditory system performance in understanding temporal fine structure (TFS), using both behavioral assessments and physiological recordings from the auditory periphery, specifically at the level of the auditory nerve. This dual approach aims to enhance understanding of the mechanisms underlying observed behavioral outcomes. The results indicate that aged animals exhibit deficits in behavioral tasks for distinguishing between harmonic and inharmonic sounds, which is a standard test for TFS coding. However, neural responses at the auditory nerve level do not show significant differences when compared to those in young, normal-hearing animals. The authors suggest that these behavioral deficits in aged animals are likely attributable to dysfunctions in the central auditory system, potentially as a consequence of aging. To further investigate this hypothesis, the study includes an animal group with selective synaptic loss between inner hair cells and auditory nerve fibers, a condition known as cochlear synaptopathy (CS). CS is a pathology associated with aging and is thought to be an early indicator of hearing impairment. Interestingly, animals with selective CS showed physiological and behavioral TFS coding similar to that of the young normal-hearing group, contrasting with the aged group's deficits. Despite histological evidence of significant synaptic loss in the CS group, the study concludes that CS does not appear to affect TFS coding, either behaviorally or physiologically.

      Strengths:

This study addresses a critical health concern, enhancing our understanding of mechanisms underlying age-related difficulties in speech intelligibility, even when audiometric thresholds are within normal limits. A major strength of this work is the comprehensive approach, integrating behavioral assessments, auditory nerve (AN) physiology, and histology within the same animal subjects. This approach enhances understanding of the mechanisms underlying the behavioral outcomes and provides confidence in the actual occurrence of synapse loss and its effects. The study carefully manages controlled conditions by including five distinct groups: young normal-hearing animals, aged animals, animals with CS induced through low and high doses, and a sham surgery group. This careful setup strengthens the study's reliability and allows for meaningful comparisons across conditions. Overall, the manuscript is well-structured, with clear and accessible writing that facilitates comprehension of complex concepts.

      Weakness:

The stimulus and task employed in this study are very helpful for behavioral research, and using the same stimulus setup for physiology is advantageous for mechanistic comparisons. However, I have some concerns about the limitations of the auditory nerve (AN) physiology. Due to practical constraints, it is not feasible to record from a large enough population of fibers covering a full range of best frequencies (BFs) and spontaneous rates (SRs) within each animal. This raises questions about how representative the physiological data are for understanding the mechanism behind the behavioral data. I am curious about the authors' interpretation of how this stimulus setup might influence results compared to the methods of Kale and Heinz (2010), who adjusted harmonic frequencies based on the characteristic frequency (CF) of recorded units, whereas the harmonic frequencies in this study are fixed across all CFs, meaning that many AN fibers may not be tuned closely to the stimulus frequencies. If units are not responsive to the stimulus, further clarification on detecting mistuning and phase locking to TFS within this setup would be valuable. Given the limited number of units per condition (sometimes as few as three for certain conditions), I wonder whether CF-dependent variability might impact the AN data in this study; discussing this factor could help with interpreting the results. While the use of the same stimuli for both behavioral and physiological recordings is understandable, a discussion of how this choice affects interpretation would be beneficial. In addition, a 60 dB stimulus could saturate high-spontaneous-rate (HSR) AN fibers, influencing neural coding and phase-locking to TFS. Separating SR groups could potentially help address these issues and improve interpretive clarity.

      A deeper discussion on the role of fiber spontaneous rate could also enhance the study. How might considering SR groups affect AN results related to TFS coding? While some statistical measures are included in the supplement, a more detailed discussion in the main text could help in interpretation.

Although Figure S2 indicates no change in median SR, the high-dose treatment group lacks LSR fibers, suggesting a different SR distribution across animal groups, as seen in similar studies in other species. A histogram of these results would be informative, as LSR fiber loss with CS, whether induced by ouabain in gerbils or by noise in other animals, is well documented (e.g., Furman et al., 2013).

Although the effects of ouabain in gerbils have been explored in previous studies, since these data already appear to have been recorded for the animals in this study, a brief description of changes in auditory brainstem response (ABR) thresholds, wave 1 amplitudes, and tuning curves for animals with cochlear synaptopathy (CS) would be beneficial. This would confirm that ouabain selectively affects synapses without impacting outer hair cells (OHCs). For aged animals, since ABR measurements were taken, comparing hearing differences between the normal and aged groups could provide insight into pathologies other than CS in aged animals. Additionally, examining subject variability in treatment effects on hearing, and how this correlates with behavior and physiology, would yield valuable insights. If space is limited, a brief clarification or inclusion in the supplementary material could suffice.

Another suggestion is to discuss the potential role of the MOC efferent system and the effect of anesthesia in reducing efferent effects in AN recordings. This is particularly relevant for aged animals, as CS might affect LSR fibers, potentially disrupting the medial olivocochlear (MOC) efferent pathway. Anesthesia could lessen MOC activity in both young and aged animals, potentially masking efferent effects that might be present in behavioral tasks. Young gerbils with functional efferent systems might perform better behaviorally, while aged gerbils with impaired MOC function due to CS might lack this advantage. A brief discussion of this aspect could enhance the mechanistic insights.

      Lastly, although synapse counts did not differ between the low-dose treatment and NH I sham groups, separating these groups rather than combining them with the sham might reveal differences in behavior or AN results, particularly regarding the significance of differences between aged/treatment groups and the young normal-hearing group.

    3. Reviewer #2 (Public review):

      Summary:

      Using a gerbil model, the authors tested the hypothesis that loss of synapses between sensory hair cells and auditory nerve fibers (which may occur due to noise exposure or aging) affects behavioral discrimination of the rapid temporal fluctuations of sounds. In contrast to previous suggestions in the literature, their results do not support this hypothesis; young animals treated with a compound that reduces the number of synapses did not show impaired discrimination compared to controls. Additionally, their results from older animals showing impaired discrimination suggest that age-related changes aside from synaptopathy are responsible for the age-related decline in discrimination.

      Strengths:

      (1) The rationale and hypothesis are well-motivated and clearly presented.

      (2) The study was well conducted with strong methodology for the most part, and good experimental control. The combination of physiological and behavioral techniques is powerful and informative. Reducing synapse counts fairly directly using ouabain is a cleaner design than using noise exposure or age (as in other studies), since these latter modifiers have additional effects on auditory function.

      (3) The study may have a considerable impact on the field. The findings could have important implications for our understanding of cochlear synaptopathy, one of the most highly researched and potentially impactful developments in hearing science in the past fifteen years.

      Weaknesses:

      (1) I have concerns that the gerbils may not have been performing the behavioral task using temporal fine structure information.

      Human studies using the same task employed a filter center frequency that was (at least) 11 times the fundamental frequency (Marmel et al., 2015; Moore and Sek, 2009). Moore and Sek wrote: "the default (recommended) value of the centre frequency is 11F0." Here, the center frequency was only 4 or 8 times the fundamental frequency (4F0 or 8F0). Hence, relative to harmonic frequency, the harmonic spacing was considerably greater in the present study. However, gerbil auditory filters are thought to be broader than those in human. In the revised version of the manuscript, the authors provide modelling results suggesting that the excitation patterns were discriminable for the 4F0 conditions, but may not have been for the 8F0 conditions. These results provide some reassurance that the 8F0 discriminations were dependent on temporal cues, but the description of the model lacks detail. Also, the authors state that "thus, for these two conditions with harmonic number N of 8 the gerbils cannot rely on differences in the excitation patterns but must solve the task by comparing the temporal fine structure." This is too strong. Pulsed tone intensity difference limens (the reference used for establishing whether or not the excitation pattern cues were usable) may not be directly comparable to profile-analysis-like conditions, and it has been argued that frequency discrimination may be more sensitive to excitation pattern cues than predicted from a simple comparison to intensity difference limens (Micheyl et al. 2013, https://doi.org/10.1371/journal.pcbi.1003336).

      I'm also somewhat concerned that the masking noise used in the present study was too low in level to mask cochlear distortion products. Based on their excitation pattern modelling, the authors state (without citation) that "since the level of excitation produced by the pink noise is less than 30 dB below that produced by the complex tones, distortion products will be masked." The basis for this claim is not clear. In human, distortion products may be only ~20 dB below the levels of the primaries (referenced to an external sound masker / canceller, which is appropriate, assuming that the modelling reported in the present paper did not include middle-ear effects; see Norman-Haignere and McDermott, 2016, doi: 10.1016/j.neuroimage.2016.01.050). Oxenham et al. (2009, doi: 10.1121/1.3089220) provide further cautionary evidence on the potential use of distortion product cues when the background noise level is too low (in their case the relative level of the noise in the compromised condition was only a little below that used in the present study). The masking level used in the present study may have been sufficient, but it would be useful to have some further reassurance on this point.

      (2) The synapse reductions in the high ouabain and old groups were relatively small (mean of 19 synapses per hair cell compared to 23 in the young untreated group). In contrast, in some mouse models of the effects of noise exposure or age, a 50% reduction in synapses is observed, and in the human temporal bone study of Wu et al. (2021, https://doi.org/10.1523/JNEUROSCI.3238-20.2021) the age-related reduction in auditory nerve fibres was ~50% or greater for the highest age group across cochlear location. It could be simply that the synapse loss in the present study was too small to produce significant behavioral effects. Hence, although the authors provide evidence that in the gerbil model the age-related behavioral effects are not due to synaptopathy, this may not translate to other species (including human).

      (3) The study was not pre-registered, and there was no a priori power calculation, so there is less confidence in replicability than could have been the case. Only three old animals were used in the behavioral study, which raises concerns about the reliability of comparisons involving this group. Statistical analyses on very small samples can be unreliable due to problems of power, generalisability, and susceptibility to outliers.

    4. Reviewer #3 (Public review):

      This study is a part of the ongoing series of rigorous work from this group exploring neural coding deficits in the auditory nerve, and dissociating the effects of cochlear synaptopathy from other age-related deficits. They have previously shown no evidence of phase-locking deficits in the remaining auditory nerve fibers in quiet-aged gerbils. Here, they study the effects of aging on the perception and neural coding of temporal fine structure cues in the same Mongolian gerbil model.

      They measure TFS coding in the auditory nerve using the TFS1 task which uses a combination of harmonic and tone-shifted inharmonic tones which differ primarily in their TFS cues (and not the envelope). They then follow this up with a behavioral paradigm using the TFS1 task in these gerbils. They test young normal hearing gerbils, aged gerbils, and young gerbils with cochlear synaptopathy induced using the neurotoxin ouabain to mimic synapse losses seen with age.

      In the behavioral paradigm, they find that aging is associated with decreased performance compared to the young gerbils, whereas young gerbils with similar levels of synapse loss do not show these deficits. When looking at the auditory nerve responses, they find no differences in neural coding of TFS cues across any of the groups. However, aged gerbils show an increase in the representation of periodicity envelope cues (around f0) compared to young gerbils or those with induced synapse loss. The authors hence conclude that synapse loss by itself doesn't seem to be important for distinguishing TFS cues, and that the behavioral deficits with age likely have to do with the misrepresented envelope cues instead.

      The manuscript is well written, and the data presented are robust. Some of the points below will need to be considered while interpreting the results of the study, in its current form. These considerations are addressable if deemed necessary, with some additional analysis in future versions of the manuscript.

      Spontaneous rates - Figure S2 shows no differences in median spontaneous rates across groups. But taking the median glosses over some of the nuances there. Ouabain (in the Bourien study) famously affects low-spont-rate fibers first, and to a greater degree than medium- or high-spont-rate fibers. This seems to be the case (qualitatively) in Figure S2 as well, with almost no units in the low-spont region in the ouabain group, compared to the other groups. Looking at distributions within each spont rate category and comparing differences across the groups might reveal some of the underlying causes for these changes. Given that, overall, the study reports that low-SR fibers had a higher ENV/TFS log-z-ratio, the distribution of these fibers across groups may reveal specific effects on TFS coding by group.

      [Update: The revised manuscript has addressed these issues]

      Threshold shifts - It is unclear from the current version if the older gerbils have changes in hearing thresholds, and whether those changes may be affecting behavioral thresholds. The behavioral stimuli appear to have been presented at a fixed sound level for both young and aged gerbils, similar to the single unit recordings. Hence, age-related differences in behavior may have been due to changes in relative sensation level. Approaches such as using hearing thresholds as covariates in the analysis will help explore if older gerbils still show behavioral deficits.

      [Update: The issue of threshold shifts with aging gerbils is still unresolved in my opinion. From the revised manuscript, it appears that aged gerbils have a 36 dB shift in thresholds. While the revised manuscript provides convincing evidence that these threshold shifts do not affect the auditory nerve tuning properties, the behavioral paradigm was still presented at the same sound level for young and aged animals. But a potential 36 dB change in sensation level may affect behavioral results. The authors may consider adding thresholds as covariates in analyses or presenting any evidence that behavioral thresholds are plateaued along that 30 dB range].

      Task learning in aged gerbils - It is unclear if the aged gerbils really learned the task well in two of the three TFS1 test conditions. The d' of 1, which is usually used as the criterion for learning, was not reached by the aged gerbils in all but one condition (Fig. 5H), and in that condition there doesn't seem to be any age-related deficit in behavioral performance (Fig. 6B). Hence, dissociating the inability to learn the task from the inability to perceive TFS1 cues in those animals becomes challenging.

      [Update: The revised manuscript sufficiently addresses these issues, with the caveat of hearing threshold changes affecting behavioral thresholds mentioned above].

      Increased representation of periodicity envelope in the AN - the mechanisms for the increased representation of periodicity envelope cues are unclear. The authors point to some potential central mechanisms, but given that these are recordings from the auditory nerve, what central mechanisms these may be is unclear. If the authors are suggesting some form of efferent modulation only at the f0 frequency, no evidence for this is presented. It appears more likely that the enhancement may be due to outer hair cell dysfunction (widened tuning, distorted tonotopy). Given this increased envelope coding, the potential change in sensation level for the behavior (from the comment above), and no change in neural coding of TFS cues across any of the groups, a simpler interpretation may be: TFS coding is not affected in remaining auditory nerve fibers after age-related or ouabain-induced synapse loss, but behavioral performance is affected by outer hair cell dysfunction with age.

      [Update: The revised manuscript has addressed these issues]

      Emerging evidence seems to suggest that cochlear synaptopathy and/or TFS encoding abilities might be reflected in listening effort rather than behavioral performance. Measuring some proxy of listening effort in these gerbils (like reaction time) to see if that has changed with synapse loss, especially in the young animals with induced synaptopathy, would make an interesting addition to explore perceptual deficits of TFS coding with synapse loss.

      [Update: The revised manuscript has addressed these issues]

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1(Public review)  

      Summary:  

      The authors investigate the effects of aging on auditory system performance in understanding temporal fine structure (TFS), using both behavioral assessments and physiological recordings from the auditory periphery, specifically at the level of the auditory nerve. This dual approach aims to enhance understanding of the mechanisms underlying observed behavioral outcomes. The results indicate that aged animals exhibit deficits in behavioral tasks for distinguishing between harmonic and inharmonic sounds, which is a standard test for TFS coding. However, neural responses at the auditory nerve level do not show significant differences when compared to those in young, normal-hearing animals. The authors suggest that these behavioral deficits in aged animals are likely attributable to dysfunctions in the central auditory system, potentially as a consequence of aging. To further investigate this hypothesis, the study includes an animal group with selective synaptic loss between inner hair cells and auditory nerve fibers, a condition known as cochlear synaptopathy (CS). CS is a pathology associated with aging and is thought to be an early indicator of hearing impairment. Interestingly, animals with selective CS showed physiological and behavioral TFS coding similar to that of the young normal-hearing group, contrasting with the aged group's deficits. Despite histological evidence of significant synaptic loss in the CS group, the study concludes that CS does not appear to affect TFS coding, either behaviorally or physiologically.

      We agree with the reviewer’s summary.

      Strengths:  

      This study addresses a critical health concern, enhancing our understanding of mechanisms underlying age-related difficulties in speech intelligibility, even when audiometric thresholds are within normal limits. A major strength of this work is the comprehensive approach, integrating behavioral assessments, auditory nerve (AN) physiology, and histology within the same animal subjects. This approach enhances understanding of the mechanisms underlying the behavioral outcomes and provides confidence in the actual occurrence of synapse loss and its effects. The study carefully manages controlled conditions by including five distinct groups: young normal-hearing animals, aged animals, animals with CS induced through low and high doses, and a sham surgery group. This careful setup strengthens the study's reliability and allows for meaningful comparisons across conditions. Overall, the manuscript is well-structured, with clear and accessible writing that facilitates comprehension of complex concepts.

      Weaknesses:

      The stimulus and task employed in this study are very helpful for behavioral research, and using the same stimulus setup for physiology is advantageous for mechanistic comparisons. However, I have some concerns about the limitations in auditory nerve (AN) physiology. Due to practical constraints, it is not feasible to record from a large enough population of fibers that covers a full range of best frequencies (BFs) and spontaneous rates (SRs) within each animal. This raises questions about how representative the physiological data are for understanding the mechanism underlying the behavioral data. I am curious about the authors' interpretation of how this stimulus setup might influence results compared to methods used by Kale and Heinz (2010), who adjusted harmonic frequencies based on the characteristic frequency (CF) of recorded units. Since the harmonic frequencies in this study are fixed across all CFs, many AN fibers may not be tuned closely to the stimulus frequencies, adding sampling variability to the results. If units are not responsive to the stimulus, further clarification on detecting mistuning and phase locking to TFS effects within this setup would be valuable.

      We chose the stimuli for the AN recordings to be identical to the stimuli used in the behavioral evaluation of the perceptual sensitivity. Only with this approach can we directly compare the response of the population of AN fibers with perception measured in behavior.

      The stimuli are complex, i.e., comprise many frequency components, and were presented at 68 dB SPL. Thus, the stimuli excite a given fiber within a large portion of the fiber’s receptive field. Furthermore, during recordings, we assured ourselves that fibers responded to the stimuli by audiovisual control. Otherwise, it would have cost valuable recording time to record from a nonresponsive AN fiber.

      Given the limited number of units per condition (sometimes as few as three for certain conditions), I wonder if CF-dependent variability might impact the results of the AN data in this study, and discussing this factor could help with better understanding the results. While the use of the same stimuli for both behavioral and physiological recordings is understandable, a discussion on how this choice affects interpretation would be beneficial. In addition, a 60 dB stimulus could saturate high spontaneous rate (HSR) AN fibers, influencing neural coding and phase-locking to TFS. Potentially, separating SR groups could help address these issues and improve interpretive clarity.

      A deeper discussion on the role of fiber spontaneous rate could also enhance the study. How might considering SR groups affect AN results related to TFS coding? While some statistical measures are included in the supplement, a more detailed discussion in the main text could help in interpretation.

      We do not think that it will be necessary to conduct any statistical analysis in addition to that already reported in the supplement.

      We considered moving some supplementary information back into the main manuscript but decided against it. Our single-unit sample was not sufficient, i.e. not all subpopulations of auditory-nerve fibers were sufficiently sampled for all animal treatment groups, to conclusively resolve every aspect that may be interesting to explore. The power of our approach lies in the direct linkage of several levels of investigation – cochlear synaptic morphology, single-unit representation and behavioral performance – and, in the main manuscript, we focus on the core question of synaptopathy and its relation to temporal fine structure perception. This is now spelled out clearly in lines 197 - 203 of the main manuscript.  

      Although Figure S2 indicates no change in median SR, the high-dose treatment group lacks LSR fibers, suggesting a different distribution of fibers across SR categories for the different animal groups, as seen in similar studies on other species. A histogram of these results would be informative, as LSR fiber loss with CS, whether induced by ouabain in gerbils or noise in other animals, is well documented (e.g., Furman et al., 2013).

      Figure S2 was revised to avoid overlap of data points and show the distributions more clearly. Furthermore, the sample sizes for LSR and HSR fibers are now provided separately.

      Although ouabain effects on gerbils have been explored in previous studies, since these data already seem to have been recorded for the animals in this study, a brief description of changes in auditory brainstem response (ABR) thresholds, wave 1 amplitudes, and tuning curves for animals with cochlear synaptopathy (CS) in this study would be beneficial. This would confirm that ouabain selectively affects synapses without impacting outer hair cells (OHCs). For aged animals, since ABR measurements were taken, comparing hearing differences between normal and aged groups could provide insights into the pathologies besides CS in aged animals. Additionally, examining subject variability in treatment effects on hearing and how this correlates with behavior and physiology would yield valuable insights. If space is limited, a brief clarification or inclusion in the supplementary material could suffice.

      We thank the reviewer for this constructive suggestion. The requested data were added in a new section of the Results, entitled “Threshold sensitivity and frequency tuning were not affected by the synapse loss.” (lines 150 – 174). Our young-adult, ouabain-treated gerbils showed no significant elevations of CAP thresholds and their neural tuning was normal. Old gerbils showed the typical threshold losses for individuals of comparable age, and normal neural tuning, confirming previous reports. Thus, there was no evidence for relevant OHC impairments in any of our animal groups.   

      Another suggestion is to discuss the potential role of the medial olivocochlear (MOC) efferent system and the effect of anesthesia in reducing efferent effects in AN recordings. This is particularly relevant for aged animals, as CS might affect LSR fibers, potentially disrupting the MOC efferent pathway. Anesthesia could lessen MOC activity in both young and aged animals, potentially masking efferent effects that might be present in behavioral tasks. Young gerbils with functional efferent systems might perform better behaviorally, while aged gerbils with impaired MOC function due to CS might lack this advantage. A brief discussion on this aspect could potentially enhance mechanistic insights.

      Thank you for this suggestion. The potential role of olivocochlear efferents is now discussed in lines 597 - 613.

      Lastly, although synapse counts did not differ between the low-dose treatment and NH I sham groups, separating these groups rather than combining them with the sham might reveal differences in behavior or AN results, particularly regarding the significance of differences between aged/treatment groups and the young normal-hearing group.  

      For maximizing statistical power, we combined those groups in the statistical analysis. These two groups did not differ in synapse number, threshold sensitivity or neural tuning bandwidths.

      Reviewer #2 (Public review):

      Summary:  

      Using a gerbil model, the authors tested the hypothesis that loss of synapses between sensory hair cells and auditory nerve fibers (which may occur due to noise exposure or aging) affects behavioral discrimination of the rapid temporal fluctuations of sounds. In contrast to previous suggestions in the literature, their results do not support this hypothesis; young animals treated with a compound that reduces the number of synapses did not show impaired discrimination compared to controls. Additionally, their results from older animals showing impaired discrimination suggest that age-related changes aside from synaptopathy are responsible for the age-related decline in discrimination.

      We agree with the reviewer’s summary.

      Strengths: 

      (1) The rationale and hypothesis are well-motivated and clearly presented. 

      (2) The study was well conducted with strong methodology for the most part, and good experimental control. The combination of physiological and behavioral techniques is powerful and informative. Reducing synapse counts fairly directly using ouabain is a cleaner design than using noise exposure or age (as in other studies), since these latter modifiers have additional effects on auditory function. 

      (3) The study may have a considerable impact on the field. The findings could have important implications for our understanding of cochlear synaptopathy, one of the most highly researched and potentially impactful developments in hearing science in the past fifteen years.  

      Weaknesses: 

      (1) My main concern is that the stimuli may not have been appropriate for assessing neural temporal coding behaviorally. Human studies using the same task employed a filter center frequency that was (at least) 11 times the fundamental frequency (Marmel et al., 2015; Moore and Sek, 2009). Moore and Sek wrote: "the default (recommended) value of the centre frequency is 11F0." Here, the center frequency was only 4 or 8 times the fundamental frequency (4F0 or 8F0). Hence, relative to harmonic frequency, the harmonic spacing was considerably greater in the present study. By my calculations, the masking noise used in the present study was also considerably lower in level relative to the harmonic complex than that used in the human studies. These factors may have allowed the animals to perform the task using cues based on the pattern of activity across the neural array (excitation pattern cues), rather than cues related to temporal neural coding. The authors show that mean neural driven rate did not change with frequency shift, but I don't understand the relevance of this. It is the change in response of individual fibers with characteristic frequencies near the lowest audible harmonic that is important here.  

      The auditory filter bandwidth of the gerbil is about double that of human subjects. Because of this, the masking noise has a larger overall level within the filter than in the human studies, prohibiting the use of distortion products. The larger auditory filter bandwidth also precludes the gerbils' use of excitation patterns, especially in the condition with a center frequency of 1600 Hz and a fundamental of 200 Hz and in the condition with a center frequency of 3200 Hz and a fundamental of 400 Hz. In the condition with a center frequency of 1600 Hz and a fundamental of 400 Hz, it is possible that excitation patterns are exploited. We have now added modeling of the excitation patterns, and a new figure showing their change at the gerbils’ perception threshold, in the discussion of the revised version (lines 440 - 446 and Fig. 8).

      The case against excitation pattern cues needs to be better made in the Discussion. It could be that gerbil frequency selectivity is broad enough for this not to be an issue, but more detail needs to be provided to make this argument. The authors should consider what is the lowest audible harmonic in each case for their stimuli, given the level of each harmonic and the level of the pink noise. Even for the 8F0 center frequency, the lowest audible harmonic may be as low as the 4th (possibly even the 3rd). In human, harmonics are thought to be resolvable by the cochlea up to at least the 8th.  

      This issue is now covered in the discussion, see response to the previous point.

      (2) The synapse reductions in the high ouabain and old groups were relatively small (mean of 19 synapses per hair cell compared to 23 in the young untreated group). In contrast, in some mouse models of the effects of noise exposure or age, a 50% reduction in synapses is observed, and in the human temporal bone study of Wu et al. (2021, https://doi.org/10.1523/JNEUROSCI.3238-20.2021) the age-related reduction in auditory nerve fibres was ~50% or greater for the highest age group across cochlear location. It could be simply that the synapse loss in the present study was too small to produce significant behavioral effects. Hence, although the authors provide evidence that in the gerbil model the age-related behavioral effects are not due to synaptopathy, this may not translate to other species (including human). This should be discussed in the manuscript. 

      We agree that our results apply to moderate synaptopathy, which predominantly characterizes early stages of hearing loss or aged individuals without confounding noise-induced cochlear damage. This is now discussed in lines 486 – 498.

      It would be informative to provide synapse counts separately for the animals who were tested behaviorally, to confirm that the pattern of loss across the group was the same as for the larger sample.  

      Yes, the pattern was the same for the subgroup of behaviorally tested animals. We have added this information to the revised version of the manuscript (lines 137 – 141).

      (3) The study was not pre-registered, and there was no a priori power calculation, so there is less confidence in replicability than could have been the case. Only three old animals were used in the behavioral study, which raises concerns about the reliability of comparisons involving this group.  

      The results for the three old subjects differed significantly from those of young subjects and young ouabain-treated subjects. This indicates sufficient statistical power, since otherwise no significant differences would have been observed.

      Reviewer #3 (Public review):

      This study is a part of the ongoing series of rigorous work from this group exploring neural coding deficits in the auditory nerve, and dissociating the effects of cochlear synaptopathy from other age-related deficits. They have previously shown no evidence of phase-locking deficits in the remaining auditory nerve fibers in quiet-aged gerbils. Here, they study the effects of aging on the perception and neural coding of temporal fine structure cues in the same Mongolian gerbil model.

      They measure TFS coding in the auditory nerve using the TFS1 task which uses a combination of harmonic and tone-shifted inharmonic tones which differ primarily in their TFS cues (and not the envelope). They then follow this up with a behavioral paradigm using the TFS1 task in these gerbils. They test young normal hearing gerbils, aged gerbils, and young gerbils with cochlear synaptopathy induced using the neurotoxin ouabain to mimic synapse losses seen with age. 

      In the behavioral paradigm, they find that aging is associated with decreased performance compared to the young gerbils, whereas young gerbils with similar levels of synapse loss do not show these deficits. When looking at the auditory nerve responses, they find no differences in neural coding of TFS cues across any of the groups. However, aged gerbils show an increase in the representation of periodicity envelope cues (around f0) compared to young gerbils or those with induced synapse loss. The authors hence conclude that synapse loss by itself doesn't seem to be important for distinguishing TFS cues, and that the behavioral deficits with age likely have to do with the misrepresented envelope cues instead.

      We agree with the reviewer’s summary.

      The manuscript is well written, and the data presented are robust. Some of the points below will need to be considered while interpreting the results of the study, in its current form. These considerations are addressable if deemed necessary, with some additional analysis in future versions of the manuscript. 

      Spontaneous rates - Figure S2 shows no differences in median spontaneous rates across groups. But taking the median glosses over some of the nuances there. Ouabain (in the Bourien study) famously affects low-spont-rate fibers first, and to a greater degree than medium- or high-spont-rate fibers. This seems to be the case (qualitatively) in Figure S2 as well, with almost no units in the low-spont region in the ouabain group, compared to the other groups. Looking at distributions within each spont rate category and comparing differences across the groups might reveal some of the underlying causes for these changes. Given that, overall, the study reports that low-SR fibers had a higher ENV/TFS log-z-ratio, the distribution of these fibers across groups may reveal specific effects on TFS coding by group.

      As the reviewer points out, our sample from the group treated with a high concentration of ouabain showed very few low-spontaneous-rate auditory-nerve fibers, as expected from previous work. However, this was also true, e.g., for our sample from sham-operated animals, and may thus well reflect a sampling bias. We are therefore reluctant to attach much significance to these data distributions. We now point out more clearly the limitations of our auditory-nerve sample for the exploration of interesting questions beyond our core research aim (see also response to Reviewer 1 above).  

      Threshold shifts - It is unclear from the current version if the older gerbils have changes in hearing thresholds, and whether those changes may be affecting behavioral thresholds. The behavioral stimuli appear to have been presented at a fixed sound level for both young and aged gerbils, similar to the single unit recordings. Hence, age-related differences in behavior may have been due to changes in relative sensation level. Approaches such as using hearing thresholds as covariates in the analysis will help explore if older gerbils still show behavioral deficits.  

      Unfortunately, we did not obtain behavioral thresholds that could be used here. We want to point out that the TFS1 stimuli had an overall level of 68 dB SPL, and the pink noise masker would have increased the threshold more than expected from the moderate, age-related hearing loss in quiet. Thus, the masked thresholds for all gerbil groups are likely similar and should have no effect on the behavioral results.

      Task learning in aged gerbils - It is unclear if the aged gerbils really learned the task well in two of the three TFS1 test conditions. The d' of 1, which is usually used as the criterion for learning, was not reached by the aged gerbils in all but one condition (Fig. 5H), and in that one condition there do not seem to be any age-related deficits in behavioral performance (Fig. 6B). Hence, dissociating the inability to learn the task from the inability to perceive TFS1 cues in those animals becomes challenging.  

      Even in the group of gerbils with the lowest sensitivity, the animals achieved an average d’ above 1 for the 400/1600 condition. Furthermore, stimuli were well above threshold and audible even when no discrimination could be observed. Finally, as explained in the Methods, different stimulus conditions were interleaved in each session, presenting stimuli that were easy to discriminate together with those that were difficult to discriminate. This approach ensures that the gerbils were under stimulus control, meaning properly trained to perform the task. Thus, an inability to discriminate does not indicate a lack of proper training.  
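      For readers less familiar with the metric: in a go/no-go paradigm, d' is computed from the hit and false-alarm rates as the difference of their z-transforms. A quick illustrative computation (the rates below are made-up numbers, not data from the study):

```python
from scipy.stats import norm

def d_prime(hit_rate, fa_rate):
    # Signal-detection sensitivity index: d' = z(hits) - z(false alarms)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Hypothetical rates: ~69% hits against ~31% false alarms land almost
# exactly at the d' = 1 learning criterion discussed above.
print(round(d_prime(0.69, 0.31), 2))  # ≈ 0.99
```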

      Increased representation of periodicity envelope in the AN - the mechanisms for the increased representation of periodicity envelope cues are unclear. The authors point to some potential central mechanisms, but given that these are recordings from the auditory nerve, it is unclear what central mechanisms these may be. If the authors are suggesting some form of efferent modulation only at the f0 frequency, no evidence for this is presented. It appears more likely that the enhancement is due to outer hair cell dysfunction (widened tuning, distorted tonotopy). Given this increased envelope coding, the potential change in sensation level for the behavior (from the comment above), and no change in neural coding of TFS cues across any of the groups, a simpler interpretation may be: TFS coding is not affected in remaining auditory nerve fibers after age-related or ouabain-induced synapse loss, but behavioral performance is affected by outer hair cell dysfunction with age. 

      A similar point was made by Reviewer #1. As indicated above, new data on threshold sensitivity and neural tuning were added in a new section of the Results, which indirectly suggest that significant OHC pathologies were not a concern in either our young-adult, synaptopathic gerbils or the old gerbils.  

      Emerging evidence seems to suggest that cochlear synaptopathy and/or TFS encoding abilities might be reflected in listening effort rather than behavioral performance. Measuring some proxy of listening effort in these gerbils (like reaction time) to see if that has changed with synapse loss, especially in the young animals with induced synaptopathy, would make an interesting addition to explore perceptual deficits of TFS coding with synapse loss.  

      This is an interesting suggestion that we now explore in the revision of the manuscript. Reaction times can be used as a proxy for listening effort and were recorded for all responses. The new analysis, now reported in lines 378 - 396, compared young-adult control gerbils with young-adult gerbils that had been treated with the high concentration of ouabain. No differences in response latencies were found, indicating that listening effort did not change with synapse loss.  

      Reviewer #1 (Recommendations for the authors): 

      Figure 2: The y-axis labeled as "Frequency" is potentially misleading since there are additional frequency values on the right side of the panels. It would be helpful to clarify more in the caption what these right-side frequency values represent. Additionally, the legend could be positioned more effectively for clarity.

      Thank you for your suggestion. The axis label was rephrased.

      Figure 7: This figure is a bit unclear, as it appears to show two sets of gerbil data at 1500 Hz, yet the difference between them is not explained.  

      We added the following text to the figure legend: “The higher and lower thresholds shown for the gerbil data reflect thresholds at fc of 1600 Hz for fundamentals f0 of 200 Hz and 400 Hz, respectively.”

      Maybe a short description of fmax that is used in Figure 4 could help or at least point to supplementary for finding the definition.  

      We thank the reviewer for pointing out this typo/inaccuracy. The correct terminology in line with the remainder of the manuscript is “fmaxpeak”. We corrected the caption of figure 5 (previously figure 4) and added the reference pointing to figure 11 (previously figure 9), which explains the terms.

      I couldn't find information about the possible availability of data. 

      The auditory-nerve recordings reported in this paper are part of a larger study of single-unit auditory-nerve responses in gerbils, formally described and published by Heeringa (2024) Single-unit data for sensory neuroscience: Responses from the auditory nerve of young-adult and aging gerbils. Scientific Data 11:411, https://doi.org/10.1038/s41597-024-03259-3. As soon as the Version of Record is submitted, the raw single-unit data can be accessed directly through the following link: https://doi.org/10.5061/dryad.qv9s4mwn4. The data that are presented in the figures of the present manuscript and were statistically analyzed are uploaded to the Zenodo repository (https://doi.org/10.5281/zenodo.15546625).  

      Reviewer #2 (Recommendations for the authors): 

      L22. The term "hidden hearing loss" is used in many different ways in the literature, from being synonymous with cochlear synaptopathy, to being a description of any listening difficulties that are not accounted for by the audiogram (for which there are many other / older terms). The original usage was much more narrow than your definition here. It is not correct that Schaette and McAlpine defined HHL in the broad sense, as you imply. I suggest you avoid the term to prevent further confusion.  

      We eliminated the term hidden hearing loss.

      L43. SNHL is undefined.

      Thank you for catching that. The term is now spelled out.

      L64. "whether" -> "that"  

      We corrected this issue.

      L102. It would be informative to see the synapse counts (across groups) for the animals tested in the behavioral part of the study. Did these vary between groups in the same way?  

      Yes, the pattern was the same for the subgroup of behaviorally tested animals. We have added this information to the revised version of the manuscript (lines 137 – 141).

      L108. How many tests were considered in the Bonferroni correction? Did this cover all reported tests in the paper?  

      The comparisons of synapse numbers between treatment groups were done with full Bonferroni correction, as in the other tests involving post hoc pairwise comparisons after an ANOVA.

      Figure 1 and 6 captions. Explain meaning of * and ** (criteria values).  

      The information was added to the figure legends of now Figs. 1 and 7. 

      L139. I don't follow the argument - the mean driven rate is not important. It is the rate at individual CFs and how that changes with frequency shift that provides the cue.

      L142. I don't follow - individual driven rates might have been a cue (some going up, some down, as frequency was shifted).  

      Yes, theoretically it is possible that the spectral pattern of driven rates (i.e., the excitation pattern) can be specifically used for profile analysis and subsequently as a strong cue for discriminating the TFS1 stimuli. In order to shed some light on this question with regard to the actual stimuli used in this study, we added a comprehensive figure showing simulated excitation patterns (figure 8). The excitation patterns were generated with a gammatone filter bank and auditory filter bandwidths appropriate for gerbils (Kittel et al. 2002). The simulated excitation patterns allow us to draw at least semi-quantitative conclusions about the possibility of profile analysis: (1) In the 200/1600 Hz and 400/3200 Hz conditions (i.e., harmonic number of fc is 8), the difference between all inharmonic excitation patterns and the harmonic reference excitation pattern is far below the threshold for intensity discrimination (Sinnott et al. 1992). (2) In the same conditions, the statistics of the pink noise make excitation-pattern differences at or beyond the filter slopes (at both the high- and low-frequency limits) useless for frequency-shift discrimination. (3) In the 400/1600 Hz condition (i.e., harmonic number of fc is 4), there is a non-negligible possibility that excitation-pattern differences were a main cue for discrimination. All of these conclusions are compatible with the results of our study.
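      To make the modelling approach concrete, an excitation pattern of this kind can be approximated by passing the stimulus component powers through a bank of gammatone filter magnitude responses. The sketch below is a minimal illustration only: it uses the human ERB formula (Glasberg & Moore) and arbitrary component levels as placeholders, whereas the manuscript used gerbil auditory-filter bandwidths from Kittel et al. (2002) and the actual TFS1 stimulus levels.

```python
import numpy as np

def erb(fc):
    # Human ERB (Glasberg & Moore 1990), used here only as a placeholder;
    # the manuscript used gerbil filter bandwidths from Kittel et al. (2002).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def excitation_pattern(component_freqs, component_levels_db, channel_cfs):
    """Sum stimulus component powers through 4th-order gammatone magnitude responses."""
    powers = 10.0 ** (np.asarray(component_levels_db) / 10.0)
    b = 1.019 * erb(channel_cfs)  # gammatone bandwidth parameter per channel
    total = np.zeros_like(channel_cfs)
    for f, p in zip(component_freqs, powers):
        total += p * (1.0 + ((f - channel_cfs) / b) ** 2) ** (-4.0)  # |H(f)|^2
    return 10.0 * np.log10(total)  # excitation level in dB

# Harmonic complex (f0 = 400 Hz, partials 2-8) vs. the same partials shifted
# by 100 Hz, roughly mimicking a TFS1 pair around fc = 1600 Hz:
cfs = np.linspace(800.0, 3200.0, 200)
harmonic = [400.0 * n for n in range(2, 9)]
shifted = [f + 100.0 for f in harmonic]
ep_h = excitation_pattern(harmonic, [60.0] * len(harmonic), cfs)
ep_s = excitation_pattern(shifted, [60.0] * len(shifted), cfs)
print(float(np.max(np.abs(ep_h - ep_s))))  # peak pattern difference in dB
```

The same function can then be applied to conditions with different harmonic ranks of fc, to compare how the pattern difference relates to resolvability.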

      L193. Is this p-value Bonferroni corrected across the whole study? If not, the finding could well be spurious given the number of tests reported.  

      Yes, it is Bonferroni corrected.

      L330. TFS is already defined.  

      L346. AN is already defined.  

      L408. "temporal fine structure" -> "TFS"  

      It was a deliberate decision to define these terms again in the Discussion, for readers who prefer to skip most of the detailed Results. 

      L364-366. This argument is somewhat misleading. Cochlear resolvability largely depends on the harmonic spacing (i.e., F0) relative to harmonic frequency (in other words, on harmonic rank). Marmel et al. (2015) and Moore and Sek (2009) used a center frequency (at least) 11 times F0. Here, the center frequency was only 4 or 8 times F0. In human, this would not be sufficient to eliminate excitation pattern cues.  

      We have now included results from modeling the excitation patterns in the discussion, with a new figure demonstrating that at a center frequency of 8 times F0, excitation patterns provide no useful cue, while this remains a possibility at a center frequency of 4 times F0 (Fig. 8, lines 440 - 446).

      L541. Was that a spectrum level of 20 dB SPL (level per 1-Hz wide band) at 1 kHz? Need to clarify.  

      The power spectral density of the pink noise at 1 kHz (i.e., the level in a 1 Hz wide band centered at 1 kHz) was 13.3 dB SPL. The total level of the pink noise (including edge filters at 100 Hz and 11 kHz) was 50 dB SPL.
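      For the record, these two figures are mutually consistent: integrating a 1/f spectrum with a density of 13.3 dB SPL/Hz at 1 kHz from 100 Hz to 11 kHz gives an overall level of about 50 dB SPL (ignoring the finite edge-filter slopes). A quick check:

```python
import math

psd_1k_db = 13.3             # stated PSD at 1 kHz, dB SPL per 1-Hz band
f_lo, f_hi = 100.0, 11000.0  # edge-filter corner frequencies, Hz

# Pink noise: PSD(f) = PSD(1 kHz) * (1000 / f), so integrating over the band
# gives a total power of PSD(1 kHz) * 1000 * ln(f_hi / f_lo) in linear units.
total_level_db = psd_1k_db + 10.0 * math.log10(1000.0 * math.log(f_hi / f_lo))

print(round(total_level_db, 1))  # ≈ 50.0 dB SPL, matching the stated total level
```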

      L919. So was the correction applied across only the tests within each ANOVA? Don't you need to control the study-wise error rate (across all primary tests) to avoid spurious findings?  

      We added information about the family-wise error rate (line 1077 - 1078). Since the ANOVAs tested different specific research questions, we do not think that we need to control the study-wise error rate.

      Reviewer #3 (Recommendations for the authors): 

      There was no difference in TFS sensitivity in the AN fiber activity across all the groups. Potential deficits with age were only found in the behavioral paradigm. Given that, it might make it clearer to specify that the deficits or lack thereof are in behavior, in multiple instances in the manuscript where it says synaptopathy showed no decline in TFS sensitivity (for example, lines 342-344).  

      We carefully went through the entire text and clarified a couple more instances.

      L353 - this statement is a bit too strong. It implies causality when there is only a co-occurrence of increased f0 representation and age-related behavioral deficits in TFS1 task.  

      The statement was rephrased as “Thus, cue representation may be associated with the perceptual deficits, but not reduced synapse numbers, as originally proposed.”

      L465-467 - while this may be true, I think it is hard to say this with the current dataset where only AN fibers are being recorded from. I don't think we can say anything about afferent central mechanisms with this data set.  

      We agree. However, we refer here to published data on central inhibition to provide a possible explanation. 

      Hearing thresholds with ABRs are mentioned in the methods, but that data is not presented anywhere. Would be nice to see hearing thresholds across the various groups to account or discount outer hair cell dysfunction. 

      This important point was made repeatedly and we thank the Reviewers for it. As indicated above, new data on threshold sensitivity and neural tuning were added in a new section of the Results which indirectly suggest that significant OHC pathologies were not a concern, neither in our young-adult, synaptopathic gerbils nor in the old gerbils.

    1. eLife Assessment

      This valuable study introduces a non-perturbative pulse-labeling strategy for yeast nuclear pore complexes (NPCs), employing a nanobody-based approach in order to selectively capture Nup84-containing complexes for imaging and biochemical analysis. The data convincingly demonstrate that a short induction period (20 minutes to 1 hour) yields a strong and sustained signal, enabling affinity purification that faithfully recapitulates the endogenous Nup84 interactome. This tool offers a powerful framework for investigating NPC dynamics and associated interactomes through both imaging and biochemical assays.

    2. Reviewer #1 (Public review):

      Summary:

      The authors present a nanobody-based pulse-labeling system to track yeast NPCs. Transient expression of a nanobody targeting Nup84 (fused to NeonGreen or an affinity tag) permits selective visualization and biochemical capture of NPCs. Short induction effectively labels NPCs, and the resulting purifications match those from conventional Nup84 tagging. Crucially, when induction is repressed, dilution of the labeled pool through successive cell cycles allows the visualization of "old" NPCs (and potentially individual NPCs), providing a powerful view of NPC lifespan and turnover without permanently modifying a core scaffold protein.

      Strengths:

      (1) A brief expression pulse labels NPCs, and subsequent repression allows dilution-based tracking of older (and possibly single) NPCs over multiple cell cycles.

      (2) The affinity-purified complexes closely match known Nup84-associated proteins, indicating specificity and supporting utility for proteomics.

      Weaknesses:

      (1) Reliance on GAL induction introduces metabolic shifts (raffinose → galactose → glucose) that could subtly alter cell physiology or the kinetics of NPC assembly. Alternative induction systems (e.g., β-estradiol-responsive GAL4-ER-VP16) could be discussed as a way to avoid carbon-source changes.

      (2) While proteomics is solid, a comprehensive supplementary table listing all identified proteins (with enrichment and statistics) would enhance transparency.

      (3) Importantly, the authors note that the method is particularly useful "in conditions where direct tagging of Nup84 interferes with its function, while sub-stoichiometric nanobody binding does not." After this sentence, it would be valuable to add concrete examples, such as experiments examining NPC integrity in aging or stress conditions where epitope tags can exacerbate phenotypes. These examples will help readers identify situations in which this approach offers clear advantages.

    3. Reviewer #2 (Public review):

      Summary:

      This preprint describes a practical and useful approach for labeling and tracking NPCs in situ. While useful applications including timelapse imaging, affinity purification, or proximity labeling are envisioned, addressing some outstanding technical questions would give a clearer picture of the sensitivity and temporal resolution of this approach.

      Strengths:

      Clever use of a fluorescently conjugated nanobody that binds directly to the core scaffold nucleoporin Nup84 with nanomolar affinity.

      Weaknesses:

      The decrease in nanobody labeling over 8 hours of chase period is interpreted to indicate that NPCs turn over during this time. However, it is also possible that the nanobody:Nup84 association is disrupted during mitosis by phosphorylation, other PTMs, or structural remodeling.

    4. Reviewer #3 (Public review):

      Summary:

      Submitted to the Tools and Resources series, this study reports on the use of a single-domain antibody targeting the nucleoporin Nup84 to probe and track NPCs in budding yeast. The authors demonstrate their ability to rapidly label or pull down NPCs by inducing the expression of a tagged version of the nanobody (Figure 1).

      Strengths:

      This tool's main strength is its versatility as an inexpensive, easy-to-set-up alternative to metabolic labelling or optical switching. This same rationale could, in principle, be applied to the study of other multiprotein complexes using similar strategies, provided that single-chain antibodies are available.

      Weaknesses:

      This approach has no inherent weaknesses, but it would be useful for the authors to verify that their pulse labelling strategy can also be used to detect assembly intermediates, structural variants, or damaged NPCs.

      Overall, the data clearly show that Nup84 nanobodies are a valuable tool for imaging NPC dynamics and investigating their interactomes through affinity purification.

    1. eLife Assessment

      The authors examined the frequency of alternative splicing across prokaryotes and eukaryotes and found that the rate of alternative splicing varies with taxonomic groups and genome coding content. This solid work, based on nearly 1,500 high-quality genome assemblies, relies on a novel genome-scale metric that enables cross-species comparisons and that quantifies the extent to which coding sequences generate multiple mRNA transcripts via alternative splicing. This timely study provides an important basis for improving our general understanding of genome architecture and the evolution of life forms.

    2. Reviewer #2 (Public review):

      Summary:

      In this contribution, the authors investigate the degree of alternative splicing across the evolutionary tree, and identify a trend of increasing alternative splicing as you move from the base of the tree (here, only prokaryotes are considered) towards the tips of the tree. In particular, the authors investigate how the degree of alternative splicing (roughly speaking, the number of different proteins made from a single ORF (open reading frame) via alternative splicing) relates to three genomic variables: the genome size, the gene content (meaning the fraction of the genome composed of ORFs), and finally, the coding percentage of ORFs, meaning the ratio between exons and total DNA in the ORF.

      The revised manuscript addresses the problems identified in the first round of reviews and now serves as a guide to understand how alternative splicing has evolved within different phyla, as opposed to making unsubstantiated claims about overall trends.

    3. Reviewer #3 (Public review):

      Summary:

      In "Alternative Splicing Across the Tree of Life: A Comparative Study," the authors use rich annotation features from nearly 1,500 high-quality NCBI genome assemblies to develop a novel genome-scale metric, the Alternative Splicing Ratio, that quantifies the extent to which coding sequences generate multiple mRNA transcripts via alternative splicing (AS). This standardized metric enables cross-species comparisons and reveals clear phylogenetic patterns: minimal AS in prokaryotes and unicellular eukaryotes, moderate AS in plants, and high AS in mammals and birds. The study finds a strong negative correlation between AS and coding content, with genomes containing approximately 50% intergenic DNA exhibiting the highest AS activity. By integrating diverse lines of prior evidence, the study offers a cohesive evolutionary framework for understanding how alternative splicing varies and evolves across the tree of life.

      Strengths:

      By studying alternative splicing patterns across the tree of life, the authors systematically address an important yet historically understudied driver of functional diversity, complexity, and evolutionary innovation. This manuscript makes a valuable contribution by leveraging standardized, publicly available genome annotations to perform a global survey of transcriptional diversity, revealing lineage-specific patterns and evolutionary correlates. The authors have done an admirable job in this revised version, thoroughly addressing prior reviewer comments. The updated manuscript includes more rigorous statistical analyses, careful consideration of potential methodological biases, expanded discussion of regulatory mechanisms, and acknowledgment of non-adaptive alternatives. Overall, the work presents an intriguing view of how alternative splicing may serve as a flexible evolutionary strategy, particularly in lineages with limited capacity for coding expansion (e.g., via gene duplication). Notably, the identification of genome size and genic coding fraction thresholds (~20 Mb and ~50%, respectively) as tipping points for increased splicing activity adds conceptual depth and potential generalizability.

      Weaknesses:

      While the manuscript offers a broad comparative view of alternative splicing, its central message becomes diffuse in the revised version. The focus of the study is unclear, and the manuscript comes across as largely descriptive without a well-articulated hypothesis or explanatory evolutionary model. Although the discussion gestures toward adaptive and non-adaptive mechanisms, these interpretations are not developed early or prominently enough to anchor the reader. The negative correlation between alternative splicing and coding content is compelling, but the biological significance of this pattern remains ambiguous: it is unclear whether it reflects functional constraint, genome organization, or annotation bias. This uncertainty weakens the manuscript's broader evolutionary inferences.

      Sections of the Introduction, particularly lines 72-90, lack cohesion and logical flow, shifting abruptly between topics without a clear structure. A more effective approach may involve separating discussions of coding and non-coding sequence evolution to clarify their distinct contributions to splicing complexity. Furthermore, some interpretive claims lack nuance. For example, the assertion that splicing in plants "evolved independently" seems overstated given the available evidence, and the citation regarding slower evolution of highly expressed genes overlooks counterexamples from the immunity and reproductive gene literature.

      Presentation of the results is occasionally vague. For instance, stating "we conducted comparisons of mean values" (line 146) without specifying the metric undercuts interpretability. The authors should clarify whether these comparisons refer to the Alternative Splicing Ratio or another measure. Additionally, the lack of correlation between splicing and coding region fraction in prokaryotes may reflect a statistical power issue, particularly given their limited number of annotated isoforms, rather than a biological absence of pattern.

      Finally, the assessment of annotation-related bias warrants greater methodological clarity. The authors note that annotations with stronger experimental support yield higher splicing estimates, yet the normalization strategy for variation in transcriptomic sampling (e.g., tissue breadth vs sequencing depth) is insufficiently described. As these factors can significantly influence splicing estimates, a more rigorous treatment is essential. While the authors rightly acknowledge that splicing represents only one layer of regulatory complexity, the manuscript would benefit from a more integrated consideration of additional dimensions, such as 3D genome architecture, e.g., the potential role of topologically associating domains in constraining splicing variation.

    4. Reviewer #4 (Public review):

      The manuscript reports on a large-scale study correlating genomic architecture with splicing complexity over almost 1,500 species. We still know relatively little about alternative splicing functional consequences and evolution, and thus, the study is relevant and timely. The methodology relies on annotations from NCBI for high-quality genomes and a main metric proposed by the authors and named Alternative Splicing Ratio (ASR). It quantifies the level of redundancy of each coding nucleotide in the annotated isoforms.

      According to the authors' response to the first reviewers' comments, the present version of the manuscript seems to be a profoundly revised version compared to the original submission. I did not have access to the reviewers' comments.

      Although the study addresses an important question and the authors have visibly made an important effort to make their claims more statistically robust, I have a number of major concerns regarding the methodology and its presentation.

      (1) A large part of the manuscript is speculative and vague. For instance, the Discussion is very long (nearly as long as the Results section) and the items discussed are sometimes not directly connected to the present work. I would suggest merging the last two paragraphs, for instance, since the penultimate paragraph is essentially a review of the literature without direct connection to the present work.

      (2) The Methods section lacks clarity and precision. A large part is devoted to explaining the biases in the data without any reference or quantification. The definition of ASR is very confusing: it is first defined in equation 2, under a different name, and then again in the next subsection from a different perspective on lines 512-518. Why build matrices of co-occurrences if these are, in practice, never used? It seems the authors exploit only the trace. A major revision, if I understood correctly, was the correction/normalisation of the ASR metric. This normalisation is not explained. The authors argue that they will write another paper about it; I do not think this is acceptable for the publication of the present manuscript. Furthermore, there is no information about the technical details of the implementation: which packages did the authors use?

      (3) Could the authors motivate why they do not directly focus on the MC permutation test? They motivate the use of permutations because the data contain extreme outliers and are non-normal in most cases. Hence, it seems Welch's ANOVA is not appropriate. "To further validate our findings, we also conducted a Monte Carlo permutation test, which supported the conclusions (see Methods)." Where is the comparison shown? I did not see any report of the results for the non-permuted version of Welch's ANOVA.

      (4) What are the assumptions for the Phylogenetic Generalized Least Squares? Which evolution model was chosen and why? What is the impact of changing the model? Could the authors define more precisely (e.g., with equations) what lambda is? Is it estimated or fixed?

      (5) I think the authors could improve their account of recent literature on the topic. For instance, the paper https://doi.org/10.7554/eLife.93629.3, published in the same journal last year, should be discussed. It perfectly fits in the scope of the subsection "Evidence for the adaptive role of alternative splicing". Methods and findings reported in https://doi.org/10.1186/s13059-021-02441-9 and https://www.genome.org/cgi/doi/10.1101/gr.274696.120 directly concern the assessment of AS evolutionary conservation across long evolutionary times and/or across many species. These aspects are mentioned in the introduction on p.3. but without pointing to such works. Can we really qualify a work published in 2011 as "recent" (line 348-350)?

      The generated data and codes are available on Zenodo, which is a good point for reproducibility and knowledge sharing with the community.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1

      Methodological biases in annotation and sequencing methods

      We acknowledge the reviewer’s concern regarding methodological heterogeneity in genome annotations, particularly regarding the use of CDS annotations derived from public databases. In response, we have properly addressed the potential sources of bias in estimating alternative splicing (AS) across such a broad taxonomic range.

      Given the methodological challenges encountered in this study, we have undertaken an in-depth analysis of the biases associated with genome annotations and their impact on large-scale estimates of alternative splicing. This effort has resulted in the development of a comprehensive framework for quantifying, modeling, and correcting such biases, which we believe will be of interest to the broader genomics community. We are currently preparing a separate manuscript dedicated to this methodological aspect, which we intend to submit for publication in the near future.

      To account for these biases, we performed a statistical evaluation of annotation quality by examining the relationship between ASR values and multiple features of the NCBI annotation pipeline, including both technical and biological variables. Specifically, we analyzed a set of metadata descriptors related to: (i) genome assembly quality (e.g., Contig N50, Scaffold N50, number of gaps, gap length, contig/scaffold count), (ii) the amount and diversity of experimental evidence used in annotation (e.g., number of RNA-Seq reads, number of tissues, number of experimental runs, number of proteins and transcripts, including those derived from Homo sapiens), and (iii) the nature of the annotated coding sequences (e.g., total number of CDSs, percentage of CDSs supported by experimental evidence, proportion of known CDSs, percentage of CDSs derived from ab initio predictions).

      This comprehensive analysis revealed that the strongest bias affecting ASR values is associated with the proportion of fully supported CDSs, which showed a strong positive correlation with observed splicing levels. In contrast, the percentage of CDSs relying on ab initio models showed a negative correlation, indicating that computational predictions tend to underestimate splicing complexity. Based on these findings, we implemented a polynomial normalization model using the percentage of fully supported CDSs as the main predictor of annotation bias. The resulting normalized metric, ASR*, corrects for annotation-related variability while preserving biologically meaningful variation.
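      As a rough illustration of this kind of correction (not the authors' actual model, whose exact degree and fitting procedure are described in their Methods), a polynomial bias regression could look like the following sketch, shown here on synthetic data:

```python
import numpy as np

def normalize_asr(asr_raw, pct_supported_cds, degree=2):
    """Return bias-corrected ASR* values.

    Fits a polynomial of the given degree to raw ASR as a function of the
    percentage of fully supported CDSs, then re-centres the residuals on the
    overall mean. Predictor and degree are illustrative choices only.
    """
    asr_raw = np.asarray(asr_raw, float)
    x = np.asarray(pct_supported_cds, float)
    coeffs = np.polyfit(x, asr_raw, degree)
    return asr_raw - np.polyval(coeffs, x) + asr_raw.mean()

# Synthetic demonstration: a linear annotation bias inflates raw ASR...
rng = np.random.default_rng(1)
x = np.linspace(10.0, 90.0, 60)                              # % fully supported CDSs
asr_raw = 0.05 + 0.002 * x + rng.normal(0.0, 0.01, x.size)   # bias-dominated raw ASR
asr_star = normalize_asr(asr_raw, x)

# ...and the correction removes the dependence on annotation quality:
print(round(abs(np.corrcoef(asr_star, x)[0, 1]), 3))  # ≈ 0.0 after correction
```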

      We further verified the robustness of this correction by comparing the main results of our study using both the raw ASR and the normalized ASR<sup>*</sup> across all analyses. The qualitative and quantitative consistency of results obtained with both metrics demonstrates that our findings are not an artifact of methodological bias and validates the reliability of our approach.

      Conceptual and Statistical Framework

      Our aim was not to investigate specific regulatory mechanisms of alternative splicing, but rather to explore large-scale statistical patterns across the tree of life using a newly defined metric—the Alternative Splicing Ratio (ASR)—that enables genome-wide comparisons of splicing complexity across species. To clarify the conceptual framework, we have revised the manuscript to explicitly state our assumptions, objectives, and the scope of our conclusions. The ASR metric is now briefly introduced in the Results section, with a more detailed mathematical formulation included in the Methods section.

      From a methodological standpoint, we have expanded the manuscript to better support the comparative framework through additional statistical analyses. In particular, we now include:

      • Monte Carlo permutation tests to assess pairwise differences in splicing and genomic variables across taxonomic groups, which are robust to non-normality and heteroscedasticity in the data.

      • Welch’s ANOVA with Bonferroni correction, which accounts for unequal variances when comparing group means.

      • Phylogenetic Generalized Least Squares (PGLS) regression, which explicitly models phylogenetic non-independence between species and allows us to infer lineage-specific associations between genomic composition and alternative splicing.

      • Coefficient of variation analysis, used to evaluate the relative variability of splicing and genomic traits across groups in a scale-independent manner.

      • Variability ratio metrics, designed to compare the dispersion of splicing values relative to genomic features, thereby quantifying trends in regulatory plasticity versus structural constraints.
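As a minimal illustration of the first of these tests, a two-group Monte Carlo permutation test on a difference in means can be sketched as below. This is a generic textbook version under the stated robustness properties, not the authors' exact implementation; the group data are hypothetical.

```python
import random

def permutation_test(group_a, group_b, n_perm=10000, seed=42):
    """Two-sided Monte Carlo permutation test on the difference in means.
    Robust to non-normality and heteroscedasticity: no distributional
    assumption beyond exchangeability of labels under the null."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel observations at random under H0
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Hypothetical splicing values for two taxonomic groups:
p = permutation_test([0.12, 0.15, 0.11, 0.14], [0.30, 0.28, 0.33, 0.31])
```

The add-one correction keeps the Monte Carlo p-value strictly positive, which matters when it later feeds a Bonferroni-style multiplicity adjustment.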

      All methods are thoroughly described in the revised Methods section, and their application is presented in the Results section.

      Functional vs. non-functional nature of AS events

We have included a new discussion paragraph addressing the ongoing debate regarding the functionality of alternative splicing and a possible non-adaptive explanation for the patterns observed. While many previous studies suggest that a considerable fraction of AS events might represent splicing noise or non-functional isoforms, our intention is not to adopt this view uncritically. Instead, we cite recent literature to provide a more nuanced interpretation, recognizing both the potential adaptive value and the uncertainty surrounding the functional relevance of many AS events. Thus, rather than assuming that all observed alternative splicing events are adaptive or biologically meaningful, we now emphasize that many patterns may emerge from other processes, such as those associated with genomic constraints.

      Terminology and Result Interpretation

      The manuscript has been thoroughly revised to improve both the scientific language and the conceptual framing. We have removed inappropriate terminology such as “higher/lower organisms” and “highly evolved”. Also, we have reinterpreted the results. As part of this process, the manuscript has been substantially rewritten to focus on the most meaningful findings. Ultimately, we have retained only those results that specifically concern broad-scale patterns of alternative splicing across taxa, which are now presented with greater clarity and methodological rigor.

      Reviewer #2

      Gene Regulatory Complexity Beyond Splicing Mechanisms

      While alternative splicing represents a prominent mechanism of transcriptomic diversification, we agree with the reviewer that it constitutes only one component of the broader landscape of gene regulation. Structural and behavioral complexity in organisms arises from a combination of regulatory processes, and our study focuses specifically on alternative splicing as a measurable proxy within this multifactorial system. To clarify this point, we have added a paragraph in the Discussion section, where we explicitly contextualize alternative splicing within the wider regulatory architecture. In that paragraph, we discuss additional mechanisms that contribute to phenotypic complexity—such as transcriptional control, chromatin remodeling, epigenetic modifications, and RNA editing—citing key literature.

      Alternative Splicing Measure and Methodology

      While we agree that alternative splicing is not a definitive measure of organismal complexity, we argue that it remains a meaningful proxy for transcriptomic and regulatory diversification, especially when analyzed at large phylogenetic scale. In this version of the manuscript, our goal was not to equate alternative splicing with biological complexity, but rather to quantify its patterns across lineages and evaluate its relationship with genome structure. This point is now explicitly stated in both the Introduction and Discussion.

      We also recognize the limitations associated with the use of coding sequence (CDS) annotations from public databases such as NCBI RefSeq. To address this concern, we have conducted a detailed analysis of the potential biases introduced by heterogeneous annotation quality, sequencing depth, and computational prediction, as previously addressed in our response to Reviewer #1.

      In response to concerns about unsupported statements, we have completely rewritten the manuscript to ensure that all claims are now explicitly supported by data and grounded in up-to-date scientific literature. We have reformulated speculative statements, removed inappropriate generalizations, and improved the logical flow of the arguments throughout the text. In summary, we have strengthened both the conceptual framework and the methodological foundation of the study, while maintaining a cautious interpretation of the results.

      Trends of Alternative Splicing

To address the reviewer’s concern, we have revised the interpretation of trends as used in our analysis. In this study, we define a trend not as a strict directional progression or a linear trajectory across all species, but rather as a broad statistical pattern observable in the relative distribution and variability of alternative splicing across major taxonomic groups. We do not claim that this pattern reflects a universal adaptive pathway. Instead, we interpret it as a signal of differences in regulatory strategies associated with genome architecture. To avoid misinterpretation, we have rephrased several sentences in the manuscript and explicitly emphasized the variability within groups, and the lack of significant correlations in certain clades.

      Inconsistent statistics

The discrepancies pointed out were due to differences between mean- and median-based analyses. These have been clarified and consistently reported in the revised manuscript. Error bars, p-values, and a supplementary table summarizing all tests are now included. Furthermore, we have not removed any species from our dataset.

    1. eLife Assessment

      This important study examines the evolution of virulence and antibiotic resistance in Staphylococcus aureus under multiple selection pressures. The evidence presented is convincing, with rigorous data that characterizes the outcomes of the evolution experiments. However, the manuscript's primary weakness is in its presentation, as claims about the causal relationship between genotypes and phenotypes are based on correlational evidence. The manuscript needs to be revised to address these limitations, clarify the implications of the experimental design, and adjust the overall narrative to better reflect the nature of the findings.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigate how methicillin-resistant (MRSA) and sensitive (MSSA) Staphylococcus aureus adapt to a new host (C. elegans) in the presence or absence of a low dose of the antibiotic oxacillin. Using an "Evolve and Resequence" design with 48 independently evolving populations, they track changes in virulence, antibiotic resistance, and other fitness-related traits over 12 passages. Their key finding is that selection from both the host and the antibiotic together, rather than either pressure alone, results in the evolution of the most virulent pathogens. Genomically, they find that this adaptation repeatedly involves mutations in a small number of key regulatory genes, most notably codY, agr, and saeRS.

      Strengths:

      The main advantage of the research lies in its strong and thoroughly replicated experimental framework, enabling significant conclusions to be drawn based on the concept of parallel evolution. The study successfully integrates various phenotypic assays (virulence, growth, hemolysis, biofilm formation) with whole-genome sequencing, offering an extensive perspective on the adaptive landscape. The identification of certain regulatory genes as common targets of selection across distinct lineages is an important result that indicates a level of predictability in how pathogens adapt.

      Weaknesses:

      (1) The main limitation of the paper is that its findings on the function of specific genes are based on correlation, not cause-and-effect evidence. While the parallel evolution evidence is strong, the authors have not yet performed the definitive tests (i.e., reconstruction of ancestral genes) to ensure that the mutations identified in isolation are enough to account for the virulence or resistance changes observed. This makes the conclusions more like firm hypotheses, not confirmed facts.

      (2) In some instances, the claims in the text are not fully supported by the visual data from the figures or are reported with vagueness. For example, the display of phenotypic clusters in the PCA (Figure 6A) and the sweeping generalization about the effect of antibiotics on the mutation rates (Figure S5) can be more precise and nuanced. Such small deviations dilute the overall argument somewhat and must be corrected.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript describes the results of an evolution experiment where Staphylococcus aureus was experimentally evolved via sequential exposure to an antibiotic followed by passaging through C. elegans hosts. Because infecting C. elegans via ingestion results in lysis of gut cells and an immune response upon infection, the S. aureus were exposed separately across generations to antibiotic stress and host immune stress. Interestingly, the dual selection pressure of antibiotic exposure and adaptation to a nematode host resulted in increased virulence of S. aureus towards C. elegans.

      Strengths:

      The data presented provide strong evidence that in S. aureus, traits involved in adaptation to a novel host and those involved in antibiotic resistance evolution are not traded off. On the contrary, they seem to be correlated, with strains adapted to antibiotics having higher virulence towards the novel host. As increased virulence is also associated with higher rates of haemolysis, these virulence increases are likely to reflect virulence levels in vertebrate hosts.

      Weaknesses:

      Right now, the results are presented in the context of human infections being treated with antibiotics, which, in my opinion, is inappropriate. This is because<br /> (1) exposure to the host and antibiotics was sequential, not simultaneous, and thus does not reflect the treatment of infection, and<br /> (2) because the site of infection is different in C. elegans and human hosts.

      Nevertheless, the results are of interest; I just think the interpretation and framing should be adjusted.

    4. Reviewer #3 (Public review):

      Summary:

Su et al. sought to understand how the opportunistic pathogen Staphylococcus aureus responds to multiple selection pressures during infection. Specifically, the authors were interested in how the host environment and antibiotic exposure impact the evolution of both virulence and antibiotic resistance in S. aureus. To accomplish this, the authors performed an evolution experiment where S. aureus was fed to Caenorhabditis elegans as a model system to study the host environment and then either subjected to the antibiotic oxacillin or not. Additionally, the authors investigated the difference in evolution between an antibiotic-resistant strain, MRSA, and an isogenic susceptible strain, MSSA. They found that MRSA strains evolved in both antibiotic and host conditions became more virulent, and that strains evolved outside these conditions lost virulence. Looking at the strains evolved in just antibiotic conditions, the authors found that S. aureus maintained its ability to lyse blood cells. Mutations in codY, gdpP, and pbpA were found to be associated with increased virulence. Additionally, the mutations identified in these experiments were also found in S. aureus strains isolated from human infections.

      Strengths:

      The data are well-presented, thorough, and are an important addition to the understanding of how certain pathogens might adapt to different selective pressures in complex environments.

      Weaknesses:

      There are a few clarifications that could be made to better understand and contextualize the results. Primarily, when comparing the number of mutations and selection across conditions in an evolution experiment, information about population sizes is important to be able to calculate the mutation supply and number of generations throughout the experiment. These calculations can be difficult in vivo, but since several steps in the methodology require plating and regrowth, those population sizes could be determined. There was also no mention of how the authors controlled the inoculation density of bacteria introduced to each host. This would need to be known to calculate the generation time within the host. These caveats should be addressed in the manuscript.

      Another concern is the number of generations the populations of S. aureus spent either with relaxed selection in rich media or under antibiotic pressure in between the host exposure periods. It is probable then that the majority of mutations were selected for in these intervening periods between host infection. Again, a more detailed understanding of population sizes would contribute to the understanding of which phase of the experiment contributed to the mutation profile observed.

    1. eLife Assessment

      This study reports on the development and characterization of chickens with genetic deficiencies in type I or type III interferon receptors, which is an important contribution to the field of avian immunology. The data reflecting the development of the new interferon-receptor-deficient chickens is compelling. However, the characterization of IFN biology and infection responses in these knockout chickens is somewhat incomplete and could be improved by addressing the noted weaknesses.

    2. Reviewer #1 (Public review):

      Summary:

This manuscript presents an extensive body of work and an outstanding contribution to our understanding of the IFN type I and III system in chickens. The research started with the innovative approach of generating KO chickens that lack the receptor for IFNα/β (IFNAR1) or IFN-λ (IFNLR1). The successful deletion and functional loss of these receptors was clearly and comprehensively demonstrated in comparison to the WT. Moreover, the homozygous KO lines (IFNAR1-/- or IFNLR1-/-) were found to have similar body weights, and normal egg production and fertility compared to their WT counterparts. These lines are a major contribution to the toolbox for the study of avian/chicken immunology.

      The significance of this contribution is further demonstrated by the use of these lines by the authors to gain insight into the roles of IFN type I and IFN-type III in chickens, by conducting in ovo and in vivo studies examining basic aspects of immune system development and function, as well as the responses to viral challenges conducted in ovo and in vivo.

Solid, state-of-the-art methods and convincing evidence from studies comparing various immune-system-related functions in the IFNAR1-/- or IFNLR1-/- lines to the WT revealed that the deletion of IFNAR1 and/or IFNLR1 resulted in:<br /> (1) impaired IFN signaling and induction of anti-viral state;<br /> (2) modulation of immune cell profiles in the peripheral blood circulation and spleen;<br /> (3) modulation of the cecum microbiome;<br /> (4) reduced concentrations of IgM and IgY in the blood plasma before and following immunization with model antigen KLH, whereby line differences in the time course of antibody production were also observed;<br /> (5) decrease in MHCII+ macrophages and B cells in the spleen of IFNAR1 KO chickens, although the MHCII-expression per cell was not affected in this line; and<br /> (6) reduction in the response of αβ1 TCR+ T cells of IFNAR1 KO chickens as suggested by clonal repertoire analyses.

      These studies were then followed by examination of the role of type I and type III IFN in virus infection, using different avian influenza A virus strains as well as an avian gamma corona virus (IBV) in in ovo challenge experiments. These studies revealed: viral titers that reflect virus-species and strain-specific IFN responses; no differences in the secretion of IFN-α/β in both KO compared to the WT lines; a predominant role of type I IFN in inducing the interferon-stimulated gene (ISG) Mx; and that an excessive and unbalanced type I IFN response can harm host fitness (survival rate, length of survival) and contribute to immunopathology.

Based on guidance from the in ovo studies, comprehensive in vivo studies were conducted on host-pathogen interactions in hens from the three lines (WT, IFNAR1 KO, or IFNLR1 KO). These studies revealed the early appearance of symptoms and poor survival of hens from the IFNAR1 KO line challenged with H3N1 avian influenza A virus; efficient H3N1 virus replication in IFNAR1 KO hens; increased plasma concentrations of IFNα/β and mRNA expression of IFN-λ in spleens of the IFNAR1 KO hens; a pro-inflammatory role of IFN-λ in the oviduct of hens infected with H3N1 virus; increased proinflammatory cytokine expression in spleens of IFNAR1 KO hens; and impairment of negative feedback mechanisms regulating IFN-α/β secretion in IFNAR1 KO hens and a significant decrease in this group's antiviral state; additionally it was demonstrated that IFN-α/β can compensate for IFN-λ to induce an adequate antiviral state in the spleen during H3N1 infection, but IFN-λ cannot compensate for IFN-α/β signaling in the spleen.

      Strengths:

      (1) Both the methods and results from the comprehensive, well-designed, and well-executed experiments are considered excellent. The results are well and correctly described in the result narrative and well presented in both the manuscript and supplement Tables and Figures. Excellent discussion/interpretation of results.

      (2) The successful generation of the type I and type III IFN KO lines offers unprecedented insight and opens multiple new venues for exploring the IFN system in chickens. The new knowledge reported here is direct evidence of the high impact of this model system on effectively addressing a critical knowledge gap in avian immunology.

      (3) The thoughtful selection of highly relevant viruses to poultry and human health for the in ovo and in vivo challenge studies to examine and assess host-pathogen interactions in the IFNR KO and WT lines.

      (4) Making use of the unique opportunities in the chicken model to examine and evaluate the host's IFN system responses to various viral challenges in ovo, before conducting challenge studies in hens.

      (5) The new knowledge gained from the IFNAR1 and IFNLR1 KO lines will find much-needed application in developing more effective strategies to prevent health challenges like avian influenza and its devastating effects on poultry, humans, and other mammals.

      (6) The excellent cooperation and contributions of the co-authors and institutions.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    3. Reviewer #2 (Public review):

      Summary:

      This study attempts to dissect the contributions of type I and type III IFNs to the antiviral response in chickens. The first part of the study characterises the generation of IFNAR and IFNLR KO chicken strains and describes basic differences. Four different viruses are then tested in chicken embryos, while the subsequent analysis of the antiviral response in vivo is performed with one influenza H3N1 strain.

      Strengths:

      Having these two KO chicken strains as a tool is a great achievement. The initial analysis is solid. Clear effect of IFNAR deficiency in in vivo infection, less so for IFNLR deficiency.

      Weaknesses:

      (1) The antibody induction by KLH immunisation: No data indicated whether or not this vaccination induces IFN responses in wt mice, so the effects observed may be due to steady-state differences or to differential effects of IFN induced during the vaccination phase. No pre-immune results are shown. The differences are relatively small and often found at only one plasma dilution - the whole of Figure 4 could be condensed into one or two panels by proper calculation of Ab titers - would these titres be significantly different? This, as all of the other in vivo experiments, has not been repeated, if I understand the methods section correctly.

      (2) The basic conundrum here and in later figures is never addressed by the authors: Situations where IFN type 1 and 3 signalling deficiency each have an independent effect (i.e., Figure 4d) suggest that they act by separate, unrelated mechanisms. However, all the literature about these IFN families suggests that they show almost identical signalling and gene induction downstream of their respective receptors. How can the same signalling, clearly active here downstream of the receptors for IFN type 1 or type 3, be non-redundant, i.e., why does the unaffected IFN family not stand in? This is a major difference from the mouse studies, which showed a rather subtle phenotype when only one of the two IFN systems was missing, but a massive reduction in virus control in double KO mice (the correct primary paper should be quoted here, not only the review by McNab). Reasons could be a direct effect of IFNab on B cells and an indirect effect of IFNL through non-B cells, timing issues, and many other scenarios can be envisaged. The authors do not address this question, which limits the depth of analysis.

      (3) In the one in vivo experiment performed with chickens, only one virus was tested; more influenza strains should be included, as well as non-influenza viruses.

      (4) The basic conundrum of point 2 applies equally to Figure 6a; both KOs have a phenotype. Again in 6d, both IFNs appear to be separately required for Mx induction. An explanation is needed.

      (5) Line 308, where are the viral titers you refer to in the text? The statement that the results demonstrate that excessive IFNab has a negative impact is overstretched, as no IFN measurements of the infected embryos are shown here.

      (6) The in vivo infection is the most interesting experiment, and the key outcome here is that IFN type 1 is crucial for anti-H3N1 protection in chickens, while type 3 is less impactful. However, this experiment suffers from the different time points when chickens were culled, so many parameters are impossible to compare (e.g., weight loss, histopathology, IFN measurements, and more). Many of these phenomena are highly dynamic in acute virus infections, so disparate time points do not allow a meaningful comparison between different genotypes. What are the stats in 7b? Is the median rather than the mean indicated by the line? Otherwise, the lines appear in surprising places. SD must be shown, and I find it difficult to believe that there is a significant difference in weight, for e.g., IFNAR KO, unless maybe with a paired t test. What is the statistical test?

      (7) Figures 7e,f: these comparisons are very difficult to interpret as the virus loads at these time points already differ significantly, so any difference could be secondary to virus load differences.

    1. eLife Assessment

Non-essential amino acids such as glutamine have been known to be required for general T cell activation through sustaining basic biosynthetic processes, including nucleotide biosynthesis, ATP generation, and protein synthesis. In this important study, the authors found that extracellular asparagine (Asn) is required not only for T cells to generally refuel metabolic reprogramming, but also to produce helper T cell lineage-specific cytokines, for instance IL-17. In particular, the importance of Asn in IL-17 production was convincingly demonstrated in the mouse experimental autoimmune encephalomyelitis (EAE) model, mimicking human multiple sclerosis disease.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors reveal that the availability of extracellular asparagine (Asn) represents a metabolic vulnerability for the activation and differentiation of naive CD4+ T cells. To deplete extracellular Asn, they employed two orthogonal approaches: activating naive CD4+ T cells in either PEGylated asparaginase (PEG-AsnASE)-treated medium or custom-formulated RPMI medium specifically lacking Asn. Importantly, they demonstrate that Asn depletion not only impaired metabolic reprogramming associated with CD4+ T cell activation but also reduced CD4+ helper T cell lineage-specific cytokine production, thereby ameliorating the severity of experimental autoimmune encephalomyelitis.

      Strengths:

      The experiments presented here are comprehensive and well-designed, providing compelling evidence for the conclusions. The conclusions will be important to the field.

      Weaknesses:

      (1) EAE is the prototypic T cell-mediated autoimmune disease model, and both Th1 and Th17 cells are implicated in its pathogenesis. In contrast, Th2 and Treg cells and their associated cytokines (such as IL-4 and IL-10) have been shown to play a role in the resolution of EAE, and potentially in the modulation of disease progression. Thus, it will be important to determine whether Asn depletion affects the differentiation of naive CD4+ T cells into corresponding subsets under Th2 and Treg polarization conditions, as well as the expression of lineage-specific transcription factors and cytokine production.

      (2) EAE is characterized by inflammation and demyelination in the central nervous system (CNS), leading to neurological deficits. Myelin destruction is directly correlated with the severity of the disease. For Figure 6, did the authors perform spinal cord histological analysis by hematoxylin and eosin (H&E) or Luxol fast blue (LFB) staining? This is important to rigorously examine pathological EAE symptoms.

    3. Reviewer #2 (Public review):

      While the importance of asparagine in the differentiation and activation of CD8 T cells has been previously reported, its role in CD4 T cells remained unclear. Using culture media containing specific amino acids, the authors demonstrated that extracellular asparagine promotes CD4 T cell proliferation. Consistent with this, depletion of extracellular asparagine using PEG-AsnASE suppressed CD4 T cell activation. Proteomic analysis focusing on asparagine content revealed that, during the early phase of T cell activation, most asparagine incorporated into proteins is derived from extracellular sources. The authors further confirmed the importance of extracellular asparagine in vivo, demonstrating improved EAE pathology.

      While the data are well organized and convincing, the mechanism by which asparagine deficiency leads to altered T cell differentiation remains unclear. It is also necessary to investigate the transporters involved in asparagine uptake. In particular, elucidating whether different T cell subsets utilize the same or distinct transport mechanisms would provide important insight into the immunoregulatory role of asparagine.

      (1) The finding that asparagine supplementation promotes T cell proliferation under various amino acid conditions is highly significant. However, the concentration at which this effect occurs remains unclear. A titration analysis would be necessary to determine the dose-dependency of asparagine.

      (2) The effects of asparagine deficiency occur during the early phase of T cell activation. Thus, it is likely that the transporters responsible for asparagine uptake are either rapidly induced upon activation or already expressed in the resting state. Since this is central to the focus of the manuscript, it is interesting to identify the transporter responsible for asparagine uptake during early T cell activation. A recent paper (DOI: 10.1126/sciadv.ads350) reported that macrophages utilize Slc6a14 to use extracellular asparagine. Is this also true for CD4+ T cells?

      (3) Given that depletion of extracellular asparagine impairs differentiation of Th1 and Th17 cells, it is possible that TCR signaling is compromised under these conditions. This point should be investigated by targeting downstream signaling molecules such as Lck, ZAP70, or mTOR. Also, does it affect the protein stability of master transcription factors such as T-bet and RORgt?

      (4) Is extracellular asparagine also important for the differentiation of helper T cell subsets other than Th1 and Th17, such as Th2, Th9, and iTreg?

      (5) Asparagine taken up from outside the cell has been shown to be used for de novo protein synthesis (Figure 3E), but are there any proteins that are particularly susceptible to asparagine deficiency? This can be verified by performing proteome analysis, and the effects on Th1/17 subset differentiation mentioned above should also be examined.

      (6) While the importance of extracellular asparagine is emphasized, Asns expression is markedly induced during early T cell activation. Nevertheless, the majority of asparagine incorporated into proteins appears to be derived from extracellular sources. Does genetic deletion of Asns have any impact on early CD4+ T cell activation? The authors indicated that newly synthesized Asns have little impact on CD8+ T cells in the Discussion section, but is this also true for CD4+ T cells? This could be verified through experiments using CRISPR-mediated Asns gene targeting or pharmacological inhibition.

    1. eLife Assessment

      This study illustrates a valuable application of BID-seq to bacterial RNA, allowing transcriptome-wide mapping of pseudouridine modifications across various bacterial species. The evidence presented includes a mix of solid and incomplete data and analyses, and would benefit from more rigorous approaches. The work will interest a specialized audience involved in RNA biology.

    2. Reviewer #1 (Public review):

      Summary:

The manuscript by Xu et al. reported base-resolution mapping of RNA pseudouridylation in five bacterial species, utilizing the recently developed BID-seq. They detected pseudouridine (Ψ) in bacterial rRNA, tRNA, and mRNA, and found growth phase-dependent Ψ changes in tRNA and mRNA. They then focused on mRNA and conducted a comparative analysis of Ψ profiles across different bacterial species. Finally, they developed a deep learning model to predict Ψ sites based on RNA sequence and structure.

      Strengths:

      This is the first comprehensive Ψ map across multiple bacterial species, and systematically reveals Ψ profiles in rRNA, tRNA, and mRNA under exponential and stationary growth conditions. It provides a valuable resource for future functional studies of Ψ in bacteria.

      Weaknesses:

      Ψ is highly abundant on non-coding RNA such as rRNA and tRNA, while its level on mRNA is very low. The manuscript focuses primarily on mRNA, which raises questions about the data quality and the rigor of the analysis. Many conclusions in the manuscript are speculative, based solely on the sequencing data but not supported by additional experiments.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, Xu et al. present a transcriptome-wide, single-base resolution map of RNA pseudouridine modifications across evolutionarily diverse bacterial species using an adapted form of BID-Seq. By optimizing the method for bacterial RNA, the authors successfully mapped modifications in rRNA, tRNA, and, importantly, mRNA across both exponential and stationary growth phases. They uncover evolutionarily conserved Ψ motifs, dynamic Ψ regulation tied to bacterial growth state, and propose functional links between pseudouridylation and bacterial transcript stability, translation, and RNA-protein interactions. To extend these findings, they develop a deep learning model that predicts pseudouridine sites from local sequence and structural features.

      Strengths:

      The authors provide a valuable resource: a comprehensive Ψ atlas for bacterial systems, spanning hundreds of mRNAs and multiple species. The work addresses a gap in the field - our limited understanding of bacterial epitranscriptomics, by establishing both the method and datasets for exploring post-transcriptional modifications.

      Weaknesses:

      The main limitation of the study is that most functional claims (i.e., translation efficiency, mRNA stability, and RNA-binding protein interactions) are based on correlative evidence. While suggestive, these inferences would be significantly strengthened by targeted perturbation of specific Ψ synthases or direct biochemical validation of proposed RNA-protein interactions (e.g., with Hfq). Additionally, the GNN prediction model is a notable advance, but methodological details are insufficient to reproduce or assess its robustness.

    4. Reviewer #3 (Public review):

      Summary:

      This study aimed to investigate pseudouridylation across various RNA species in multiple bacterial strains using an optimized BID-seq approach. It examined both conserved and divergent modification patterns, the potential functional roles of pseudouridylation, and its dynamic regulation across different growth conditions.

      Strengths:

      The authors optimized the BID-seq method and applied this important technique to bacterial systems, identifying multiple pseudouridylation sites across different species. They investigated the distribution of these modifications, associated sequence motifs, their dynamics across growth phases, and potential functional roles. These data are of great interest to researchers focused on understanding the significance of RNA modifications, particularly mRNA modifications, in bacteria.

      Weaknesses:

      (1) The reliability of BID-seq data is questionable due to a lack of experimental validations.

      (2) The manuscript is not well-written, and the presented work shows a major lack of scientific rigor, as several key pieces of information are missing.

      (3) The manuscript's organization requires significant improvement, and numerous instances of missing or inconsistent information make it difficult to understand the key objectives and conclusions of the study.

      (4) The rationale for selecting specific bacterial species is not clearly explained, and the manuscript lacks a systematic comparison of pseudouridylation among these species.

    1. eLife Assessment

      This study presents valuable data suggesting that ATP-induced modulation of alveolar macrophage (AM) functions is associated with NLRP3 inflammasome activation and enhanced phagocytic capacity. While the in vivo and in vitro data reveal an interesting phenotype, the evidence provided is incomplete and does not fully support the paper's conclusions. Additional investigations would be of value in complementing the data and strengthening the interpretation of the results. This study should be of interest to immunologists and the mucosal immunity community.

    2. Reviewer #1 (Public review):

      Summary:

      Alveolar macrophages (AMs) are key sentinel cells in the lungs, representing the first line of defense against infections. There is growing interest within the scientific community in the metabolic and epigenetic reprogramming of innate immune cells following an initial stress, which alters their response upon exposure to a heterologous challenge. In this study, the authors show that exposure to extracellular ATP can shape AM functions by activating the P2X7 receptor. This activation triggers the relocation of the potassium channel TWIK2 to the cell surface, placing macrophages in a heightened state of responsiveness. This leads to the activation of the NLRP3 inflammasome and, upon bacterial internalization, to the translocation of TWIK2 to the phagosomal membrane, enhancing bacterial killing through pH modulation. Through these findings, the authors propose a mechanism by which ATP acts as a danger signal to boost the antimicrobial capacity of AMs.

      Strengths:

      This is a fundamental study in a field of great interest to the scientific community. A growing body of evidence has highlighted the importance of metabolic and epigenetic reprogramming in innate immune cells, which can have long-term effects on their responses to various inflammatory contexts. Exploring the role of ATP in this process represents an important and timely question in basic research. The study combines both in vitro and in vivo investigations and proposes a mechanistic hypothesis to explain the observed phenotype.

      Weaknesses:

      First, the concept of training or trained immunity refers to long-term epigenetic reprogramming in innate immune cells, resulting in a modified response upon exposure to a heterologous challenge. The investigations presented demonstrate phenotypic alterations in AMs seven days after ATP exposure; however, they do not assess whether persistent epigenetic remodeling occurs with lasting functional consequences. Therefore, a more cautious and semantically precise interpretation of the findings would be appropriate.

      Furthermore, the in vivo data should be strengthened by additional analyses to support the authors' conclusions. The authors claim that susceptibility to Pseudomonas aeruginosa infection differs depending on the ATP-induced training effect. Statistical analyses should be provided for the survival curves, as well as additional weight curves or clinical assessments. Moreover, it would be appropriate to complement this clinical characterization with additional measurements, such as immune cell infiltration analysis (by flow cytometry), and quantification of pro-inflammatory cytokines in bronchoalveolar lavage fluid and/or lung homogenates.

      Moreover, the authors attribute the differences in resistance to P. aeruginosa infection to the ATP-induced training effect on AMs, based on a correlation between in vivo survival curves and differences in bacterial killing capacity measured in vitro. These are correlative findings that do not establish a causal role for AMs in the in vivo phenotype. ATP-mediated effects on other (i.e., non-AM) cell populations are omitted, and the possibility that other cells could be affected should be, at least, discussed. Adoptive transfer experiments using AMs would be a suitable approach to directly address this question.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Thompson et al. investigate the impact of prior ATP exposure on later macrophage functions as a mechanism of immune training. They describe that ATP training enhances bactericidal functions, which they connect to the P2x7 ATP receptor, Nlrp3 inflammasome activation, and TWIK2 K+ movement at the cell surface and subsequently at phagosomes during bacterial engulfment. With stronger methodology, these findings could provide useful insight into how ATP can modulate macrophage immune responses, though they are generally an incremental addition to existing literature. The evidence supporting their conclusions is currently inadequate. Gaps in explaining methodology are substantial enough to undermine trust in much of the data presented. Some assays may not be designed rigorously enough for interpretation.

      Strengths:

      The authors demonstrate two novel findings that have sufficient rigor to assess:

      (1) prolonged persistence of TWIK2 at the macrophage plasma membrane following ATP exposure, and its translocation to the phagosome during particle engulfment, which builds upon their prior report of ATP-driven 'training' of macrophages.

      (2) administering intranasal ATP to mice to 'train' their lungs, protecting them from otherwise fatal bacterial infection.

      Weaknesses:

      (1) Missing details from methods/reported data: Substantial sections of key methods have not been disclosed (including anything about animal infection models, RNA-sequencing, and western blotting), and the statistical methods, as written, only address two-way comparisons, which would mean analysis was improperly performed. In addition, there is a general lack of transparency - the methods state that only representative data is included in the manuscript, and individual data points are not shown for assays.

      (2) Poor experimental design including missing controls: Particularly problematic are the Seahorse assay data (requires normalization to cell numbers to interpret this bulk assay - differences in cell growth/loss between conditions would confound data interpretation) and bacterial killing assays (as written, this method would be heavily biased by bacterial initial binding/phagocytosis which would confound assessment of killing). Controls need to be included for subcellular fractionating to confirm pure fractions and for dye microscopy to show a negative background. Conclusions from these assays may be incorrect, and in some cases, the whole experiment may be uninterpretable.

      (3) The conclusions overstate what was tested in the experiments: Conceptually, there are multiple places where the authors draw conclusions or frame arguments in ways that do not match the experiments used. Particularly:<br /> a) The authors discuss their findings in the context of importance for AM biology during respiratory infection but in vitro work uses cells that are well-established to be poor mimics of resident AMs (BMDM, RAW), particularly in terms of glycolytic metabolism.<br /> b) In vivo work does not address whether immune cell recruitment is triggered during training.<br /> c) Figure 3 is used to draw conclusions about K+ in response to bacterial engulfment, but actually assesses fungal zymosan particles.<br /> d) Figure 5 is framed in bacterial susceptibility post-viral infection, but the model used is bacterial post-bacterial.<br /> e) In their discussion, the authors propose to have shown TWIK2-mediated inflammasome activation. They link these separately to ATP, but their studies do not test if loss of TWIK2 prevents inflammasome activation in response to ATP (Figure 4E does not use TWIK2 KO).

      In summary, this work contains some useful data showing how ATP can 'train' macrophages. However, it largely lacks the expected level of rigor. For this work to be valuable to the field, it is likely to need substantial improvement in methods reporting, inclusion of missing assay controls, may require repeating key experiments that were run with insufficient methodology (or providing details and supplemental data to prove that methodology was sufficient), and should either add additional experiments that properly test their experimental question or rewrite their conclusions.

    1. eLife Assessment

      This convincing study, which is based on a survey of researchers, finds that women are less likely than men to submit articles to elite journals. It also finds that there is no relation between gender and reported desk rejection. The study is an important contribution to work on gender bias in the scientific literature.

    2. Joint Public Review:

      Summary from an earlier round of review:

      This paper summarises responses from a survey completed by around 5,000 academics on their manuscript submission behaviours. The authors find several interesting stylised facts, including (but not limited to):

      - Women are less likely to submit their papers to highly influential journals (e.g., Nature, Science and PNAS).

      - Women are more likely to cite the demands of co-authors as a reason why they didn’t submit to highly influential journals.

      - Women are also more likely to say that they were advised not to submit to highly influential journals.

      The paper highlights an important point, namely that the submission behaviours of men and women scientists may not be the same (either due to preferences that vary by gender, selection effects that arise earlier in scientists’ careers or social factors that affect men and women differently and also influence submission patterns). As a result, simply observing gender differences in acceptance rates - or a lack thereof - should not be automatically interpreted as evidence for or against discrimination (broadly defined) in the peer review process.

      Editor’s note: This is the third version of this article.

      Comments made during the peer review of the second version, along with author’s responses to these comments, are available below. Revisions made in response to these comments include changing the colour scheme used for the figures to make the figures more accessible for readers with certain forms of colour blindness.

      Comments made during the peer review of the first version, along with author’s responses to these comments, are available with previous versions of the article.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary

      This paper summarises responses from a survey completed by around 5,000 academics on their manuscript submission behaviours. The authors find several interesting stylised facts, including (but not limited to):

      Women are less likely to submit their papers to highly influential journals (e.g., Nature, Science and PNAS).

      Women are more likely to cite the demands of co-authors as a reason why they didn't submit to highly influential journals.

      Women are also more likely to say that they were advised not to submit to highly influential journals.

      The paper highlights an important point, namely that the submission behaviours of men and women scientists may not be the same (either due to preferences that vary by gender, selection effects that arise earlier in scientists' careers or social factors that affect men and women differently and also influence submission patterns). As a result, simply observing gender differences in acceptance rates - or a lack thereof - should not be automatically interpreted as evidence for or against discrimination (broadly defined) in the peer review process.

      Major comments

      What do you mean by bias?

      In the second paragraph of the introduction, it is claimed that "if no biases were present in the case of peer review, then we should expect the rate with which members of less powerful social groups enjoy successful peer review outcomes to be proportionate to their representation in submission rates." There are a couple of issues with this statement.

      First, the authors are implicitly making a normative assumption that manuscript submission and acceptance rates *should* be equalised across groups. This may very well be the case, but there can also be valid reasons - even when women are not intrinsically better at research than men - why a greater fraction of female-authored submissions are accepted relative to male-authored submissions (or vice versa). For example, if men are more likely to submit their less ground-breaking work, then one might reasonably expect that they experience higher rejection rates compared to women, conditional on submission.

      We do assume that normative statement: unless we believe that men’s papers are intrinsically better than women’s papers, the acceptance rate should be the same. But the referee is right: we have no way of controlling for the intrinsic quality of the work of men and women. That said, our manuscript does not show that there is a different acceptance rate for men and women; it shows that women are less likely to submit papers to a subset of journals that are of a lower Journal Impact Factor, controlling for their most cited paper, in an attempt to control for intrinsic quality of the manuscripts.

      Second, I assume by "bias", the authors are taking a broad definition, i.e., they are not only including factors that specifically relate to gender but also factors that are themselves independent of gender but nevertheless disproportionately are associated with one gender or another (e.g., perhaps women are more likely to write on certain topics and those topics are rated more poorly by (more prevalent) male referees; alternatively, referees may be more likely to accept articles by authors they've met before, most referees are men and men are more likely to have met a given author if he's male instead of female). If that is the case, I would define more clearly what you mean by bias. (And if that isn't the case, then I would encourage the authors to consider a broader definition of "bias"!)

      Yes, the referee is right that we are taking a broad definition of bias. We provide a definition of bias on page 3, line 92. This definition is focused on differential evaluation which leads to differential outcomes. We also hedge our conversation (e.g., page 3, line 104) to acknowledge that observations of disparities may only be an indicator of potential bias, as many other things could explain the disparity. In short, disparities are a necessary but insufficient indicator of bias. We add a line in the introduction to reinforce this. The only other reference to the term bias comes on page 10, line 276. We add a reference to Lee here to contextualize.

      Identifying policy interventions is not a major contribution of this paper

      I would take out the final sentence in the abstract. In my opinion, your survey evidence isn't really strong enough to support definitive policy interventions to address the issue and, indeed, providing policy advice is not a major - or even minor - contribution of your paper. (Basically, I would hope that someone interested in policy interventions would consult another paper that much more thoughtfully and comprehensively discusses the costs and benefits of various interventions!) While it's fine to briefly discuss them at the end of your paper - as you currently do - I wouldn't highlight that in the abstract as being an important contribution of your paper.

      We thank the referee for this comment. While we agree that our results do not lead to definitive policy interventions, we believe that our findings point to a phenomenon that should be addressed through policy interventions. Given that some interventions are proposed in our conclusion, we feel like stating this in the abstract is coherent.

      Minor comments

      What is the rationale for conditioning on academic rank and does this have explanatory power on its own - i.e., does it at least superficially potentially explain part of the gender gap in intention to submit?

      Thank you for this thoughtful question. We conditioned on academic rank in all regression analyses to account for structural differences in career stage that may potentially influence submission behaviors. Academic rank (e.g., assistant, associate, full professor) is a key determinant of publishing capacity and strategic considerations, such as perceived likelihood of success at elite journals, tolerance for risk, and institutional expectations for publication venues.

      Importantly, academic rank is also correlated with gender due to cumulative career disadvantages that contribute to underrepresentation of women at more senior levels. Failing to adjust for rank would conflate gender effects with differences attributable to career stage. By including rank as a covariate, we aim to isolate gender-associated patterns in submission behavior within comparable career stages, thereby producing a more precise estimate of the gender effect.

      Regarding explanatory power, academic rank does indeed contribute significantly to model fit across our analyses, indicating that it captures meaningful variation in submission behavior. However, even after adjusting for rank, we continue to observe significant gender differences in submission patterns in several disciplines. This suggests that while academic rank explains part of the variation, it does not fully account for the gender gap—highlighting the importance of examining other structural and behavioral factors that shape the publication trajectory.

      Reviewer #2 (Public review):

      Basson et al. present compelling evidence supporting a gender disparity in article submission to "elite" journals. Most notably, they found that women were more likely to avoid submitting to one of these journals based on advice from a colleague/mentor. Overall, this work is an important addition to the study of gender disparities in the publishing process.

      I thank the authors for addressing my concerns.

      Reviewer #4 (Public review):

      Main strengths

      The topic of the MS is very relevant given that across the sciences/academia, genders are unevenly represented, which has a range of potential negative consequences. To change this, we need to have the evidence on what mechanisms cause this pattern. Given that promotion and merit in academia are still largely based on the number of publications and the impact factor, one part of the gap likely originates from differences in publication rates of women compared to men.

      Women are underrepresented compared to men in journals with a high impact factor. While previous work has detected this gap and identified some potential mechanisms, the current MS provides strong evidence that this gap might be due to a lower submission rate of women compared to men, rather than the rejection rates. These results are based on a survey of close to 5000 authors. The survey seems to be conducted well (though I am not an expert in surveys), and data analysis is appropriate to address the main research aims. It was impossible to check the original data because of the privacy concerns.

      Interestingly, the results show no gender bias in rejection rates (desk rejection or overall) in three high-impact journals (Science, Nature, PNAS). However, submission rates are lower for women compared to men, indicating that gender biases might act through this pathway. The survey also showed that women are more likely to rate their work as not groundbreaking and are advised not to submit to prestigious journals, indicating that both intrinsic and extrinsic factors shape women's submission behaviour.

      With these results, the MS has the potential to inform actions to reduce gender bias in publishing, but also to inform assessment reform at a larger scale.

      I do not find any major weaknesses in the revised manuscript.

      Reviewer #4 (Recommendations for the authors):

      (1) Colour schemes of the Figures are not adjusted for colour-blindness (red-green is a big NO), some suggestions can be found here https://www.nceas.ucsb.edu/sites/default/files/2022-06/Colorblind%20Safe%20Color%20Schemes.pdf

      We appreciate the suggestion. We’ve adjusted the colors in the manuscript to be color-blind friendly using one of the colorblind safe palettes suggested by the reviewer.

      (2) I do not think that the authors have fully addressed the comment about APCs and the decision to submit, given that PNAS has publication charges that amount to double of someone's monthly salary. I would add a sentence or two to explain that publication charges should not be a factor for Nature and Science, but might be for PNAS.

      While APCs are definitely a factor affecting researchers’ submission behavior, they mostly do so for lower-prestige journals rather than for the three elite journals analyzed here. As mentioned in the previous round of revisions, Nature and Science have subscription options. And PNAS authors without funding have access to waivers: https://www.pnas.org/author-center/publication-charges

      (3) Line 268, the first suggestion here is not something that would likely work. Thus, I would not put it as the first suggestion.

      We made the suggested change.

      (4) Data availability - remove AND in 'Aggregated and de-identified data' because it sounds like both are shared. Suggest writing: 'Aggregated, de-identified data..'. I still suggest sharing data/code in a trusted repository (e.g. Dryad, ZENODO...) rather than on GitHub, as per the current recommendation on the best practices for data sharing.

      Thank you for your comment regarding data availability. Due to IRB restrictions and the conditions of our ethics approval, we are not permitted to share the survey data used in this study. However, to support transparency and reproducibility, we have made all analysis code available on Zenodo at https://doi.org/10.5281/zenodo.16327580. In addition, we have included a synthetic dataset with the same structure as the original survey data but containing randomly generated values. This allows others to understand the data structure and replicate our analysis pipeline without compromising participant confidentiality.

    1. Now imagine getting trapped in that same unhelpful loop when you’re trying to get welfare benefits, seek housing, apply for a job, or secure a loan. It’s clear how the impacts of these systems aren’t evenly felt even if all that garbage is cleaned up.

      As someone who has made at least 10 tech support phone calls this week with no help because they were all chat bots, this is terrifying

    2. A now-defunct AI recruiting tool created by Amazon taught itself male candidates were preferable, after being trained on mostly male résumés. Biased data can have widespread effects that touch the lives of real people.

      Maybe there can be a kind of IRB regulation for the data that AI is trained on, but instead of focusing on the ethics of how participants are treated, it reviews the ethics of the data before introducing it to AI

    3. It relies on a branch of artificial intelligence — statistical machine learning — to recognize patterns rather than produce new text.

      I feel like this relates to some of our other readings about how AI works and why it chooses the words it does

    4. “The training data has been shown to have problematic characteristics resulting in models that encode stereotypical and derogatory associations along gender, race, ethnicity, and disability status,”

      I wonder why this wasn't reason enough for Google to start rolling back on their AI initiatives

    5. The company’s Responsible AI initiative, which looked at the social implications of artificial intelligence — including “generative” AI systems

      I wonder how bad the consequences would have to get before Google would decide to start rescinding their AI

    6. the problems with AI aren’t hypothetical

      I oftentimes think about the future consequences of AI, such as becoming overly dependent on it, but this perspective comes from a place of ignorance about the human mechanics of AI

    1. I relate to X when he says how he feels limited when you don't have the words to express yourself because sometimes I feel like I know what I mean in my head, but I don't know how to put it into words.

    2. When Malcolm X talks about copying the dictionary word for word I'm surprised he's able to do that. I wouldn't have the patience to do that but it shows how bad he wanted knowledge.

    1. eLife Assessment

      This valuable study introduces a modern and accessible PyTorch reimplementation of the widely used SpliceAI model for splice site prediction. The authors provide convincing evidence that their OpenSpliceAI implementation matches the performance of the original while improving usability and enabling flexible retraining across species. These advances are likely to be of broad interest to the computational genomics community.

    2. Reviewer #1 (Public review):

      Summary:

      Chao et al. produced an updated version of the SpliceAI package using modern deep learning frameworks. This includes data preprocessing, model training, direct prediction, and variant effect prediction scripts. They also added functionality for model fine-tuning and model calibration. They convincingly evaluate their newly trained models against those from the original SpliceAI package and investigate how to extend SpliceAI to make predictions in new species. Their comparisons to the original SpliceAI models are convincing on the grounds of model performance and their evaluation of how well the new models match the original's understanding of non-local mutation effects. However, their evaluation of the new calibration functionality would benefit from a more nuanced discussion of the limitations of calibration.

      Strengths

      (1) They provide convincing evidence that their new implementation of SpliceAI matches the performance and mutation effect estimation capabilities of the original model on a similar dataset while benefiting from improved computational efficiencies. This will enable faster prediction and retraining of splicing models for new species as well as easier integration with other modern deep learning tools.

      (2) They produce models with strong performance on non-human model species and a simple, well-documented pipeline for producing models tuned for any species of interest. This will be a boon for researchers working on splicing in these species and make it easy for researchers working on new species to generate their own models.

      (3) Their documentation is clear and abundant. This will greatly aid the ability of others to work with their code base.

      Weaknesses

      (1) Their discussion of their package's calibration functionality does not adequately acknowledge the limitations of model calibration. This is problematic as this is a package intended for general use, and users who are not experienced in modeling broadly and the subfield of model calibration specifically may not already understand these limitations. This could lead to serious errors and misunderstandings down the road. A model is not calibrated or uncalibrated in and of itself, only with respect to a specific dataset. In this case, they calibrated with respect to the training dataset, a set of canonical transcript annotations. This is a perfectly valid and reasonable dataset to calibrate against. However, this is unlikely to be the dataset the model is applied to in any downstream use case, and this calibration is not guaranteed or expected to hold for any shift in the dataset distribution. For example, in the next section they use ISM-based approaches to evaluate which sequence elements the model is sensitive to, and their calibration would not be expected to hold for this set of predictions. This issue is particularly worrying in the case of their model because annotation of canonical transcript splice sites is a task that it is unlikely their model will be applied to after training. Much more likely tasks will be things such as predicting the effects of mutations, identification of splice sites that may be used across isoforms beyond just the canonical one, identification of regulatory sequences through ISM, or evaluation of human-created sequences for design or evaluation purposes (such as in the context of an MPSA or designing a gene to splice a particular way); we would not expect their calibration to hold in any of these contexts. To resolve this issue, the authors should clarify and discuss this limitation in their paper (and in the relevant sections of the package documentation) to avoid confusing downstream users.

      (2) The clarity of their analysis of mutation effects could be improved with some minor adjustments. While they report median ISM importance correlation it would be helpful to see a histogram of the correlations they observed. Instead of displaying (and calculating correlations using) importance scores of only the reference sequence, showing the importance scores for each nucleotide at each position provides a more informative representation. This would also likely make the plots in 6B clearer.

    3. Reviewer #2 (Public review):

      Summary:

      The paper by Chao et al offers a reimplementation of the SpliceAI algorithm in PyTorch so that the model can more easily and efficiently be retrained. They apply their new implementation of the SpliceAI algorithm, which they call OpenSpliceAI, to several species and compare it against the original model, showing that the results are very similar and that in some small species, pre-training on other species helps improve performance.

      Strengths:

      On the upside, the code runs fine and it is well documented.

      Weaknesses:

      The paper itself does not offer much beyond reimplementing SpliceAI. There is no new algorithm, new analysis, new data, or new insights into RNA splicing. There is not even any comparison to many of the alternative methods that have since been published to surpass SpliceAI. Given that some of the authors are well known with a long history of important contributions, our expectations were admittedly different. Still, we hope some readers will find the new implementation useful.

      Update for the revised version:

      The update includes mostly clarifications for technical questions/comments raised by the other two reviewers. There are no additional analyses or results that change our above initial assessment of this paper's contribution.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Chao et al. produced an updated version of the SpliceAI package using modern deep learning frameworks. This includes data preprocessing, model training, direct prediction, and variant effect prediction scripts. They also added functionality for model fine-tuning and model calibration. They convincingly evaluate their newly trained models against those from the original SpliceAI package and investigate how to extend SpliceAI to make predictions in new species. While their comparisons to the original SpliceAI models are convincing on the grounds of model performance, their evaluation of how well the new models match the original's understanding of non-local mutation effects is incomplete. Further, their evaluation of the new calibration functionality would benefit from a more nuanced discussion of what set of splice sites their calibration is expected to hold for, and tests in a context for which calibration is needed.

      Strengths:

      (1) They provide convincing evidence that their new implementation of SpliceAI matches the performance of the original model on a similar dataset while benefiting from improved computational efficiencies. This will enable faster prediction and retraining of splicing models for new species as well as easier integration with other modern deep learning tools.

      (2) They produce models with strong performance on non-human model species and a simple, well-documented pipeline for producing models tuned for any species of interest. This will be a boon for researchers working on splicing in these species and make it easy for researchers working on new species to generate their own models.

      (3) Their documentation is clear and abundant. This will greatly aid the ability of others to work with their code base.

      We thank the reviewer for these positive comments.  

      Weaknesses:

      (1) The authors' assessment of how much their model retains SpliceAI's understanding of "nonlocal effects of genomic mutations on splice site location and strength" (Figure 6) is not sufficiently supported. Demonstrating this would require showing that for a large number of (non-local) mutations, their model shows the same change in predictions as SpliceAI or that attribution maps for their model and SpliceAI are concordant even at distances from the splice site. Figure 6A comes close to demonstrating this, but only provides anecdotal evidence as it is limited to 2 loci. This could be overcome by summarizing the concordance between ISM maps for the two models and then comparing across many loci. Figure 6B also comes close, but falls short because instead of comparing splicing prediction differences between the models as a function of variants, it compares the average prediction difference as a function of the distance from the splice site. This limits it to only detecting differences in the model's understanding of the local splice site motif sequences. This could be overcome by looking at comparisons between differences in predictions with mutants directly and considering non-local mutants that cause differences in splicing predictions.

      We agree that two loci are insufficient to demonstrate preservation of non-local effects. To address this, we have extended our analysis to a larger set of sites: we randomly sampled 100 donor and 100 acceptor sites, applied our ISM procedure over a 5,001 nt window centered at each site for both models, and computed the ISM map as before. We then calculated the Pearson correlation between the collection of OSAI<sub>MANE</sub> and SpliceAI ISM importance scores. We also created 10 additional ISM maps similar to those in Figure 6A, which are now provided in Figure S23.

      Following is the revised paragraph in the manuscript’s Results section:

      First, we recreated the experiment from Jaganathan et al. in which they mutated every base in a window around exon 9 of the U2SURP gene and calculated its impact on the predicted probability of the acceptor site. We repeated this experiment on exon 2 of the DST gene, again using both SpliceAI and OSAI<sub>MANE</sub>. In both cases, we found a strong similarity between the patterns produced by SpliceAI and OSAI<sub>MANE</sub>, as shown in Figure 6A. To evaluate concordance more broadly, we randomly selected 100 donor and 100 acceptor sites and performed the same ISM experiment on each site. The Pearson correlation between SpliceAI and OSAI<sub>MANE</sub> importance scores yielded an overall median of 0.857 (see Methods; additional DNA logos in Figure S23).

      To characterize the local sequence features that both models focus on, we computed the average decrease in predicted splice-site probability resulting from each of the three possible single-nucleotide substitutions at every position within 80 bp for 100 donor and 100 acceptor sites randomly sampled from the test set (Chromosomes 1, 3, 5, 7, and 9). Figure 6B shows the average decrease in splice site strength for each mutation in the format of a DNA logo, for both tools.

      We added the following text to the Methods section:

      Concordance evaluation of ISM importance scores between OSAI<sub>MANE</sub> and SpliceAI

      To assess agreement between OSAI<sub>MANE</sub>  and SpliceAI across a broad set of splice sites, we applied our ISM procedure to 100 randomly chosen donor sites and 100 randomly chosen acceptor sites. For each site, we extracted a 5,001 nt window centered on the annotated splice junction and, at every coordinate within that window, substituted the reference base with each of the three alternative nucleotides. We recorded the change in predicted splice-site probability for each mutation and then averaged these Δ-scores at each position to produce a 5,001-score ISM importance profile per site.

      Next, for each splice site we computed the Pearson correlation coefficient between the paired importance profiles from ensembled OSAI<sub>MANE</sub> and ensembled SpliceAI. The median correlation was 0.857 for all splice sites. Ten additional zoom-in representative splice site DNA logo comparisons are provided in Supplementary Figure S23.
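      The ISM procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual code: the `predict` callable, the helper names, and the toy window size stand in for an ensembled OSAI<sub>MANE</sub> or SpliceAI forward pass.

      ```python
      import numpy as np

      BASES = "ACGT"

      def ism_profile(seq, site_index, predict, window=5001):
          """Average drop in predicted splice-site probability per position.

          `predict` is a hypothetical callable mapping a DNA string to the
          model's predicted probability that `site_index` is a splice site.
          """
          half = window // 2
          ref_score = predict(seq)
          profile = np.zeros(window)
          for offset in range(-half, half + 1):
              pos = site_index + offset
              if not 0 <= pos < len(seq):
                  continue  # window falls off the end of the sequence
              # Delta-score for each of the three alternative nucleotides
              deltas = [ref_score - predict(seq[:pos] + b + seq[pos + 1:])
                        for b in BASES if b != seq[pos]]
              profile[offset + half] = np.mean(deltas)
          return profile

      def ism_concordance(profile_a, profile_b):
          """Pearson correlation between two models' importance profiles."""
          return float(np.corrcoef(profile_a, profile_b)[0, 1])
      ```

      Running `ism_concordance` on the paired profiles from the two models at each of the 200 sampled sites, and taking the median of the resulting correlations, reproduces the kind of summary statistic reported above.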

      (2) The utility of the calibration method described is unclear. When thinking about a calibrated model for splicing, the expectation would be that the models' predicted splicing probabilities would match the true probabilities that positions with that level of prediction confidence are splice sites. However, the actual calibration that they perform only considers positions as splice sites if they are splice sites in the longest isoform of the gene included in the MANE annotation. In other words, they calibrate the model such that the model's predicted splicing probabilities match the probability that a position with that level of confidence is a splice site in one particular isoform for each gene, not the probability that it is a splice site more broadly. Their level of calibration on this set of splice sites may very well not hold to broader sets of splice sites, such as sites from all annotated isoforms, sites that are commonly used in cryptic splicing, or poised sites that can be activated by a variant. This is a particularly important point as much of the utility of SpliceAI comes from its ability to issue variant effect predictions, and they have not demonstrated that this calibration holds in the context of variants. This section could be improved by expanding and clarifying the discussion of what set of splice sites they have demonstrated calibration on, what it means to calibrate against this set of splice sites, and how this calibration is expected to hold or not for other interesting sets of splice sites. Alternatively, or in addition, they could demonstrate how well their calibration holds on different sets of splice sites or show the effect of calibrating their models against different potentially interesting sets of splice sites and discuss how the results do or do not differ.

      We thank the reviewer for highlighting the need to clarify our calibration procedure. Both SpliceAI and OpenSpliceAI are trained on a single “canonical” transcript per gene: SpliceAI on the hg19 Ensembl/Gencode canonical set and OpenSpliceAI on the MANE transcript set. To calibrate each model, we applied post-hoc temperature scaling, i.e. a single learnable parameter that rescales the logits before the softmax. This adjustment does not alter the model’s ranking or discrimination (AUC/precision–recall) but simply aligns the predicted probabilities for donor, acceptor, and non-splice classes with their observed frequencies. As shown in our reliability diagrams (Fig. S16-S22), temperature scaling yields negligible changes in performance, confirming that both SpliceAI and OpenSpliceAI were already well-calibrated. However, we acknowledge that we didn’t measure how calibration might affect predictions on non-canonical splice sites or on cryptic splicing. It is possible that calibration might have a detrimental effect on those, but because this is not a key claim of our paper, we decided not to do further experiments. We have updated the manuscript to acknowledge this potential shortcoming; please see the revised paragraph in our next response.
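      Post-hoc temperature scaling of this kind can be sketched in PyTorch. This is a hedged illustration, not OpenSpliceAI's actual calibration code: the `fit_temperature` name, the (N, 3) logit layout (non-splice / acceptor / donor), and the optimizer settings are assumptions.

      ```python
      import torch
      import torch.nn.functional as F

      def fit_temperature(logits, labels, max_iter=100):
          """Fit a single temperature T by minimizing NLL on held-out data.

          `logits` has shape (N, 3) and `labels` holds class indices.
          T > 1 softens over-confident predictions; T < 1 sharpens them.
          """
          # Optimize log T so the temperature stays positive.
          log_t = torch.zeros(1, requires_grad=True)
          opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=max_iter)

          def closure():
              opt.zero_grad()
              loss = F.cross_entropy(logits / log_t.exp(), labels)
              loss.backward()
              return loss

          opt.step(closure)
          return log_t.exp().item()
      ```

      Calibrated probabilities are then `F.softmax(logits / T, dim=-1)`. Because T is a single scalar, the argmax and class ranking are unchanged, which is why discrimination metrics such as AUC are unaffected.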

      (3) It is difficult to assess how well their calibration method works in general because their original models are already well calibrated, so their calibration method finds temperatures very close to 1 and only produces very small and hard to assess changes in calibration metrics. This makes it very hard to distinguish if the calibration method works, as it doesn't really produce any changes. It would be helpful to demonstrate the calibration method on a model that requires calibration or on a dataset for which the current model is not well calibrated, so that the impact of the calibration method could be observed.

      It’s true that the models we calibrated didn’t need many changes. It is possible that the calibration methods we used (which were not ours, but which were described in earlier publications) can’t improve the models much. We toned down our comments about this procedure, as follows.

      Original:

      “Collectively, these results demonstrate that OSAIs were already well-calibrated, and this consistency across species underscores the robustness of OpenSpliceAI’s training approach in diverse genomic contexts.”

      Revised:

      “We observed very small changes after calibration across phylogenetically diverse species, suggesting that OpenSpliceAI’s training regimen yielded well‐calibrated models, although it is possible that a different calibration algorithm might produce further improvements in performance.”

      Reviewer #2 (Public review):

      Summary:

      The paper by Chao et al offers a reimplementation of the SpliceAI algorithm in PyTorch so that the model can more easily/efficiently be retrained. They apply their new implementation of the SpliceAI algorithm, which they call OpenSpliceAI, to several species and compare it against the original model, showing that the results are very similar and that in some small species, pretraining on other species helps improve performance.

      Strengths:

      On the upside, the code runs fine, and it is well documented.

      Weaknesses:

      The paper itself does not offer much beyond reimplementing SpliceAI. There is no new algorithm, new analysis, new data, or new insights into RNA splicing. There is no comparison to many of the alternative methods that have since been published to surpass SpliceAI. Given that some of the authors are well-known with a long history of important contributions, our expectations were admittedly different. Still, we hope some readers will find the new implementation useful.

      We thank the reviewer for the feedback. We have clarified that OpenSpliceAI is an open-source PyTorch reimplementation optimized for efficient retraining and transfer learning, designed to analyze cross-species performance gains, and supported by a thorough benchmark and the release of several pretrained models to clearly position our contribution.

      Reviewer #3 (Public review):

      Summary:

      The authors present OpenSpliceAI, a PyTorch-based reimplementation of the well-known SpliceAI deep learning model for splicing prediction. The core architecture remains unchanged, but the reimplementation demonstrates convincing improvements in usability, runtime performance, and potential for cross-species application.

      Strengths:

      The improvements are well-supported by comparative benchmarks, and the work is valuable given its strong potential to broaden the adoption of splicing prediction tools across computational and experimental biology communities.

      Major comments:

      Can fine-tuning also be used to improve prediction for human splicing? Specifically, are models trained on other species and then fine-tuned with human data able to perform better on human splicing prediction? This would enhance the model's utility for more users, and ideally, such fine-tuned models should be made available.

      We evaluated transfer learning by fine-tuning models pretrained on mouse (OSAI<sub>Mouse</sub>), honeybee (OSAI<sub>Honeybee</sub>), Arabidopsis (OSAI<sub>Arabidopsis</sub>), and zebrafish (OSAI<sub>Zebrafish</sub>) on human data. While transfer learning accelerated convergence compared to training from scratch, the final human splicing prediction accuracy was comparable between fine-tuned and scratch-trained models, suggesting that performance on our current human dataset is nearing saturation under this architecture.

      We added the following paragraph to the Discussion section:

      We also evaluated pretraining on mouse (OSAI<sub>Mouse</sub>), honeybee (OSAI<sub>Honeybee</sub>), zebrafish (OSAI<sub>Zebrafish</sub>), and Arabidopsis (OSAI<sub>Arabidopsis</sub>) followed by fine-tuning on the human MANE dataset. While cross-species pretraining substantially accelerated convergence during fine-tuning, the final human splicing-prediction accuracy was comparable to that of a model trained from scratch on human data. This result suggests that our architecture captures all relevant splicing features from human training data alone, and thus gains little or no benefit from cross-species transfer learning in this context (see Figure S24).

      Reviewer #1 (Recommendations for the authors):

      We thank the editor for summarizing the points raised by each reviewer. Below is our point-by-point response to each comment:

      (1) In Figure 3 (and generally in the other figures) OpenSpliceAI should be replaced with OSAI_{Training dataset} because otherwise it is hard to tell which precise model is being compared. And in Figure 3 it is especially important to emphasize that you are comparing a SpliceAI model trained on Human data to an OSAI model trained and evaluated on a different species.

      We have updated the labels in Figure 3, replacing “OpenSpliceAI” with “OSAI_{training dataset}” to more clearly specify which model is being compared.

      (2) Are genes paralogous to training set genes removed from the validation set as well as the test set? If you are worried about data leakage in the test set, it makes sense to also consider validation set leakage.

      Thank you for this helpful suggestion. We fully agree, and to avoid any data leakage we implemented the identical filtering pipeline for both validation and test sets: we excluded all sequences paralogous or homologous to sequences in the training set, and further removed any sequence sharing >80% length overlap and >80% sequence identity with training sequences. The effect of this filtering on the validation set is summarized in Supplementary Figure S7C.

      Reviewer #3 (Recommendations for the authors):

      (1) The legend in Figure 3 is somewhat confusing. The labels like "SpliceAI-Keras (species name)" may imply that the model was retrained using data from that species, but that's not the case, correct?

      Yes, “SpliceAI-Keras (species name)” was not retrained; it refers to the released SpliceAI model evaluated on the specified species dataset. We have revised the Figure 3 legends, changing “SpliceAI-Keras (species name)” to “SpliceAI-Keras” to clarify this.

      (2) Please address the minor issues with the code, including ensuring the conda install works across various systems.

      We have addressed the issues you mentioned. OpenSpliceAI is now available on Conda and can be installed with `conda install openspliceai`.

      The conda package homepage is at https://anaconda.org/khchao/openspliceai. We’ve also corrected all broken links in the documentation.

      (3) Utility:

      I followed all the steps in the Quick Start Guide, and aside from the issues mentioned below, everything worked as expected.

      I attempted installation using conda as described in the instructions, but it was unsuccessful. I assume this method is not yet supported.

      In Quick Start Guide: predict, the link labeled "GitHub (models/spliceai-mane/10000nt/)" appears to be incorrect. The correct path is likely "GitHub (models/openspliceai-mane/10000nt/)".

      In Quick Start Guide: variant (https://ccb.jhu.edu/openspliceai/content/quick_start_guide/quickstart_variant.html#quick-start-variant), some of the download links for input files were broken. While I was able to find some files in the GitHub repository, I think the -A option should point to data/grch37.txt, not examples/data/input.vcf, and the -I option should be examples/data/input.vcf, not data/vcf/input.vcf.

      Thank you for catching these issues. We’ve now addressed all issues concerning Conda installation and file links. We thank the editor for thoroughly testing our code and reviewing the documentation.

    1. Avoid sending harsh or demanding emails or messages when you are panicked, frustrated, or angry. Walk away from your computer and return at a later time when you feel calmer. Then re-read the instructions, or syllabus, or the course materials you find confusing, and if you still cannot find the answer because it is not there, definitely email or message your instructor.

      Even when you are under pressure, it is best to stay respectful and remember that teachers and professors are human too. Beyond reaching out to your instructor, you can also consult other resources, such as the assignment instructions, to decide whether you truly need help or whether your question can be answered with what was already given.

    1. You’ll need to learn to communicate effectively using the genres of the discourse community of your workplace, and this might mean asking questions of more experienced discourse community members, analyzing models of the types of genres you’re expected to use to communicate, and thinking about the most effective style, tone, format, and structure for your audience and purpose. Some workplaces have guidelines for how to write in the genres of the discourse community, and some workplaces will initiate you to their genres by trial and error.

      Communication is key no matter where you are, but adapting to the diction one field uses compared to another can make it hard to fit in. That is why workplaces offer support, such as written guidelines and more experienced discourse community members, to help every newcomer learn.

    2. Just as discourse communities have specialized vocabularies and standards, different discourse communities pursue different kinds of questions. Let’s take a big problem like global climate change and focus on Alaska. An environmental scientist, a pathologist, an economist, and an anthropologist would raise different kinds of questions about the same problem.

      All of these discourse communities differ in their diction, their ways of explaining things, and their purposes. Though one field's questions may not be understood by outsiders, each examines a different aspect of the same problem, depending on its members' training.

    1. Some texts make the task of identifying main points relatively easy. Textbooks, for instance, include the aforementioned features as well as headings and subheadings intended to make it easier for students to identify core concepts. Graphic features, such as sidebars, diagrams, and charts, help students understand complex information and distinguish between essential and inessential points. When you are assigned to read from a textbook, be sure to use available comprehension aids to help you identify the main points.

      Textbooks provide plenty of tools, such as headings, sidebars, and diagrams, because they are there to help. They are not meant to be ignored, so it is best to use them to better understand what you are reading.

    2. setting a purpose for your reading. Knowing what you want to achieve from a reading assignment not only helps you determine how to approach that task, but it also helps you stay focused during those moments when you are up late, already tired, or unmotivated

      Setting goals along the way helps you pour your focus into a text. You don't need to create a huge purpose; taking it one step at a time is enough to make progress.