Reviewer #2 (Public review):
Summary:
Using a gerbil model, the authors tested the hypothesis that loss of synapses between sensory hair cells and auditory nerve fibers (which may occur due to noise exposure or aging) affects behavioral discrimination of the rapid temporal fluctuations of sounds. In contrast to previous suggestions in the literature, their results do not support this hypothesis; young animals treated with a compound that reduces the number of synapses did not show impaired discrimination compared to controls. Additionally, their results from older animals showing impaired discrimination suggest that age-related changes aside from synaptopathy are responsible for the age-related decline in discrimination.
Strengths:
(1) The rationale and hypothesis are well-motivated and clearly presented.
(2) The study was well conducted, with largely strong methodology and good experimental control. The combination of physiological and behavioral techniques is powerful and informative. Reducing synapse counts fairly directly using ouabain is a cleaner design than using noise exposure or age (as in other studies), since these latter modifiers have additional effects on auditory function.
(3) The study may have a considerable impact on the field. The findings could have important implications for our understanding of cochlear synaptopathy, one of the most highly researched and potentially impactful developments in hearing science in the past fifteen years.
Weaknesses:
(1) I have concerns that the gerbils may not have been performing the behavioral task using temporal fine structure information.
Human studies using the same task employed a filter center frequency that was (at least) 11 times the fundamental frequency (Marmel et al., 2015; Moore and Sek, 2009). Moore and Sek wrote: "the default (recommended) value of the centre frequency is 11F0." Here, the center frequency was only 4 or 8 times the fundamental frequency (4F0 or 8F0). Hence, relative to the center frequency, the harmonic spacing was considerably greater in the present study, which, all else being equal, makes it more likely that individual harmonics were resolved and that excitation pattern cues were available. However, gerbil auditory filters are thought to be broader than those in humans. In the revised version of the manuscript, the authors provide modelling results suggesting that the excitation patterns were discriminable for the 4F0 conditions, but may not have been for the 8F0 conditions. These results provide some reassurance that the 8F0 discriminations were dependent on temporal cues, but the description of the model lacks detail. Also, the authors state that "thus, for these two conditions with harmonic number N of 8 the gerbils cannot rely on differences in the excitation patterns but must solve the task by comparing the temporal fine structure." This is too strong. Pulsed tone intensity difference limens (the reference used for establishing whether or not the excitation pattern cues were usable) may not be directly comparable to profile-analysis-like conditions, and it has been argued that frequency discrimination may be more sensitive to excitation pattern cues than predicted from a simple comparison to intensity difference limens (Micheyl et al., 2013, https://doi.org/10.1371/journal.pcbi.1003336).
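To make the comparison of harmonic ranks concrete, the following is a minimal sketch of the arithmetic only. It assumes the passband is centered at N times F0, so that the spacing between adjacent harmonics (F0) is 1/N of the center frequency. The function name is hypothetical, and the sketch says nothing about whether harmonics are resolved in the gerbil, which depends on species-specific filter bandwidths that are not modelled here.

```python
# Harmonic spacing (F0) expressed as a fraction of the passband center
# frequency (N * F0). Illustrative arithmetic only; no auditory-filter model.

def relative_spacing(harmonic_rank: float) -> float:
    """Return F0 / (N * F0) = 1 / N for a passband centered on harmonic N."""
    if harmonic_rank <= 0:
        raise ValueError("harmonic rank must be positive")
    return 1.0 / harmonic_rank

# Ranks mentioned in the review: 4 and 8 (present study), 11 (Moore and Sek default).
for rank in (4, 8, 11):
    print(f"center = {rank}F0: spacing = {100 * relative_spacing(rank):.1f}% of center frequency")
```

On this measure, the spacing is 25% of the center frequency at 4F0 and 12.5% at 8F0, against roughly 9% at the 11F0 default used in the human studies.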
I'm also somewhat concerned that the masking noise used in the present study was too low in level to mask cochlear distortion products. Based on their excitation pattern modelling, the authors state (without citation) that "since the level of excitation produced by the pink noise is less than 30 dB below that produced by the complex tones, distortion products will be masked." The basis for this claim is not clear. In humans, distortion products may be only ~20 dB below the levels of the primaries (referenced to an external sound masker / canceller, which is appropriate, assuming that the modelling reported in the present paper did not include middle-ear effects; see Norman-Haignere and McDermott, 2016, doi: 10.1016/j.neuroimage.2016.01.050). Oxenham et al. (2009, doi: 10.1121/1.3089220) provide further cautionary evidence on the potential use of distortion product cues when the background noise level is too low (in their case the relative level of the noise in the compromised condition was only a little below that used in the present study). The masking level used in the present study may have been sufficient, but it would be useful to have some further reassurance on this point.
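The concern can be illustrated with rough decibel bookkeeping. This is a sketch under strong simplifying assumptions: both quantities are treated as levels relative to the same reference (the complex-tone components), and frequency region, middle-ear effects, and the details of masking are ignored. The function name is hypothetical and the calculation is not taken from the manuscript.

```python
# Rough dB bookkeeping: where would a distortion product sit relative to the
# noise excitation? Both inputs are "dB below the tones" (positive numbers).

def dp_margin_re_noise(dp_below_primaries_db: float, noise_below_tones_db: float) -> float:
    """Return the distortion-product level minus the noise level, in dB.

    A positive value means the distortion product lies above the noise
    excitation and so might not be masked under these assumptions.
    """
    return noise_below_tones_db - dp_below_primaries_db

# ~20 dB below the primaries is the human figure cited in the review;
# 30 dB is the upper bound quoted from the manuscript for the noise.
print(dp_margin_re_noise(20.0, 30.0))
```

With noise excitation close to the quoted 30 dB bound and distortion products about 20 dB below the primaries, the margin is positive, which is why a bound of "less than 30 dB" does not by itself establish masking.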
(2) The synapse reductions in the high ouabain and old groups were relatively small (mean of 19 synapses per hair cell compared to 23 in the young untreated group). In contrast, in some mouse models of the effects of noise exposure or age, a 50% reduction in synapses is observed, and in the human temporal bone study of Wu et al. (2021, https://doi.org/10.1523/JNEUROSCI.3238-20.2021) the age-related reduction in auditory nerve fibres was ~50% or greater for the highest age group across cochlear location. It could simply be that the synapse loss in the present study was too small to produce significant behavioral effects. Hence, although the authors provide evidence that in the gerbil model the age-related behavioral effects are not due to synaptopathy, this may not translate to other species (including humans).
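For scale, the group means quoted above can be converted to a percentage loss and set against the 50% benchmark. This is a minimal sketch using only the two means given in this review; the function name is hypothetical.

```python
# Percentage synapse loss implied by the group means quoted in the review,
# compared with the ~50% loss reported in some mouse and human studies.

def percent_loss(control_mean: float, treated_mean: float) -> float:
    """Return the reduction from control to treated, as a percentage of control."""
    return 100.0 * (control_mean - treated_mean) / control_mean

young_untreated = 23.0   # mean synapses per hair cell (from the review)
ouabain_or_old = 19.0    # mean synapses per hair cell (from the review)

print(f"observed loss: {percent_loss(young_untreated, ouabain_or_old):.1f}%")
print(f"count at 50% loss: {young_untreated * 0.5:.1f} synapses per hair cell")
```

The observed loss is about 17%, whereas a 50% loss would correspond to a mean of 11.5 synapses per hair cell.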
(3) The study was not pre-registered, and there was no a priori power calculation, so there is less confidence in replicability than could have been the case. Only three old animals were used in the behavioral study, which raises concerns about the reliability of comparisons involving this group. Statistical analyses on very small samples can be unreliable due to problems of power, generalisability, and susceptibility to outliers.
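One way to see the limitation of a group of three is to ask what the smallest achievable p-value would be for an exact rank-based two-sample comparison. The sketch below assumes an exact two-sided Mann-Whitney (rank-sum) test with no ties, which is not necessarily the test the authors used. The comparison-group sizes are hypothetical, and the function name is an illustration.

```python
from math import comb

# For an exact two-sided rank-sum test without ties, the most extreme outcome
# (complete separation of the groups) has probability 1 / C(n_a + n_b, n_a)
# in each direction, so the smallest possible two-sided p-value is twice that.

def min_two_sided_p(n_a: int, n_b: int) -> float:
    """Return the smallest two-sided p-value attainable with group sizes n_a and n_b."""
    if n_a < 1 or n_b < 1:
        raise ValueError("both groups need at least one observation")
    return min(1.0, 2 / comb(n_a + n_b, n_a))

# Three old animals compared against hypothetical group sizes of 3, 4, and 5.
for n_other in (3, 4, 5):
    print(f"n = 3 vs n = {n_other}: minimum p = {min_two_sided_p(3, n_other):.4f}")
```

Under these assumptions, three animals compared with three others can never reach p < 0.05, however large the effect, and the comparison group must contain at least five animals before that threshold becomes attainable at all.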