Reviewer #2 (Public Review):
Summary:<br /> This work analyses the historical spread and evolution, termed 'population dynamics', of a human bacterial pathogen, Neisseria gonorrhoeae, the cause of the sexually transmitted infection gonorrhoea. N. gonorrhoeae is classified as a high-priority pathogen by the World Health Organisation: infections number in the tens of millions annually, levels of antibiotic resistance are high and no vaccine is available, so treating and preventing infections is becoming increasingly difficult. To implement interventions effectively, important resistant lineages and their transmission routes must be identified at national and international levels.
In this work, Osnes et al. use genomic data, coupled with geographic, temporal and demographic metadata, to analyse the global population dynamics of N. gonorrhoeae using 9,732 genomes. The study also includes a granular analysis of transmission between and within four regions of different sizes with high levels of data coverage: USA, Europe, Norway, and Victoria state in Australia.<br /> The authors built a phylogenetic tree including all genomes using a novel, computationally efficient method for removing genome regions resulting from recombination, which would otherwise result in incorrect branch lengths and tree topology. Using the tree, the authors show that the effective population size of N. gonorrhoeae, describing population size and diversity, decreased in the period from 2010 to the present day, and that this decrease was not entirely an artefact of sampling bias. The authors then stratified the tree based on isolates that contained alleles associated with resistance to antibiotics commonly used to treat gonorrhoea. The authors found that resistance was associated with particular lineages, most but not all of which shrank in effective population size in the last decade.<br /> Using the tree, the authors then inferred likely importation, exportation, and local transmission events, finding notable differences between locations in the contribution of imports to local incidence, as well as in the likelihood of exportation. As inference of these events relies on sampling density, the authors used a novel method for identifying whether sampling was representative of the population diversity of a given location. Using this approach, they found that the densely sampled regions, Norway and Victoria, were likely representative of the local N. gonorrhoeae population diversity, whilst the larger, less densely sampled regions, Europe and USA, were not.
Finally, they investigated the contribution of specific transmission networks to the spread of gonorrhoea, finding that the frequency of males within a transmission network may play a role in the rate of N. gonorrhoeae transmission in Norway, but not Victoria.<br /> This work introduces several novel approaches to the analysis of pathogen population dynamics, and highlights notable differences in N. gonorrhoeae transmission between and within distinct geographic locations.
Strengths:<br /> • The authors have collated a large global collection of N. gonorrhoeae genomes with associated metadata, and in some cases generated assemblies themselves. A dataset of this size and detail is a valuable asset to the public health community, enabling analysis of both national and international population dynamics.<br /> • The stratification of the phylogenetic tree by antimicrobial resistance gene alleles enables the study of how antibiotic usage has shaped global and regional N. gonorrhoeae populations. Analysis of changes in the effective population size of clades harbouring resistance alleles is particularly impactful, as this can be used to show how changes in treatment patterns affect the growth or decline of drug-resistant pathogen populations. This analysis also enables the determination of the frequency of multiple resistance alleles being present in single isolates, important for determining the scale of multidrug resistance within the N. gonorrhoeae global population.<br /> • The use of ancestral trait reconstruction to quantify importation, exportation and local transmission is an important contribution to public health efforts tackling N. gonorrhoeae spread. Understanding the differences in transmission networks within and between different geographic locations provides public health researchers with crucial information to model and implement effective targeted interventions on regional and international scales.
Weaknesses:<br /> • The method used to generate the phylogenetic tree and mask regions of recombination is likely flawed. The authors repeatedly down-sampled the whole population to 500 genomes, using Gubbins to identify regions that have recombined and therefore would not follow the clonal history of the N. gonorrhoeae population. This small sample size will result in the same ancient internal nodes being sampled repeatedly, whilst more recent internal nodes will not. More recent recombination events would consequently not be identified by this method and were likely included in the whole genome alignment used to build the tree. Furthermore, Gubbins was designed to identify recombination between closely related genomes, not across a whole species, where the background mutation rate will be too high to differentiate between recombined regions and the clonal frame. Both of these factors mean that the amount of the genome predicted to have recombined will likely be underestimated, resulting in inflated branch lengths and incorrect tree topology. This effect is potentially the cause of the observed drop in N. gonorrhoeae effective population size between 2010 and the present day in Figure 2, which does not align with gonorrhoea incidence, and of the elevated estimated mutation rate of 7.41 × 10⁻⁶ substitutions per site per year, which is higher than previous estimates based on N. gonorrhoeae global populations. The result of underestimating recombined regions will be two-fold. Inclusion of recombined regions in the alignment will result in inflated branch lengths, which will impact all estimates of effective population size in the study. Furthermore, tree topology may be incorrect, which will impact ancestral trait reconstruction and result in incorrect inference of import, export and local transmission events in Figures 3, 4 and 5.
Additionally, the clade-specific resistance gene analyses in Figure 2 will be affected, as certain isolates may be incorrectly included in or excluded from stratified clades. Therefore, the conclusions made about the changes in effective population size for the global population and individual clades, as well as the differences in transmission dynamics between locations, are likely to be incorrect.<br /> • The method used to identify sampling bias, shown in Figure 4, is a novel and interesting take on the problem. However, it is not clear whether the effect being measured is the presence of sampling bias or an artefact of differences in N. gonorrhoeae diversity between locations. The results in Figure 4 do align with what is known about the population datasets; the data from Norway and Victoria are more comprehensive than those from the USA and Europe due to the difference in size of the respective human populations, meaning the likelihood of sampling bias will be lower in the smaller populations. However, with increased human population size, we would also expect a greater amount of pathogen diversity, due to increased within-region transmission and greater numbers of importation events. Supporting this, we see in Figure 3 that the transmission lineages in the USA and Europe are estimated to have emerged earlier than those in Norway or Victoria, indicative of a greater amount of standing population diversity. Therefore, the reason why convergence is observed when up-sampling from smaller populations may be that the vast majority of isolates will sit within a small part of the tree, whilst isolates from a larger, more diverse population will be placed all across the tree, so convergence will never be observed. In effect, it is unknown whether increasing the sample size of the USA and Europe to be truly representative of their respective N. gonorrhoeae populations would ever result in convergence between the two methods of up-sampling.
Simulations could be used to determine whether this method is sensitive to sampling bias or to population diversity.<br /> • In Figure 5, a significant difference in transmission lineage size was only found between male-dominated and mixed lineages in Norway and not Victoria. Therefore, the conclusion that sex distribution within transmission networks affects the size of transmission lineages is not supported by the data, and the observed difference could also be due to geographical and other demographic differences between the datasets which were not accounted for.
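
The sampling argument in the first weakness can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that a recent internal node is only represented in a subsample when two specific closely related genomes are both drawn, and that subsamples are uniform random draws without replacement. The function names and the replicate counts are hypothetical: the number of down-sampling replicates is not stated in this review, and the calculation does not model how Gubbins itself behaves.

```python
# Illustrative sketch only: how often does a specific pair of closely
# related genomes co-occur in a random subsample of 500 out of 9,732?
# Replicate counts below are assumed values, not taken from the manuscript.

def prob_pair_in_subsample(n_total: int, n_sub: int) -> float:
    """Probability that two specific genomes are both drawn in one
    uniform random subsample of size n_sub, without replacement."""
    if not 2 <= n_sub <= n_total:
        raise ValueError("need 2 <= n_sub <= n_total")
    return (n_sub * (n_sub - 1)) / (n_total * (n_total - 1))


def prob_pair_seen_at_least_once(n_total: int, n_sub: int, n_reps: int) -> float:
    """Probability that the pair co-occurs in at least one of n_reps
    independent subsamples."""
    p = prob_pair_in_subsample(n_total, n_sub)
    return 1.0 - (1.0 - p) ** n_reps


if __name__ == "__main__":
    n_total, n_sub = 9732, 500  # dataset and subsample sizes from the review
    print(f"single subsample: {prob_pair_in_subsample(n_total, n_sub):.5f}")
    for n_reps in (10, 100, 1000):  # hypothetical replicate counts
        p_any = prob_pair_seen_at_least_once(n_total, n_sub, n_reps)
        print(f"{n_reps} replicates: {p_any:.3f}")
```

Under these assumptions, the per-subsample probability is well below 1%, so a given recent node would only be expected to appear after a large number of replicates, whereas deep nodes spanning many genomes are represented in essentially every subsample.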