Rapid Replacement of Acinetobacter baumannii Strains Accompanied by Changes in Lipooligosaccharide Loci and Resistance Gene Repertoire.

Multidrug-resistant (MDR) A. baumannii is a difficult-to-treat health care-associated pathogen. Knowing the resistance genes present in isolates causing infection aids in empirical treatment selection. Furthermore, knowledge of the genetic background can assist in tracking patterns of transmission to limit the spread of infections in hospitals. The appearance of a new genetic background in A. baumannii strains with a different set of resistance genes and cell surface structures suggests that strong selective pressures exist, even in highly MDR pathogens. Because the new strains have levels of antimicrobial resistance similar to those of the strains that were displaced, we hypothesize that other features, including host colonization and infection, may confer additional selective advantages and contribute to their increased prevalence.

KEYWORDS Acinetobacter, antibiotic resistance, genome analysis

Acinetobacter baumannii is generally considered to be a pathogen of the hospital environment, as community-acquired cases are rare in the United States (1-3). Before the early 2000s, A. baumannii infections were uncommon but readily treatable. Antimicrobial resistance has increased dramatically during the last 2 decades, with a majority of clinical isolates now being multidrug resistant (MDR), based on resistance to at least three classes of antibiotics (1), and extremely drug-resistant (XDR) isolates have been reported, which cannot be killed by any FDA-approved antimicrobial agents (4). As a result, considerable attention is focused upon reducing hospital-acquired A. baumannii infections and limiting patient-to-patient transmission (5).
Two large complex hospital systems (denoted hospital system A and hospital system B) serve the same metropolitan area in northeastern Ohio, and their main tertiary care hospitals are located within 1 mile of each other in Cleveland. Their physician networks are largely nonoverlapping, and there is limited patient flow between the two systems due to constraints of provider networks. We previously showed considerable genomic diversity of A. baumannii strains isolated in hospital system A from 2007 to 2008 (6). Most strains belonged to four major clades (clades A to D) that are subsets of the GC2 (global clone 2) lineage (7,8), and a fifth clade belonged to multilocus sequence type (MLST) group ST79 (clade E). In the context of examining genome changes in sets of multiple isolates from individual patients obtained from 2007 to 2013, we found further support for five major clades, each of which exhibits significant strain-level variation in gene content (9).
In the present study, we expanded the genome survey from hospital system A to cover nearly a decade, from 2007 to 2016. This longitudinal sampling allowed us to examine how A. baumannii populations diverged, revealing a dramatic change with time in the predominant genetic background of strains causing infections. In addition, we sequenced 70 A. baumannii strains isolated at hospital system B between 2012 and 2015. We sought to address whether these two hospital systems had distinct A. baumannii populations and, if not, to what extent the populations in the two health systems overlapped. By comparing strains from the two hospital systems, we tested the hypothesis that the hospital is the primary site of pathogen transmission. If true, we would expect closer relationships among isolates within each hospital system than among those between systems. Remarkably, we found an extensive overlap of strains in the two systems, with limited clustering of very closely related strains, suggesting a reservoir of strains outside these hospital environments.

RESULTS
To place the metropolitan Cleveland isolates into a broader phylogenetic perspective, we combined 371 hospital system A and 70 hospital system B genome sequences with 354 additional genome sequences available in GenBank (see Materials and Methods). Phylogenetic analysis was performed with both the full set of 795 genomes and the 711 genomes that correspond to GC2 isolates, since these were the most common among the Cleveland strains (Fig. 1; see also Fig. S1 and Fig. S2 in the supplemental material). The GC2 isolates clustered into five major clades, four of which corresponded to those described previously (6). More than 90% of the Cleveland isolates belonged to GC2. Non-GC2 genomes were more diverse based on gene content and the extent of homologous recombination and tended to have fewer antibiotic resistance genes (Fig. S1).
Emergence of a new GC2 group, clade F. All five major GC2 clades belong to MLST group ST2 (Pasteur scheme [10]). The largest clade in the tree, termed clade F, includes 173 Cleveland strains, representing 39% of the 441 isolates from the two hospital systems, and 250 genomes downloaded from GenBank, including 232 from a comparable health care system in Baltimore, MD (11) (Fig. S2). Clade F corresponds to two strain types in the Oxford (12) MLST scheme: ST281 and a single-locus variant, ST349. The ST349 and ST281 genomes are in separate phylogenetic groups.
The Cleveland and Baltimore clade F isolates are located in distinct branches of the tree, with the exception of UH7007, which was isolated 4 years before clade F strains began to appear more broadly. The few isolates from Pittsburgh (13) and Chicago (14) are located outside the Cleveland group on the tree. Considerable diversity of gene content was observed among clade F isolates, particularly those from Baltimore (Fig. 1). A few strains were isolated from distant geographic locations, including HUMC1, isolated in 2009 in Los Angeles, CA (15); ABIsac_ColiS and ABIsac_ColiR, isolated in France in 2011 (16,17); and ORAB10, from Oregon in 2013 (18). Most clade F strains were isolated after 2011; however, this clade is likely much older than that based on the phylogenetic distances from other GC2 genomes and based on the variability in genome content, particularly among the Baltimore isolates. An average of 690 ± 110 single-nucleotide variant (SNV) differences separate core genome regions of clade F genomes from those in clades A to D, compared to generally fewer than 100 SNV differences within clades A to D.
Temporal patterns of A. baumannii. Before 2013, the majority of strains from hospital system A originated from five distinct clades (Fig. S1A to E). The frequency of clade F began to rise in 2014 (Fig. 2). By the end of 2015, most of the strains belonged to clade F. This demonstrates a remarkable decrease in overall genetic diversity and a major restructuring of the population.
Extensive overlap was evident in the phylogenetic placement for hospital system A and B strains isolated between 2012 and 2016. Each of the major clades contained genomes from isolates from both hospital systems. Approximately 80% of the strains isolated in 2014 to 2016 at both systems belonged to clade F. The average number of SNVs between clade F strains did not differ either within the hospital system genome sets or between the two sets (15 ± 7). The shared locations of insertion sequence (IS) elements were concordant with the SNV tree in supporting intermixing of strains between the two hospital systems (data not shown).
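The within-system versus between-system SNV comparison described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study (SNVs were called with Parsnp); it assumes that each strain has already been reduced to an aligned string of core-genome positions, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from statistics import mean

def snv_distance(seq_a, seq_b):
    """Count aligned positions that differ, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1
        for x, y in zip(seq_a, seq_b)
        if x != y and x not in skip and y not in skip
    )

def split_pairwise_distances(aligned, system_of):
    """Return (within, between) lists of pairwise SNV counts.

    aligned   : dict mapping strain name -> aligned core-genome string
    system_of : dict mapping strain name -> hospital system label
    """
    within, between = [], []
    for s1, s2 in combinations(sorted(aligned), 2):
        d = snv_distance(aligned[s1], aligned[s2])
        if system_of[s1] == system_of[s2]:
            within.append(d)
        else:
            between.append(d)
    return within, between

if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    aligned = {
        "A1": "ACGTACGT",
        "A2": "ACGTACGA",
        "B1": "ACGTACGT",
        "B2": "ACCTACGT",
    }
    system_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
    within, between = split_pairwise_distances(aligned, system_of)
    print("mean within-system SNVs:", mean(within))
    print("mean between-system SNVs:", mean(between))
```

Similar means for the two lists would be consistent with intermixing of strains between systems, whereas a clearly smaller within-system mean would point toward hospital-specific transmission.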
Clade F strains have very limited hospital-specific clustering. The small number of SNV differences between strains makes precise inference of phylogenetic relationships difficult. However, two broad patterns were apparent in the distribution of Cleveland strains across the phylogenetic tree. First, extensive intermixing of genomes between the two hospital systems was revealed, indicating that strains diverged before entry into the hospital systems. This is most apparent in clade F, which was expected because it had the most isolates collected over similar time periods in both hospital systems. Second, there were small clusters of strains adjacent to one another on the tree that may represent local outbreaks within one hospital system. Sufficient information about patient locations was not available to support a more rigorous analysis of epidemiological patterns.
Antibiotic resistance gene features of clade F strains. Overall, the antimicrobial resistance phenotypes of Cleveland strains from clade F are similar to those from other GC2 strains (Table 1). In contrast, the antimicrobial resistance gene content of clade F strains differed in many significant ways from that of other GC2 isolates. The Tn1548 genomic island carries the armA gene, which confers resistance to most aminoglycoside antibiotics and is common in isolates from clades A and B. Tn1548 was absent from clade F strains, and the genome sequence where it is typically inserted adjacent to the homolog of ACICU_02399 was intact in these strains, with no evidence of an IS26 or transposon insertion at that location. Aminoglycoside resistance in clade F strains from Cleveland is likely to be mediated by genes carried in one of two alternative integron structures, as described below, and by a plasmid-borne aphA6 gene.   (19); however, it appears that the plasmid was lost from several isolates distributed throughout the clade F group. Upregulation of the endogenous OXA-51 family carbapenemase gene blaOXA-82 by insertion of an upstream ISAba1 element that carries a strong outward-facing promoter can also lead to high-level resistance to imipenem and meropenem (20,21). All but two clade F strains had the ISAba1-blaOXA-82 structure, which may make the plasmid-borne blaOXA-23 gene redundant in certain growth environments. Most clade B isolates carried the ISAba1-blaOXA structure, along with several strains in clades C and D. No clear difference was seen in the patterns of susceptibility to carbapenem drugs for isolates with ISAba1-blaOXA-82 compared with isolates with blaOXA-23 or both genes. Fifteen clade F strains carry Tn2006, which harbors blaOXA-23 flanked by ISAba1 elements (22,23). The genomes with Tn2006 are located at several distinct locations on the phylogenetic tree, suggesting multiple independent acquisitions.
The pACICU2-like plasmid that was the most common in clade F isolates from Cleveland is closely related to pABUH1 that carries both the blaOXA-23 gene in Tn2008 and aphA6 in TnaphA6 (7,24). In genomes from Cleveland isolates, the absence of blaOXA-23 generally correlated with loss of the plasmid. Loss of the plasmid did not always result in loss of TnaphA6, however, suggesting that this element can mobilize to other genomic locations. Clade F isolates have about a dozen copies of ISAba125, making it impossible to precisely discern the location(s) of the TnaphA6 transposon in these draft genome sequences. There was more variability in plasmid sequences represented in the assemblies of genomes among the Baltimore isolates, and Tn2008 and TnaphA6 could not be linked to the plasmid in many cases.
Class 1 integrons carrying a range of antibiotic resistance genes are common in MDR A. baumannii (25). Clade F strains from Cleveland carry one of two distinct class 1 integrons. One integron carries a resistance gene cassette that has not been previously described in A. baumannii (Fig. 3A and B). This cassette includes the aadB gene encoding aminoglycoside (2′)-adenyltransferase ANT(2″)-Ia and aadA2 encoding streptomycin 3″-adenylyltransferase. A search by BLAST of the NCBI database revealed that a nearly identical integron sequence was present in Proteus mirabilis strain AR_0059 (GenBank accession number CP020052.1); several strains of Escherichia coli isolated from cow, pig, and chicken; and Salmonella enterica subsp. enterica (e.g., GenBank accession number CP014658.1). It appears that the integron was mobilized by a pair of IS26 elements in direct-repeat orientation. In strain HUMC1, an IS26-flanked Tn6020 element carrying aphA1 was located immediately adjacent to the integron. However, Tn6020 was not found in any other clade F strains. Previously, the aadB gene has been found most often associated with the small plasmid pRAY in A. baumannii (7). A few clade F strains have aadB in a different genomic context. Strains without aadB were more likely to be susceptible to tobramycin (data not shown).
The second integron is closely related to a sequence from TCDC-AB0715 (26) (GenBank accession number CP002522.2), except that the qacEΔ gene is interrupted by an IS26 element (Fig. 3C). This structure includes the aacC1 and aadA1 genes, encoding aminoglycoside N(3′)-acetyltransferase and streptomycin 3″-adenylyltransferase, respectively. CCF48, CCF75, ABUH628, and HUMC1 appear to contain both integrons, whereas all other clade F genomes contain only one of them. The novel integron structure (Fig. 3B) is largely restricted to Cleveland isolates, whereas the TCDC-AB0715-like integron was present in more than half of the genomes of Baltimore isolates.
Most MDR GC2 A. baumannii genomes described to date have an AbaGRI1-type resistance island, also termed Tn6166, inserted into the comM gene (27-29). In contrast, most clade F strains lack an insertion at that location. CCF66 and CCF31 have an ~20-kb island that is very similar to Tn6166 and includes strAB and a tetA gene (30). ABUH773 and eight other strains have a related structure at this location that lacks about 8 kb at the 3′ end of the element, including the strAB genes.
In addition to resistance to antibiotic treatment, A. baumannii is subjected to selective pressure in the context of competition for resources with its host, immune evasion, and survival in the hospital environment. Genes involved in the host response and virulence include those related to motility, cell surface glycosylation, attachment to biotic and abiotic surfaces, micronutrient acquisition, and secretion systems (3). Genes related to these traits are also variable among clade F strains.
Two genetic loci carry genes whose products determine the structure of the lipooligosaccharide (LOS) on the cell surface (31,32). Clade F strains differed from other GC2 strains in carrying the KL22 or KL13 variant at the K locus and an OCL3-like variant at the OC locus (Fig. 1). Fourteen clade F genomes have an intact OCL3 locus, but all other clade F strains carried one of several variants. An ISAba13 element interrupted the last gene in the cluster (gtrOC11), which encodes a glycosyltransferase (Fig. 4). Many clade F strains had additional modifications to the OCL3 locus, including deletions (adjacent to the ISAba13 insertion site) and insertions of ISAba125 and ISAba17. Kenyon et al. showed a similar pattern of IS element disruption of the OCL1 locus in GC2 isolates from Australian hospitals that resulted in truncated LOS structures (33). KL13 strains belong to ST349, while KL22 strains belong to ST281 in the Oxford MLST scheme.
Most A. baumannii strains express a type IV secretion system comprised of a complex cell surface pilus structure. The structure is encoded by multiple genes, and it has been noted that there is variation in the critical PilA structural pilin protein (34). Most GC2 strains carried a pilA gene homologous to ACICU_03380. Clade
We also used a comprehensive pangenome analysis of the Cleveland isolates to identify broader patterns of genes that are shared and distinct among the clades. A neighbor-joining tree created using a distance matrix based on the number of shared genes largely supported the clades identified in the SNV tree (Fig. S3A). Fifty-three genes were specific to clade F, and 19 genes were present in all other clades but were missing from clade F. Most of these genes were annotated as encoding hypothetical proteins and may represent poorly annotated prophage. Within clade F, there was little correlation between the order of strains in the SNV tree and the shared-genes tree (Fig. S3B), indicating that there were recent changes in gene content, such as independent losses of the pACICU2-like plasmid and IS26-associated deletions. The clade A genomes comprised several subgroups in the pangenome analysis. One of those clade A subgroups lacked the pACICU2-like plasmid and the blaOXA-23 gene. Clade D genomes had 109 clade-specific genes and lacked 51 genes that were present in all other clades. Clade D genomes encoded the KL42 capsular K locus cassette, and several of the differentially present genes were within this locus. Several metabolic enzymes were also encoded among the clade D-specific genes.
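The clade-specific gene counts reported above can be derived from a gene presence/absence table with simple set operations. The following Python sketch is one possible way to do so, not the authors' implementation (the study used PanOCT). It assumes a strict definition: a "clade-specific" gene is present in every genome of the clade and in no genome outside it, and a "missing" gene is present in every genome outside the clade and in none inside it. The function name and toy data are hypothetical.

```python
def clade_specific_genes(presence, clade_of, clade):
    """Return (specific, missing) gene sets for one clade.

    presence : dict mapping genome name -> set of gene cluster identifiers
    clade_of : dict mapping genome name -> clade label
    clade    : label of the clade of interest
    """
    inside = [set(g) for name, g in presence.items() if clade_of[name] == clade]
    outside = [set(g) for name, g in presence.items() if clade_of[name] != clade]
    if not inside or not outside:
        raise ValueError("need at least one genome inside and outside the clade")

    in_all_inside = set.intersection(*inside)    # shared by every clade member
    in_any_inside = set.union(*inside)           # seen in at least one member
    in_all_outside = set.intersection(*outside)  # shared by every other genome
    in_any_outside = set.union(*outside)         # seen in at least one other genome

    specific = in_all_inside - in_any_outside
    missing = in_all_outside - in_any_inside
    return specific, missing

if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    presence = {
        "F1": {"g1", "g2", "g5"},
        "F2": {"g1", "g2"},
        "A1": {"g1", "g3", "g4"},
        "B1": {"g1", "g3"},
    }
    clade_of = {"F1": "F", "F2": "F", "A1": "A", "B1": "B"}
    specific, missing = clade_specific_genes(presence, clade_of, "F")
    print("clade-specific:", sorted(specific))
    print("missing from clade:", sorted(missing))
```

A looser threshold (for example, present in most rather than all clade members) would tolerate draft-assembly gaps, at the cost of a less clear-cut definition.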

DISCUSSION
Long-term population dynamics reflect the cumulative impact of dispersal, mutation, introgression of new genetic backgrounds, diversification through lateral gene transfer or homologous recombination, and selection in the local environment. Here we show that the predominant genetic background of A. baumannii strains shifted in a similar way during the course of several months, resulting in very closely related strains becoming predominant in two different hospital systems serving the same geographical area. Interleaving of the genomes from the two hospital systems in the phylogenetic tree and analysis of shared genomic features suggest multiple introductions of closely related but distinct A. baumannii strains with small local expansions.
The emergence and replacement of previous genetic backgrounds in both systems spanning the same time period could be due to several factors that may have contributed to shaping the set of strains causing infections. First, a common source of new strain backgrounds might exist in extended care facilities that exchange patients with both hospital systems (35,36). The involvement of asymptomatic carriage and transmission of the Gram-negative pathogen Klebsiella pneumoniae was recently hypothesized as an explanation of genome variation observed in a large survey of isolates from multiple hospitals (37). A reservoir for A. baumannii outside health care settings has never been definitively demonstrated, and community-acquired infections are rare in the United States (38).
Second, the pattern of strains in Cleveland could reflect a broader shift in A. baumannii strains in the United States. Systematic genome-based surveys of A. baumannii strains do not presently exist in the United States, and so it is not possible to determine how representative the isolates from these Cleveland hospitals are. As shown in Fig. 1, isolates from Chicago, Baltimore, and Pittsburgh are all located in separate phylogenetic branches from the Cleveland isolates, suggesting geographical separation of the Cleveland isolates. There are, however, three non-Cleveland strains that cluster with the genomes from Cleveland in the core genome phylogeny: two strains isolated from the same patient in France (17) and one from Oregon. It seems likely that as additional isolates are sampled, there will be more examples of genomes that share a recent common ancestor with the Cleveland strains. This study along with those described above show that SNVs in the nonrecombinant core genome are only one view of shared genetic content. A full picture of relationships between strains needs to encompass SNVs, mobile elements, and homologous recombination events. To highlight this, the two genomes from France have only 3 ISAba1 copies, compared to 22 to 29 for the Cleveland isolates.
Finally, the clade F strains were not simply added to the population in the Cleveland hospital systems; over the course of about a year, they displaced the previously dominant lineages. The overall levels of antibiotic resistance are similar in strains from clade F and from other lineages (Table 1). Thus, while it is possible that antibiotic pressure contributed to the rise of clade F, we also hypothesize that these strains have some advantage in either survival or transmissibility. Harding et al. have summarized the current knowledge of virulence mechanisms in A. baumannii (3). Many of the mechanisms that they highlight are conserved across all A. baumannii genomes, including the type I and II secretion systems; the Ata protein, which is important for adherence to host surfaces; and the type IV pilus. However, several of the genes highlighted in the review by Harding et al. are absent or variable in the clade F genomes. The presence of a putative alternative major pilus protein, PilA, in the clade F genomes may indicate a functional difference of this structure that is important for twitching motility and adherence to epithelial cells (39,40). The three-dimensional structures of three variant PilA proteins were recently described (34), providing insight into functional differences among these proteins. The pilA allele in clade F genomes is distinct from all of those, with <80% amino acid identity of the encoded protein.
Lipopolysaccharide, lipooligosaccharide, and capsule structures contribute to evasion of innate immune responses in several pathogens, including Escherichia coli (41). Capsule replacement has also been suggested to have contributed to the success of the ST258 lineage of Klebsiella pneumoniae (42). We also found an enrichment in variants affecting cell surface structures among longitudinally sampled A. baumannii strains from individual patients (9). Most GC2 A. baumannii strains carry the OCL1 locus, whereas clade F strains carry several variants of the OCL3 locus. These variants are characterized by multiple intragenic IS element insertions and deletions, suggesting the loss of some glycosyltransferase activities. Multiple novel variants of the phosphoglycosyltransferase gene pglC that catalyzes the initial step of capsule synthesis (43) were also found among the GC2 genomes. The functional impact of variation at these loci bears further investigation.
Among genome features of the clade F strains, relatively few are unique to the Cleveland isolates. The HUMC1-type integron (Fig. 3B) is restricted to HUMC1 and the Cleveland isolates, among the genomes that have so far been described. This integron carries aadB and aadA2, which confer aminoglycoside resistance, and while this structure is novel, those genes are present in different genetic contexts in other A. baumannii genomes. The alternative pilA allele is shared by Cleveland and non-Cleveland clade F genomes, as are the OC and K locus variants.
A number of studies have used genomic approaches to enhance epidemiological studies of A. baumannii in hospital environments (44-46). We previously found that almost one-third of patients with multiple A. baumannii isolates were infected by more than one genetically distinct strain (9). Others have seen a similar diversity of cocirculating strains. Schultz et al. found multiple distinct lineages of carbapenem-resistant A. baumannii in a Vietnamese hospital and documented the transposon-mediated spread of a blaOXA-23 gene to new genetic backgrounds (47). These investigators also observed switching of the capsule locus genes, likely by homologous recombination in flanking regions. A yearlong survey of a German intensive care unit found that half of the A. baumannii isolates belonged to one of two clusters but that all others were genetically distinct from each other (48).
A somewhat different picture emerged in an analysis of 85 strains from Mexico (49). The Mexico strains had very few SNV differences compared to a broad set of reference genomes and thus appeared to be clonally related. However, they exhibited extensive variation in gene content, with scores of strain-specific genes and with few genomes highly similar in gene content. Chromosomal and plasmid-mediated mobile elements were a chief source of the variation in gene number, but the origins of the variable gene content could not be determined. Feng et al. characterized A. baumannii isolates from East Asia and also found strains that were very closely related in their core gene phylogeny despite being isolated in hospitals that are more than 2,500 km apart (50). Even closely related genomes had differences in mobile elements related to antibiotic resistance. A common theme of these studies and the current survey of A. baumannii genomes from Cleveland is that A. baumannii genomes exhibit a high degree of genomic variation and that apparently closely related strains can nonetheless differ in genome composition, leading to phenotypic differences.
Knowledge regarding mechanisms by which MDR pathogens are transmitted among and within hospitals is important for understanding how new resistance genes or virulent lineages disperse from local to global scales. The rapid shift in genetic background described here highlights the importance of genetic monitoring of strains causing infection as part of an ongoing effort to optimize treatment and prevention strategies. Extensive genic diversity is a hallmark of A. baumannii isolates. While the antimicrobial resistance genotypes are certainly an important component of this variation, the breadth of functions represented among genes that are acquired and lost suggests that other selective forces, such as host interaction and survival, also play a significant role in shaping the genome evolution of A. baumannii and that these areas are worthy of further exploration in developing an improved understanding of the basis for the clinical impact of this pathogen.

MATERIALS AND METHODS
Strains. Clinical isolates of A. baumannii were archived by the clinical microbiology departments under protocols approved by institutional review boards at each institution. Isolates from hospital system A were collected between 2007 and 2016 and stored at −80°C. Samples were randomly selected for sequencing to span the time frame of collection. In cases where more than one strain was isolated from a patient, only one isolate was included in the analysis presented here. Isolates from hospital system B were obtained from bloodstream infections during 2012 and 2016. Only meropenem-resistant strains were selected; these represented 51% of all A. baumannii bloodstream infections during this period. A total of 371 strains from hospital system A and 70 from hospital system B were included. Genome sequences for 344 of those strains were determined in this study, and the remainder have been reported previously (6,9). Accession numbers for all genomes are available in Data Set S1 in the supplemental material.
Genome sequencing, annotation, and comparative analysis. DNA was isolated using mechanical lysis with glass beads (51). Libraries were prepared for sequencing using NexteraXT kits and sequenced on an Illumina NextSeq 500 DNA sequencer as paired-end, 150-base reads. Read sets were assembled using Velvet (52) and submitted for annotation to the Prokaryotic Genome Annotation Pipeline (PGAP) at the National Center for Biotechnology Information (53). Antibiotic resistance genes were identified using the Comprehensive Antibiotic Resistance Database (CARD) (http://arpcard.mcmaster.ca) (54). In silico MLST typing was performed using LOCUST (55). The locations of insertion sequence elements in draft genomes were mapped using ISseeker (56). ISseeker finds partial IS matches at contig edges and maps the flanking sequence to a reference genome, in this case ACICU (GenBank accession number NC_010611.1). To define the structures of integrons, we began by identifying antibiotic resistance genes of interest (using a tBLASTn search with protein sequence queries of the draft genomes). Contigs containing these genes were then compared to a collection of previously described genomic islands and other mobile elements from A. baumannii, with particular attention being paid to well-characterized elements in reported genomes. Matching elements were then aligned to the draft genome, and an accounting was made of regions of identity and of difference. A set of nucleotide query sequences was then used to search the full set of genome sequences to identify all genomes with highly similar content. This back-and-forth strategy was used to refine the interpretation of the draft genome content. OC and K loci were mapped by BLAST of contigs from each genome to a database comprised of previously reported sequences (31,32). An OC or K locus was considered a match if >98% of the locus was represented by sequences in the query genome.
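The >98% coverage criterion for OC and K locus matches can be sketched as an interval-merging calculation. The Python example below is an illustration under stated assumptions rather than the procedure actually used: it assumes that BLAST hits have already been parsed into (start, end) coordinates on the reference locus (1-based, inclusive), and the function names, the default threshold argument, and the example coordinates are hypothetical.

```python
def covered_fraction(locus_length, hits):
    """Fraction of a reference locus covered by at least one hit.

    locus_length : length of the reference locus in bases
    hits         : iterable of (start, end) pairs, 1-based inclusive,
                   in either orientation
    """
    if locus_length <= 0:
        raise ValueError("locus_length must be positive")

    covered = 0
    cur_start = cur_end = None
    # Normalize orientation, then sweep the intervals in order of start position.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        start, end = max(start, 1), min(end, locus_length)  # clip to the locus
        if start > end:
            continue
        if cur_end is None or start > cur_end + 1:
            # Gap before this hit: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent hit: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / locus_length

def is_locus_match(locus_length, hits, threshold=0.98):
    """Apply the 'more than 98% of the locus represented' rule."""
    return covered_fraction(locus_length, hits) > threshold

if __name__ == "__main__":
    # Hypothetical hit coordinates on a 1,000-base locus.
    print(is_locus_match(1000, [(1, 500), (400, 990)]))
    print(is_locus_match(1000, [(1, 500), (600, 1000)]))
```

Merging overlapping hits before summing avoids counting the same locus positions twice when a locus is split across several contigs.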
Pangenome analysis was performed using PanOCT (57) with the Cleveland genomes plus the 70 complete genomes from GenBank. Pairwise comparisons between the nucleotide sequences of all protein-coding genes were performed using an iterative approach to building the distance matrix. Fourteen groups were used in the first round of distance matrix construction, selected based on proximity on the phylogenetic tree. A gene presence/absence matrix was constructed, and the matrix was used to construct a neighbor-joining tree based on shared gene content.
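As a rough illustration of how a gene presence/absence matrix can be turned into a distance matrix suitable for neighbor joining, the Python sketch below computes pairwise distances from shared gene content. The exact distance used in the study is not specified here, so the Jaccard distance (1 minus shared genes divided by the union of genes) is an assumption, as are the function name and toy data. Tree construction itself is left to a dedicated phylogenetics package.

```python
def shared_gene_distances(presence):
    """Pairwise Jaccard distances from gene presence/absence data.

    presence : dict mapping genome name -> iterable of gene cluster identifiers
    Returns (names, dist), where dist[a][b] is the distance between genomes a and b.
    """
    genes = {name: set(g) for name, g in presence.items()}
    names = sorted(genes)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            union = genes[a] | genes[b]
            shared = genes[a] & genes[b]
            # Two empty gene sets are treated as identical (distance 0).
            d = 1.0 - len(shared) / len(union) if union else 0.0
            dist[a][b] = dist[b][a] = d
    return names, dist

if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    presence = {
        "genome1": {"a", "b", "c"},
        "genome2": {"a", "b"},
        "genome3": {"c", "d"},
    }
    names, dist = shared_gene_distances(presence)
    for a in names:
        print(a, [round(dist[a][b], 3) for b in names])
```

Because the distance depends only on gene content, recent plasmid losses or IS-associated deletions can move a genome in this matrix without any change in its SNV-based placement.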
Phylogeny. Genome sequences for 70 strains, representing complete genome sequences of all strains available in GenBank as of 28 June 2017, were included in the analysis to provide phylogenetic context for the Cleveland strains. In addition, all genomes that fall into the same clade as HUMC1 in the GenBank genomes division were downloaded and included in certain analyses of clade F to provide as broad a context as possible for the Cleveland genomes (250 additional genomes after removing 10 assemblies with more than 300 contigs). Single-nucleotide variants in the genome sequences were called using Parsnp (58), using the genome sequence of strain HUMC1 (15) as the reference. Three different genome groups were analyzed: (i) all the genomes described above, (ii) all GC2 genomes, and (iii) all clade F genomes. Parsnp calls SNVs in core genome regions, so the number of informative variants increases as the number of genes shared across the data set increases and as the genome diversity increases. To identify regions of recombination, SNVs were imputed onto the HUMC1 genome sequences using the FastaAlternateReferenceMaker feature of the Genome Analysis Toolkit (GATK) (59) to create the input sequences for Gubbins (60). Gubbins iteratively identifies regions of recombination based on the density of SNVs and simultaneously constructs a maximum likelihood tree based on the filtered polymorphisms using RAxML (61). For the all-GC2 tree in Fig. 1, 19,551 SNVs identified by Parsnp were reduced to 8,868 in nonrecombining regions by Gubbins. For the clade F tree in Fig. S2, 15,725 SNVs identified by Parsnp were reduced to 9,090 in nonrecombining regions.
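The assembly-quality filter mentioned above (removal of assemblies with more than 300 contigs) can be expressed in a few lines. The Python sketch below assumes that each assembly is available as FASTA text with one header line per contig; the function names and the toy records are hypothetical, and the study's actual filtering procedure is not described beyond the contig threshold.

```python
def count_contigs(fasta_lines):
    """Count FASTA records (header lines starting with '>')."""
    return sum(1 for line in fasta_lines if line.startswith(">"))

def filter_assemblies(assemblies, max_contigs=300):
    """Keep assemblies with at most max_contigs contigs.

    assemblies : dict mapping assembly name -> iterable of FASTA lines
    Returns a sorted list of retained assembly names.
    """
    return sorted(
        name
        for name, lines in assemblies.items()
        if count_contigs(lines) <= max_contigs
    )

if __name__ == "__main__":
    # Toy assemblies, invented purely for illustration.
    small = [">contig1", "ACGT", ">contig2", "GGCC"]
    fragmented = []
    for i in range(301):
        fragmented.extend([f">contig{i}", "ACGT"])
    kept = filter_assemblies({"small": small, "fragmented": fragmented})
    print(kept)
```

In practice the lines could come from an open file handle, since the counting function only iterates over its input once.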
Data availability. Data Set S1 in the supplemental material contains information about each strain, including GenBank accession numbers and antimicrobial susceptibility patterns, when available. Newly sequenced isolates have been registered at the NCBI under BioProject accession numbers PRJNA352251, PRJNA316619, and PRJNA271775.