Editor's Pick Research Article | Ecological and Evolutionary Science

Diversification of Colonization Factors in a Multidrug-Resistant Escherichia coli Lineage Evolving under Negative Frequency-Dependent Selection

Alan McNally, Teemu Kallonen, Christopher Connor, Khalil Abudahab, David M. Aanensen, Carolyne Horner, Sharon J. Peacock, Julian Parkhill, Nicholas J. Croucher, Jukka Corander
Julian E. Davies, Editor
Alan McNally
(a) Institute of Microbiology and Infection, University of Birmingham, Birmingham, United Kingdom
Teemu Kallonen
(b) Infection Genomics, Wellcome Sanger Institute, Cambridge, United Kingdom
(c) Department of Biostatistics, University of Oslo, Oslo, Norway
Christopher Connor
(a) Institute of Microbiology and Infection, University of Birmingham, Birmingham, United Kingdom
Khalil Abudahab
(b) Infection Genomics, Wellcome Sanger Institute, Cambridge, United Kingdom
David M. Aanensen
(b) Infection Genomics, Wellcome Sanger Institute, Cambridge, United Kingdom
Carolyne Horner
(d) British Society for Antimicrobial Chemotherapy, Birmingham, United Kingdom
Sharon J. Peacock
(b) Infection Genomics, Wellcome Sanger Institute, Cambridge, United Kingdom
(e) Department of Medicine, University of Cambridge, Cambridge, United Kingdom
(f) London School of Hygiene and Tropical Medicine, London, United Kingdom
Julian Parkhill
(b) Infection Genomics, Wellcome Sanger Institute, Cambridge, United Kingdom
Nicholas J. Croucher
(g) Faculty of Medicine, School of Public Health, Imperial College, London, United Kingdom
Jukka Corander
(b) Infection Genomics, Wellcome Sanger Institute, Cambridge, United Kingdom
(c) Department of Biostatistics, University of Oslo, Oslo, Norway
(h) Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
Julian E. Davies
University of British Columbia
Roles: Editor
Jesse Shapiro
University of Montreal
Roles: Solicited external reviewer
Rene Niehus
University of Oxford
Roles: Solicited external reviewer
DOI: 10.1128/mBio.00644-19

Figures

  • FIG 1

    Summarizing the population dynamics of the British Society for Antimicrobial Chemotherapy extraintestinal pathogenic E. coli collection. These isolates were collected from bacteremia cases around the United Kingdom between 2001 and 2011. (A) Conservation of gene frequencies. Each point corresponds to one of the 6,824 genes identified by Roary in the BSAC collection with mean frequencies between 0.05 and 0.95 across all years. Error bars indicate the full range observed across annual samples. (B) Correlation of gene frequencies with those observed in 2001. This shows the changing correlation of gene frequencies, calculated by both the Pearson and Spearman methods, in each year relative to those observed in 2001. Both measures indicate a divergence in gene frequencies as ST69 and ST131 emerge, until 2010, at which point there is a reversion to the frequencies seen in the original population. (C) Emergence of ST69 (SC9, in orange) and ST131 (SC13, red). The frequencies of the subclades of ST131 are shown by the red dashed lines.

  • FIG 2

    Simulations of changes in the BSAC extraintestinal pathogenic E. coli population evolving under multilocus NFDS. Genomic data (top) and median frequencies (middle) observed from 100 simulations run with the best-matching parameter set identified by fitting the model with BOLFI. This corresponded to σ_f = 0.029, r = 0.179, m = 0.001, p_f = 0.425, and σ_w = 0.0048. Each column corresponds to a sequence cluster identified by hierBAPS (see Materials and Methods) and is annotated with the predominant sequence type with which it is associated. Each bar indicates the frequency of the sequence cluster in consecutive time periods, from left to right. The bars are colored according to the number of antibiotic resistance phenotypes associated with the isolates within the sequence cluster at different time points. (Bottom) The equivalent best fit in the absence of NFDS. Only sequence clusters reaching a frequency of at least 2.5% at one time point in the genomic sample are shown; the full results of the simulation, including measures of between-simulation variation, are shown in Fig. S3.

  • FIG 3

    (A) Maximum likelihood phylogeny of 862 E. coli ST131 strains. The phylogeny was inferred using RAxML with a GTR Gamma model of substitution, on an alignment of concatenated core CDS as determined by Roary. (B) PANINI plot of the accessory genome content of all 862 strains, based on t-SNE. The plot is a two-dimensional representation of the relatedness of the strains based on the presence/absence of accessory genes. The taxa are color coded by BAPS grouping (Table S1) and show clade A (green, BAPS-3), clade B (red, yellow, and purple, BAPS-2, -4, and -5, respectively), and clade C (blue, BAPS-1).

  • FIG 4

    Bar chart depicting functional classes of accessory genes differentially present in clade A and clade B/C E. coli ST131. Functional classes are based on GO classes as described in Materials and Methods. *, significant difference exists between clade A and clade C as determined by t test.

  • FIG 5

    Annotation of a maximum likelihood phylogeny of E. coli ST131, based on concatenated core CDS, with the presence of alternative alleles of 64 loci involved in anaerobic metabolism. Each blue box along the top of the tree annotation represents an individual anaerobic metabolism gene, and its presence in the ST131 population is indicated by a blue line. The inset is a bar chart displaying the proportions of the accessory pangenome that are occupied by genes involved in anaerobic metabolism for ST131 clade A, clade B, subsampled clade C versus clade A, and subsampled clade C versus clade B. P = 0.042 for clade C versus clade A and P = 0.086 for clade C versus clade B. Error bars represent standard errors of the means. Significance was determined using the median P value from chi-square tests performed on random subsamples of the C clade.

  • FIG 6

    Annotation of a maximum likelihood phylogeny of E. coli ST131, based on concatenated core CDS, with the presence of alternative alleles of loci involved in capsule production, cell division, iron acquisition, pili/fimbriae production, flagella, and MDR efflux pumps. Each box represents an individual gene, and its presence in the ST131 population is indicated by an appropriately colored line.

  • FIG 7

    Bar charts depicting the compositions of the accessory genomes of ST73 (A) and ST95 (B) compared to a repetitively sampled clade C ST131. The proportions of the accessory genome are plotted against manually assigned functional categories. Hypothetical proteins are responsible for the majority of the accessory pangenome and are omitted from the graphs. Error bars are standard errors of the means. Iterative chi-square tests were performed to assess significance, as described in Materials and Methods. *, P < 0.05; **, P < 0.01; ***, P < 0.001.
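
The captions for FIG 5 and FIG 7 describe significance testing by repeated chi-square tests on random subsamples of clade C, summarized by the median P value. The following is a minimal sketch of that idea, not the authors' code. It assumes each isolate is represented as a set of accessory gene names, that clade C is subsampled to the size of the smaller comparator clade, and that an uncorrected 2 × 2 Pearson chi-square test is used; the function names, the number of iterations, and the table layout are all hypothetical.

```python
import math
import random
from statistics import median


def chi2_p_2x2(a, b, c, d):
    """P value of a Pearson chi-square test (1 d.f., no continuity
    correction) on the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: treat as no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def pangenome_counts(isolates, category):
    """Count accessory pangenome genes inside and outside a functional category."""
    pangenome = set().union(*isolates)
    in_category = len(pangenome & category)
    return in_category, len(pangenome) - in_category


def median_subsampled_p(small_clade, large_clade, category, n_iter=100, seed=0):
    """Median P value over chi-square tests comparing the small clade with
    random, equally sized subsamples of the large clade (assumed scheme)."""
    rng = random.Random(seed)
    a, b = pangenome_counts(small_clade, category)
    p_values = []
    for _ in range(n_iter):
        subsample = rng.sample(large_clade, len(small_clade))
        c, d = pangenome_counts(subsample, category)
        p_values.append(chi2_p_2x2(a, b, c, d))
    return median(p_values)
```

Subsampling the larger clade to a common size keeps the pangenome sizes comparable, and reporting the median rather than a single draw reduces dependence on any one random subsample.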

Supplemental Material

  • Figures
  • FIG S2

    Distribution of accessory loci relative to the E. coli population structure, displayed using Phandango. (A) Core genome phylogeny encompassing 1,509 E. coli genomes from the BSAC and Cambridge University Hospitals collections, as described by Kallonen et al. (16). (B) Population structure analyses. Each column contains one row for each isolate in the phylogeny. The phylogroup information is reproduced from Kallonen et al. (16). The middle three columns show the different levels of clustering identified by hierBAPS. The rightmost column shows the sequence clusters used in this work, inferred from the hierBAPS analysis by identifying the level of clustering that corresponded to a clonal complex when linking isolates sharing identical alleles at five of the seven multilocus sequence typing loci. (C) Distributions of intermediate frequency loci. The 8,311 loci found between 5% and 95% frequency in the overall population are each represented by a column, cells of which are colored blue when the locus is present in the isolate defining the corresponding row. The vertical stripes indicate those loci that are stably associated with particular sequence clusters. Download FIG S2, PDF file, 14.5 MB.

    Copyright © 2019 McNally et al.

    This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

  • FIG S3

    (A) Correlations of gene frequencies in the BSAC collection over time. Each plot shows the frequencies of those genes, identified by Roary, that were found to be present at a mean frequency between 0.05 and 0.95 across the entire collection. These graphs show how the correlation between the starting frequencies, in 2001, and those in later years weakened until 2008 and then strengthened considerably in 2010 and 2011. (B) Full results of the NFDS simulations. These bar charts show the frequencies for all lineages from the 100 simulations performed using the optimal parameters identified within the BOLFI model fitting, which are summarized in Fig. 2. Each column again corresponds to a sequence cluster and is annotated according to the predominant sequence type. The five bars within each column represent the frequencies of the sequence cluster over subsequent time intervals: either that observed in the genomic samples for the top panel or the median frequency in simulations in the bottom panel. The error bars on the bottom panel indicate the interquartile ranges from the 100 simulations. The red bars correspond to the ST69 and ST131 sequence clusters that had a reproductive fitness benefit, r, over the rest of the population. Download FIG S3, PDF file, 0.3 MB.

  • TABLE S2

    (a) Parameter estimates (associated 95% credibility intervals in parentheses) derived from sequential Monte Carlo sampling of the BOLFI model fitted through 500 iterations of model simulation. (b) Tajima’s D measurements for anaerobic metabolism loci showing 3 or more allelic variants. Download Table S2, PDF file, 0.3 MB.

  • TABLE S1

    Collection of E. coli ST131 genomes. Download Table S1, CSV file, 0.1 MB.

  • FIG S4

    (A) Variation in intermediate gene frequencies, present at between 5% and 95% frequency in the 2001 BSAC population, within sequence clusters. This plot compares the output of two different measures of genome content for each sequence cluster represented by more than 10 isolates in the BSAC collection. Each point corresponding to such a grouping is labeled with the most common sequence type among the isolates it encompasses. ST131 clades A, B, and C are also included as separate groupings and are labeled as such. The horizontal axis shows the estimates of alpha from a Heaps' law model fit to a rarefaction curve calculated from the distribution of intermediate frequency loci identified by Roary. In this model, alpha values of <1 are associated with an open pangenome, whereas alpha values of >1 represent closed pangenomes. Lower values are associated with a greater diversity of genes per isolate in the grouping. The vertical axis shows the genomic fluidity, corresponding to the Jaccard distance calculated from a sample of pairwise comparisons, also calculated from the intermediate frequency loci: the point shows the mean value, and the error bars show the sample standard deviations. Higher values are associated with greater dissimilarity between each individual pair of isolates. Clade C is relatively close to the origin, along with a few sequence clusters. These appear to represent groupings in which isolates are radiating from a common ancestor: the openness of the pangenomes results from many genes being present at low frequencies, such that each isolate pair individually differs by relatively few loci. (B) Effect of changing the seeding level on simulation results. The top shows the genomic data to which simulations were fitted, and the bottom shows the median outputs from 100 simulations, as in Fig. 2. Download FIG S4, PDF file, 0.1 MB.

  • FIG S5

    (A) Effect of changing the carrying capacity on simulation results. The top row shows the genomic data to which simulations were fitted, as in Fig. 2. The rows beneath each show the median outputs from 100 simulations, as in Fig. 2. The second row reproduces the simulation with the best-fitting parameter set and a carrying capacity of 5 × 10^4. The third and fourth rows show the effect of decreasing the simulated carrying capacity. These demonstrate that the final prevalence of ST69 and ST131 increases with higher carrying capacities, but the effect is small over changes of more than 2 orders of magnitude in the simulated population size. (B) Effect of removing NFDS on the emergence of ST69 and ST131. The top row shows the genomic data to which simulations were fitted, as in Fig. 2. The rows beneath each show the median outputs from 100 simulations, as in Fig. 2. The second row reproduces the simulation with the best-fitting parameter set for a model including NFDS. The third row shows simulations in which r is kept at the same level, but there is no NFDS. The fourth row shows the best-fitting parameter set for a neutral model, in which r is adjusted to account for the lack of NFDS. Download FIG S5, PDF file, 0.1 MB.

  • FIG S1

    Histograms of the relative frequency of genes within the accessory genome of the entire E. coli ST131 population and of each separate clade. The x axes indicate the relative frequencies with which a gene appears, and the y axes indicate the numbers of accessory genes which appear at that given frequency. Download FIG S1, PDF file, 0.3 MB.

  • FIG S6

    Frequency dependence plot showing the frequency at which all E. coli ST131 accessory genes occur in strains isolated from the United Kingdom versus strains isolated from outside the United Kingdom. The allele variants are color coded as in the other figures: anaerobic metabolism (blue), capsule production (pale blue), cell division (black), iron acquisition (orange), pili/fimbriae production (green), flagella (red), and MDR efflux pumps (pink). Download FIG S6, JPG file, 0.1 MB.

  • FIG S7

    Stable intermediate frequencies of anaerobic metabolism loci. Four genes involved in anaerobic metabolism were found to be present at intermediate frequencies in the BSAC collection. All were absent from the ST131 lineage, except nirB_2, which was found in a subset of the lineage. Nevertheless, plotting their annual frequencies reveals distinct stable frequencies over the period, despite the rise to prominence of ST131. Download FIG S7, PDF file, 0.1 MB.
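
The genomic fluidity measure described for FIG S4A (the Jaccard distance over a sample of pairwise comparisons, reported as a mean with the sample standard deviation) can be illustrated with a short sketch. This is an illustration under stated assumptions, not the authors' implementation: each isolate is assumed to be a set of intermediate-frequency gene names, and the function names, the default number of sampled pairs, and the random seed are hypothetical.

```python
import itertools
import random
from statistics import mean, stdev


def jaccard_distance(genes_a, genes_b):
    """1 - |intersection| / |union| for two sets of gene names."""
    union = genes_a | genes_b
    if not union:
        return 0.0  # two empty gene sets are treated as identical
    return 1.0 - len(genes_a & genes_b) / len(union)


def genomic_fluidity(isolates, n_pairs=1000, seed=0):
    """Mean and sample standard deviation of the Jaccard distance over a
    random sample of isolate pairs (all pairs if there are few enough)."""
    pairs = list(itertools.combinations(range(len(isolates)), 2))
    if len(pairs) < 2:
        raise ValueError("need at least three isolates to estimate a spread")
    if len(pairs) > n_pairs:
        pairs = random.Random(seed).sample(pairs, n_pairs)
    distances = [jaccard_distance(isolates[i], isolates[j]) for i, j in pairs]
    return mean(distances), stdev(distances)


# Toy grouping of three isolates (hypothetical gene names).
toy_isolates = [{"geneA", "geneB"}, {"geneB", "geneC"}, {"geneA", "geneB"}]
fluidity, spread = genomic_fluidity(toy_isolates)
```

A grouping whose isolates differ pairwise by few loci gives a low mean distance, which is the pattern the caption describes for clade C.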

mBio Apr 2019, 10 (2) e00644-19; DOI: 10.1128/mBio.00644-19

KEYWORDS

AMR
Escherichia coli
evolutionary genomics
negative frequency-dependent selection

Online ISSN: 2150-7511