ABSTRACT
We discovered a new lineage of the globally important fungal pathogen Cryptococcus gattii on the basis of analysis of six isolates collected from three locations spanning the Central Miombo Woodlands of Zambia, Africa. All isolates were from environments (middens and tree holes) that are associated with a small mammal, the African hyrax. Phylogenetic and population genetic analyses confirmed that these isolates form a distinct, deeply divergent lineage, which we name VGV. VGV comprises two subclades (A and B) that are capable of causing mild lung infection with negligible neurotropism in mice. Comparing the VGV genome to previously identified lineages of C. gattii revealed a unique suite of genes together with gene loss and inversion events. However, standard URA5 restriction fragment length polymorphism (RFLP) analysis could not distinguish between VGV and VGIV isolates. We therefore developed a new URA5 RFLP method that can reliably identify the newly described lineage. Our work highlights how sampling understudied ecological regions alongside genomic and functional characterization can broaden our understanding of the evolution and ecology of major global pathogens.
IMPORTANCE Cryptococcus gattii is an environmental pathogen that causes severe systemic infection in immunocompetent individuals more often than in immunocompromised humans. Over the past 2 decades, researchers have shown that C. gattii falls within four genetically distinct major lineages. By combining field work from an understudied ecological region (the Central Miombo Woodlands of Zambia, Africa), genome sequencing and assemblies, phylogenetic and population genetic analyses, and phenotypic characterization (morphology, histopathological, drug-sensitivity, survival experiments), we discovered a hitherto unknown lineage, which we name VGV (variety gattii five). The discovery of a new lineage from an understudied ecological region has far-reaching implications for the study and understanding of fungal pathogens and diseases they cause.
INTRODUCTION
Cryptococcosis is a severe fungal infection responsible for high levels of mortality and morbidity worldwide (1). The etiological agents are two species complexes of the genus Cryptococcus: Cryptococcus neoformans and C. gattii. While the first described cases of clinical cryptococcosis due to these two distinct species complexes were reported in the mid-1890s under the names Saccharomyces hominis (2) and S. subcutaneous tumefaciens (3), respectively, clinical Cryptococcus isolates have been taxonomically treated as a single species (C. neoformans) for more than 100 years (4). Heterogeneity among cryptococcosis-causing yeast isolates became increasingly apparent from the middle of the 20th century onward and led to the recognition of four serotypes (A, B, C, and D) on the basis of the antigenic determinant of capsular polysaccharide (5, 6). Subsequent discovery of two distinct sexual cycles produced by the isolates of serotypes A and D versus serotypes B and C (7, 8) and phylogenetic analysis using various gene sequences (9–11) confirmed these complexes to be genetically divergent enough to be considered separate species. Thus, in 2002, the isolates of serotypes B and C were formally classified as C. gattii (12) while C. neoformans includes all serotype A and D strains (13).
Over the past 2 decades, population structure analysis of the two species using molecular typing methods such as PCR fingerprinting (14), amplified fragment length polymorphism (AFLP) analysis (15), and multilocus sequencing (16) has demonstrated that both species contain genetically diverse lineages that qualify them to be considered two species complexes, and those species complexes have been further subdivided into numerous molecular types (14, 17). To date, four major lineages, denoted VGI/AFLP4, VGII/AFLP6, VGIII/AFLP5, and VGIV/AFLP7, are recognized for C. gattii. Recently, a fifth genotype was described on the basis of a single strain but with several different designations, including clade B (based on multilocus sequence typing [MLST]), VGIIIc/VGIV, and C. decagattii (17, 18). Elevation of these five lineages to separate species has been proposed previously (17). However, such taxonomic treatment is currently controversial, mainly due to the lack of clear biological differences between the lineages and of a clear consensus with respect to the limits and numbers of the putative species boundaries. As such, the various C. gattii lineages are collectively considered the “C. gattii species complex” (18).
In this paper, we describe the discovery of a new lineage/molecular type within the C. gattii species complex, which we designate VGV (variety gattii five). The six VGV isolates were identified among 32 C. gattii isolates recovered from soil, animal dung, and tree bark samples collected in Zambia by Vanhove et al. in 2013 (19). In this paper, we characterize genomic and phenotypic features of the VGV molecular type. Additionally, we present a new, improved genome assembly and gene sets for C. decagattii (17) which we confirmed for the first time to represent a separate lineage and which we therefore name “VGVI” for consistency with the other lineages.
RESULTS
Comparative and population genomics for the six lineages of C. gattii.We discovered a new lineage of Cryptococcus gattii from a panel of 32 (of 55) genome-sequenced isolates recovered from Southern tree hyrax (Dendrohyrax arboreus) middens, midden soil, or tree holes from the Central Zambezian Miombo Woodland ecoregion, a densely forested ecoregion that covers much of Central and East Africa (Fig. 1) (Table 1; see also Table S1 in the supplemental material). Isolates from the new lineage, which we have named VGV, were collected from a 430-km span of northern Zambia, including the Mupata Hills (Luanshya, Copperbelt Province), Mutinondo Wilderness area, and Kapishya (Mpika, Northern Province), suggesting that the lineage has a broad regional distribution across this ecoregion (Fig. 1A). All VGV isolates were identified as serotype B, which also encompasses strains from VGI, VGII, the VGIIIa subgroup, and rare isolates among VGIV strains.
Environmental sampling of C. gattii VGV in Zambia. (A) Location of C. gattii VGV isolates across Central Zambian Miombo Woodlands. Isolates MF5, MF13, and 54 (Mupata Hills, Copperbelt province) and MF56 (Kapishya, Northern Province) were found in or near Hyrax middens created by Southern tree hyrax (Dendrohyrax arboreus). MF34 and MF51 were found from sampling Miombo tree holes in the Mutinondo wilderness area, Northern Province. (B) A tree hyrax feeding on leaves. (C) Sampling from hyrax middens at Kapishya from which MF56 was isolated.
Environmental isolates of VGIV and VGV from the Central Zambezian Miombo Woodlands
TABLE S1
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
Phylogenetic analyses demonstrated that VGV and VGVI (C. decagattii) and the four previously described lineages are genetically distinct from each other (Fig. 2). Indeed, none of the C. gattii lineages appear to be the result of hybridization based on the distribution of private alleles (i.e., alleles not shared between lineages) (Fig. 3a and b), maximum likelihood phylogenetic reconstruction (Fig. 2a), Wright’s fixation index (FST) (see Fig. S4 in the supplemental material) or NeighborNet Network data (Fig. 2b). Additional population genetics analyses confirmed low levels of genetic exchange between the six well-resolved C. gattii lineages. For example, principal-component analysis (PCA) resolved distinct grouping for the lineages, with the first component (PC1) separating VGII from all other lineages, forming distinct clusters for VGIII and VGVI on PC2 (Fig. 2c). The projection of PC3 and PC4 further allowed identification of distinct tight clusters for the VGI, VGIV, and VGV lineages (Fig. 2d).
Whole-genome analysis supports classification of VGV as a distinct lineage. (a) Maximum likelihood (RAxML) phylogeny of 101 C. gattii genomes generated over all nonambiguous sites with at least one SNP in at least one isolate (1,518,323 sites, or 8.7% of the total genome). Isolate names are colored according to lineage (dark blue = VGI, green = VGII, purple = VGIII, light blue = VGIV, orange = VGV, and red = VGVI). Asterisks indicate 100% bootstrap support at each node corresponding to 1,000 tree-building replicates. (b) SplitsTree NeighborNet Network. (c and d) Principal-component analysis (PCA) of genomic variant sites, showing separation of isolates into lineages (isolates are plotted with 2% random noise [jitter] for clarity of individual points). The first four principal components (PC1 and PC2 [c] and PC3 and PC4 [d]) account for 71.26% of the total genetic variation.
Population genetic analyses for each of the C. gattii lineages (based on 101 genomes). (a) Alleles shared between C. gattii lineages (SNPs per kilobase). (b) Alleles not shared between C. gattii lineages (i.e., private alleles) (SNPs per kilobase). (c) Nucleotide diversity (π) within each lineage against the number of isolates representing each lineage. (d) Admixture K optimization based on cross-validation error. (e) Unsupervised ADMIXTURE clustering analysis of all isolates at K = 9. (f) Lineage-specific gene counts and lineage-specific gene loss counts. The tree topology is based on the core ortholog RAxML tree, setting equal branch lengths, and the numbers of multilineage-specific gene gains and losses are shown above internal nodes.
FIG S4
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
The new VGV lineage is represented by six isolates falling within two distinct subclades (A and B). Clade A comprises three VGV isolates (MF5, MF13, and MF54) that were recovered from soil and animal dung sampled in hyrax middens, from which we also identified VGI and VGII isolates (Fig. 1a). Clade B comprises a further three VGV isolates: two (MF34 and MF51) that were recovered approximately 345 km away from clade A and a third (MF56) that was recovered approximately 430 km away from the other clade B isolates. Clade B isolates were recovered from both a tree hole and hyrax middens, showing that the lineage can occupy both tree and dung, environments that are both associated with hyrax activity. The fact that clades A and B were found in different geographic locations might reflect the degree of spatial genetic structure within VGV. All the VGV isolates were located in regions of the granite and acidic kopjes/hills that are found occurring patchily across this ecoregion.
C. gattii VGV is highly divergent from all previously recognized C. gattii lineages. VGV isolates differ from VGII (reference isolate R265) by ∼0.75 million single nucleotide polymorphisms (SNPs) on average (44 SNPs/kb); thus, those isolates and the other lineages are similarly distant from VGII (Table S1). Analysis of the relative proportions of shared private alleles for the C. gattii lineages (Fig. 3a and b) indicates that VGII shared the fewest alleles (<4.6 kb total; <0.2 SNPs per kb) (Fig. 2 and Fig. 3a) with any of the other lineages, reflecting its greater divergence. The newly discovered VGV lineage shared fewer alleles with VGVI (0.24 per kb) and VGIII (0.29 per kb) than with VGI (1.18 per kb) and VGIV (2.53 per kb). The lineages that shared the most private alleles were VGVI and its closest relative, VGIII (92 kb total; 5.37 per kb), which account for an average of 12% of all SNPs (based on alignments to VGII) found in isolates from those lineages.
Nearly 1 in 10 nucleotides in the C. gattii genome has an alternative allele across the six lineages (1.55 × 106 sites; 9.01% of the C. gattii genome). Indeed, >180 kb of these unique/private alleles were identified for each lineage, including for VGV, which had 220-kb private alleles (12.75 per kb) (Fig. 3b). VGI is the most distinct in terms of both the highest count of private alleles (378 kb/21.93 SNPs per kb) and its nucleotide diversity (π) (Fig. 3c), which is reflected in the three distinct subclades of VGI isolates in the whole-genome phylogeny (Fig. 2a and b). Conversely, the three VGVI isolates are thought to be derived from a single isolate recovered from a patient in Mexico and subsequently distributed to different laboratories where they have been renamed and sequenced (14, 17, 20, 21). Its few clonal differences are illustrated by its low nucleotide diversity (π) (Fig. 3c).
Unsupervised model-based clustering identified highly structured ancestry components enriched in each of the lineages. The clustering solution with the lowest cross-validation error value (K = 9) grouped the VGV isolates into a single genetically homogenous group (Fig. 3d and e; see also Fig. S1) while identifying four unique components within the VGII lineage. Of these, subclades VGIIx and VGIIb share small proportions of ancestry with other defined VGII subclades. For example, VGIIb is inferred to share ancestry with other VGII subclades (isolates Ram5 and B8554) and other lineages (B7394 has alleles from VGIV, and B7735 has alleles from VGV). Conversely, none of the isolates in VGIIa and VGIIc have a demonstrable admixture with other subclades or lineages, both being formed by single unique ancestry components. VGIII isolate B8212 (a clinical isolate collected in Oregon, USA, in 2007 [22]) is also modelled as sharing ancestry with VGVI.
FIG S1
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
Finer-scale clustering was performed by considering patterns of genome-wide haplotype sharing in fineSTRUCTURE (23). Here, VGV isolates forming a separate cluster with greater haplotype similarity to isolates from VGI, VGIII, and VGIV than VGII (Fig. S2). While the haplotype sharing patterns overwhelmingly accorded with each lineage being genetically distinct, a notable exception was VGIII isolate B8212, which shares haplotypes with VGIV and VGVI (also in accordance with model-based clustering), perhaps owing to a small amount of genetic exchange with one or both of those lineages. As also observed using ADMIXTURE-based clustering (Fig. 3e), two isolates from VGII, B7394 and B7735, were also genetically distinct and were assigned to their own cluster which was most closely related to isolates from subclade VGIIb.
FIG S2
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
All six VGV isolates were haploid, with no evidence for aneuploidy based on allele frequencies and depth of coverage (Fig. S3). However, we did observe copy number variation (CNV) between the three VGVI isolates derived from a single clinical isolate from Mexico (14, 17, 20, 21). Specifically, isolate CBS11687 acquired an ∼200-kb duplication of supercontig 5 (sc5) (position 1040000 through the end of the supercontig). Separately, isolate WM1804 had a smaller, 40-kb duplication on sc21 (positions 150,000 to 190,000). Isolate WM1802 had neither CNV. In terms of base changes, the three VGVI isolates (WM1802, WM1804, and CBS11687) differed by only 419 SNPs, with the lowest number found between WM1804 and CBS11687 (n = 126) and the highest number found between WM1802 and CBS11687 (n = 315). These genetic differences may have occurred as a result of microevolution during or following passaging or cryopreserving, although large CNVs are common in C. gattii (24, 25). All of the newly isolated VGI (n = 7) and VGVA (n = 3) samples from Zambia had a small, <10-kb duplication within supercontig 6 of the R265 genome (kb position 400 to position 410). This genomic region encodes a single 87-amino-acid (aa) protein that is conserved in C. neoformans and C. gattii but has no functional annotation (by PFAM, GO terms, KEGG-EC, TMHMM, or SigP4).
FIG S3
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
The results from our phylogenetic and population genetic analyses are in line with previous work (26), indicating that lineages within the C. gattii species complex have remained largely genetically isolated since their divergence. Pairwise lineage calculations of θ, representing Weir’s formulation of Wright’s fixation index (FST), suggest very low levels of genetic exchange between the lineages (Fig. S4), which is also reflected in analyses of genetic structure (Fig. 2 and 3; see also Fig. S1 and S2). Both depth-of-coverage plots and FST nonoverlapping sliding 10-kb window plots across the mating type locus (MAT) at the start of supercontig 18 demonstrate that all VGIV and VGV isolates included in this study were MATα (the reference genome of R265 is MATα; high depth of coverage and θ values of >0.98 across the MAT loci). In contrast, for VGI, VGII, VGIII, and VGVI, MATa isolates were included in our panel.
Genome assembly and analysis of VGV reveals C. gattii lineage-specific differences.We assembled and annotated a nearly complete genome for the newly discovered C. gattii VGV lineage (isolate MF34) using both Oxford Nanopore and Illumina sequencing reads. The resultant assembly consisted of 15 contigs corresponding to the 14 chromosomes; the single break in one chromosome corresponds to the ribosomal DNA (rDNA) region. Other than underrepresenting rDNA genes, this assembly provides a complete representation of the genome, with telomeric repeats (TTAGGG) present at 28 contig ends. Gene annotation revealed 6,322 predicted protein-coding genes, which is similar to the data from the seven other representative C. gattii isolates with publicly available complete genomes (26, 27) representing the four previously known major lineages (ranging from 6,092 to 6,763), as well as C. neoformans H99 (28) (n = 6,962) (Fig. S5 and S6).
FIG S5
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
To establish the evolution of protein-coding genes in C. gattii, we compared the gene content data for two representative annotated genomes per lineage where possible (no second annotated reference genomes were available for VGIV, VGV, and VGVI), identifying 4,565 single-copy core orthologs that are shared among the five lineages of C. gattii and C. neoformans (∼74% of Cryptococcus genes) (Table 2). For VGVI, we sequenced and assembled the WM1802 isolate, obtaining a similar genome length (17.42 Mb) and a similar count of protein-coding genes (n = 6,092). For VGII, we included the updated VGII R265 PacBio assembly in our panel of genomes (29) (Table 2). Detection orthology analysis performed between just the two R265 assemblies identified 91% of genes in 1:1 orthology (n = 5,642), ∼4% of genes unique to the new assembly (n = 252), and ∼6% of genes in paralogous clusters (n = 364). The previous VGII R265 assembly had 635 genes that were not called in the new assembly, likely representing a difference in the annotation protocol. Analysis of core eukaryotic genes (CEGs) and BUSCO revealed a high level of completeness of gene sets and increased completeness in the new annotation (Fig. S6). Furthermore, all assemblies generated using long-read sequencing technology assembled into 14 scaffolds/supercontigs, suggesting that all Cryptococcus lineages/species have conserved numbers of chromosomes.
The genome assemblies used for phylogenetic analysis and orthology detectiona
FIG S6
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
Ortholog amino acid differences within and between lineages were consistent with results from our phylogenetic and population genetic analyses. VGV MF34 had the highest amino acid sequence similarity to VGIV IND107 (53,000 amino acid differences = 97.55% similarity), which was observed in both alignment-based and ortholog-based phylogenies (Fig. 2 and Fig. 4). The most similar interlineage orthologs were those of VGIII and VGVI (49,500 predicted amino acid differences = 97.71% similarity) (Table S1), while the most distinct pairwise comparisons were those between C. gattii and C. neoformans (between 205,000 and 218,000 amino acid changes; ∼90% protein similarity).
A phylogenetic tree for 10 Cryptococcus genomes belonging to the six C. gattii lineages and 1 C. neoformans lineage outgroup alongside their genome synteny. The phylogenetic tree was constructed in RAxML, with branch lengths indicating the mean number of nucleotide substitutions per site. To the right is a synteny plot, visualizing regions that span three or more orthologs between any two species as a connected gray line. Supercontig numbers are shown above each genome axis if longer than 400 kb, where a plus sign (+) represents the forward orientation and a minus sign (–) represents the negative orientation.
Overall, synteny is conserved within C. gattii (26), though with notable differences between some lineages. For example, VGV has a single 171-kb inversion on supercontig 7 (positions 544,906 to 716,249) compared with the middle of VGIV IND107 supercontig 7 and the middle of VGIII CA1280 supercontig 5 (Fig. 4). VGVI also has some syntenic differences from its closest relative, VGIII (Fig. 2) (see also Fig. 3 and Fig. 4). For example, approximately half of VGVI supercontig 5 is syntenic for the start of VGIII (CA1873) supercontig 16, while the second half of VGVI supercontig 5 is syntenic for a middle region of VGIII supercontig 1, indicative of a chromosomal translocation. Further improvements and additional genome assemblies should establish the full number and genetic impact of lineage-specific genomic rearrangements.
Lineage-specific genes and multilineage-specific genes (found in two or more lineages) were identified in each of the lineages (Fig. 3f; see also Fig. S7). Many of these lineage-specific genes (223/605; 37%) were previously identified from a panel of genome assemblies without the addition of VGV and VGVI (26). A further 53/605 (9%) of the newly detected lineage-specific genes were previously categorized as multilineage-specific genes. The lineage-specific genes in newly sequenced lineages (VGV and VGVI) include 74 genes that were unique to VGV and 49 genes that were uniquely absent in VGV. Genes unique to VGV include those encoding two sugar transporters (D1P53_002216 and D1P53_002944), an alcohol dehydrogenase (D1P53_004471), and an aldehyde dehydrogenase (D1P53_006242). Conversely, eight transmembrane proteins and a single uncharacterized secreted protein were uniquely missing in VGV. All of the genes involved in the ergosterol biosynthesis pathway were present in single-copy form in VGV. The VGVI WM1802 genome includes 80 genes that are unique and lacks 127 genes that are uniquely absent. Among the unique genes in WM1802, 14 are predicted to be involved in transport and include genes encoding three monosaccharide transporters, one hexose transporter, one cadmium ion transporter, and one monocarboxylic acid transporter.
FIG S7
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
Predictably, C. neoformans VNI H99 had the greatest number of lineage-specific/absent genes, with 578 unique genes and 28 absent genes. These included 47 genes predicted to encode transmembrane proteins (including four sugar transporters, five major facilitator superfamily [MFS] transporters, and a caffeine resistance transporter), and 34 secreted proteins. Fewer genes were uniquely absent in C. neoformans, which included genes encoding eukaryotic translation initiation factor 3 subunit B, an ACC oxidase, a copper amine, an allantoate permease of the major facilitator, and a 3-hydroxyacyl-dehydrogenase with oxidoreductase activity. Full details of all lineage-specific genes are provided in Table S1.
Phenotypic characteristics of VGV.All six isolates that belonged to the new VGV lineage based on whole-genome sequencing (Table 1) were first incorrectly identified as VGIV based on the URA5 restriction fragment length polymorphism (RFLP) banding pattern. Unlike most of VGIV, all six VGV isolates were serotype B. These were further characterized as MATα and as melanin and urease positive (Fig. S8) and grew well at 37°C (Fig. 5A). However, the VGV strains grew slightly more slowly on canavanine glycine bromothymol blue (CGB) agar (Fig. S8) than the serotype C VGIV strain (MF46) isolated from the same environment in the Central Miombo Woodland (Table 1). Hence, the positive blue/green color development on CGB agar took longer in VGV than in the control strains (Fig. S8). Since VGV is genetically closest to VGIV (Fig. 2a and b; see also Fig. S2), two serotype C VGIV strains were used as control isolates for further phenotypic comparisons (WM779 isolated from a cheetah in South Africa [16] and MF46 isolated from Miombo tree bark in Zambia near Hyrax middens) (Table 1).
Phenotypic characteristics. (a) Growth of six VGV isolates and two VGIV control isolates at 30°C and 37°C on YEPD agar. (b) India ink staining of VGV cells grown in YEPD broth for 24 h at 30°C. The isolates of clade A (MF5, MF13, and MF53) produced thinner capsules than the isolates of clade B cells, which produced capsules similar in size to those produced by the two VGIV control isolates. Bar = 5 μm. (c) MIC of VGV isolates for FLC, 5-FC, and amphotericin B. All tested isolates showed high FLC MICs ranging from 20 to 256 μg/ml. The MIC for 5-FC was low, except for MF46, an environmental isolate of VGIV, for which the MIC for 5-FC averaged 4 μg/ml. All VGV isolates showed higher MICs for amphotericin B than for the VGIV controls. (d) Survival curve of mice infected by four VGV isolates (intrapharyngeal aspiration of 5,000 cell/mouse) and two VGIV control isolates. Only the mice infected with VGIV isolate WM779 succumbed to infection. (e) Lung (left) and brain (right) fungal loads. VGV isolates grew moderately well in lungs, but the CFU levels in the brains were negligible.
FIG S8
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
The size and morphology of VGV yeast cells were typical for Cryptococcus and indistinguishable from those of cells of the control strains (Fig. 5B). Two distinct patterns of capsule formation were found among the six VGV isolates grown in yeast extract-peptone-dextrose (YEPD) broth (Fig. 5B). The isolates recovered from clade A (MF5, MF13, and MF54) produced thinner (≤1-μm-thick) capsules than those recovered from clade B (MF34, MF51, and MF56), which produced thick (2-to-4-μm-thick) capsules similar to those seen with the VGIV control strains.
The VGV isolates and the control strains of VGIV manifested unusually high resistance toward fluconazole (FLC), particularly given that they were sampled from an environmental niche. The three isolates of VGV clade A were more resistant to FLC, with drug MICs of ≥128 μg/ml, than the three isolates in clade B, which showed drug MICs of 24 to 64 μg/ml. All six VGV isolates showed a drug MIC of 0.0625 μg/ml for 5-fluorocytocine (5-FC), a value similar to that shown by WM779. The drug MIC of MF46 for 5-FC, 4 μg/ml, was unusually high. The VGV amphotericin B MIC ranged between 0.5 and 1 μg/ml, higher than the levels measured for the control strains, which showed drug MICs below 0.5 μg/ml (Fig. 5C).
To explore the relative pathogenicities among VGV subclades we selected two isolates from clade A (MF5 and MF13) and two isolates from clade B (MF34 and MF51) for inoculation in mouse models. Mice infected by all four isolates survived for 70 days, while all the mice infected by WM779, a virulent serotype C control isolate, succumbed to infection within 30 days postinfection (Fig. 5D). VGIV environmental isolate MF46 (serotype C) caused no death in the mouse model. Fungal loads in the lungs of VGV-infected mice were substantially lower than those seen with the mice infected with WM779 and slightly lower than those seen with the mice infected with MF46. Brain fungal loads of mice infected with the VGV strains were also low to negligible. The control isolates of VGIV showed little neurotropism (Fig. 5E). Histopathological analysis of the lungs demonstrated significant pathology in WM779-infected mice, with yeast found throughout the lung together with notable disruption of lung tissue. In many locations, extensive leukocyte recruitment was evident in areas of concentrated infection (Fig. 6).
Histopathology of lung infected by VGV isolates. Sections of the mouse lungs infected by three different VGV isolates and two VGIV control strains were stained with alcian blue and periodic acid-Schiff stain and counterstained with hematoxylin (left and middle columns) or with standard hematoxylin and eosin (right column). Note that alcian blue stains cryptococcal cells blue. Images in the left columns were acquired using a 2.5× objective. Images in the middle and right columns are higher-magnification (10×) images of the areas indicated by the yellow boxes in the 2.5× images.
Histopathological analysis of the VGV isolates displayed substantially lower pulmonary yeast levels. That said, mice infected by MF46, MF34, and MF51 had higher levels of yeast than those infected by MF5 and MF13 as shown by both CFU counts (Fig. 5E) and the results of histopathological analysis (Fig. 6), in which MF51 was shown to represent clade B. The lung histopathology of the mice infected by MF34 was similar to that of the mice infected by MF51 (data not shown). Notable for its absence, leukocyte infiltration was mostly low or absent from sites of VGV infection. MF13 showed some leukocyte infiltration to a subset of infectious foci (Fig. 6).
Identification of VGV by URA5 RFLP.The patterns of URA5 RFLP, obtained by double digestion with Sau96I and HhaI, have been widely used to identify the lineage/molecular type in both C. neoformans and C. gattii species complexes (14). The URA5 RFLP of clade A isolates obtained by Sau96I/HhaI digest showed a pattern identical with that of VGIV (Fig. S9A). Those of clade B, however, produced an additional 1.3-kb amplicon which was absent from clade A and from all of the other VG molecular type reference isolates. This 1.3-kb amplicon was present even in uncut DNA of clade B isolates (Fig. S9A and B), but its nature is not known at this juncture. Since the URA5 RFLP patterns of VGIV and VGV were not conclusively different, we compared the URA5 gene sequence of VGV isolate MF34 to that of VGIV reference strain WM779 to identify possible restriction enzymes that can clearly distinguish the two lineages. This led to our identifying two highly discriminatory restriction enzymes: StuI and EarI. The expected sizes (measured in base pairs) of the URA5 gene fragments resulting from StuI digestion are as follows: 221 bp, 237 bp, and 322 bp in VGIV and 237 bp and 543 bp in VGV. The EarI digestions produced 247-bp and 501-bp fragments in VGIV and 247-bp and 300-bp fragments in VGV. We compared URA5 RFLP data from 17 VGIV isolates (Table S1) with the data from 6 VGV isolates obtained by StuI or EarI digestion, and the results are shown in Fig. S9C and D.
FIG S9
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
Type strain of VGV.We have designated MF34 (clade B isolate) as the type strain of VGV. MF34 was isolated from a tree hole located in Mutinondo (latitude −12.45, longitude 31.29), Central Zambezian Miombo Woodlands (Table 1). Its genome has been assembled and annotated to near-completion (15 scaffolds with N50 = 1.3 Mb and telomeric repeats at 28 contig ends). MF34 is serotype B and MATα and causes mild pneumonia in C57BL/6 mice with negligible neurotropism.
DISCUSSION
Over the past decade, increased sampling worldwide and whole-genome sequencing (WGS) methods have uncovered greater genetic diversity of important pathogens, including the C. neoformans and C. gattii species complexes. For example, sampling from Botswana revealed the existence of the C. neoformans VNB lineage (30), which itself has recently been shown to be deeply split into two genetically isolated lineages, VNBI and VNBII (31). Thus far, VGVI is the only lineage that exists as a single genotype since the three isolates previously designated C. decagattii appear to have been isolated from the same patient (21). The previously identified lineages of Cryptococcus have recently been designated separate taxonomic species based on phylogenetic species recognition criteria (17). While we agree that Cryptococcus contains a number of genetically diverse and monophylectic clades that can be viewed as species under an evolutionary species concept (32), we have previously argued that it is premature to give each clade a separate taxonomic name at this juncture (18, 33). One notable concern raised by Kwon-Chung et al. (18) was that the proposed seven-species taxonomy (33) was likely to be unstable due to incomplete knowledge of the true extent of Cryptococcus diversity worldwide. Our discovery of C. gattii VGV from the Miombo woodlands of Zambia clearly shows that we have not yet achieved a full understanding of the global biodiversity of Cryptococcus and that further exploration will likely yield additional phylogenetic species. Until we have a more accurate consensus on the true numbers of Cryptococcus lineages, we propose that the names “VN” and “VG” serve as practical “Zip code” identifiers within C. neoformans and C. gattii, offering a convenient way to describe newly discovered lineages or recombinants without introducing unwanted nomenclatural instability and confusion.
Our discovery of C. gattii VGV in samples from hyrax-associated environments suggests an association with these mammals. Hyrax are small herbivores that are most closely related to elephants (Proboscidea) and sea cows (Sirenia) and are characterized by the behavior of defecating in communal latrines, usually located in crevices in rocky kopjes, over many generations (34). These locations are often sheltered in rocky caves and droppings are likely to accumulate for upwards of 50,000 years, in some cases forming a stable paleoenvironmental hot spot of urea-rich nitrogenous material (35). Cryptococcus has a pronounced trophism for urea as a nutritive substrate, and pigeon guano is known to support prolific growth of C. neoformans and (to a lesser extent) C. gattii (36). Our finding that hyrax middens are hotspots of Cryptococcus diversity suggests that their ecological stability in landscapes that are low in nitrogen availability may lead to them being important arenas for the evolution of Cryptococcus and that they will likely be fertile ground for further discovery of diversity within this genus.
Fungal association with small mammals may suggest adaptations that confer pathogenicity, known as the “endozoan, small-mammal reservoir” hypothesis (37), and deserves to be explored further following our findings of an association of Cryptococcus with hyrax. Accordingly, alongside further study of potential mammalian reservoirs, the search for VGV clinical isolates is also needed in order to understand the true virulence potential of VGV and whether it can spill over into humans. Murine models have shown that environmental isolates are less virulent than clinical isolates of the same molecular type in both the C. gattii and C. neoformans species complexes, suggesting that polymorphic virulence factors exist (38, 39). However, despite its great genetic distance from all other lineages, the new VGV lineage is not clearly distinguishable from others by existing methods such as serotyping or the routinely used approach of RFLP analysis employing Sau96I and HhaI URA5 digestion (14). Thus, it is possible that previously analyzed isolates belonging to VGV may have been misidentified using non-WGS methods. The most likely candidates for the search of clinical VGV are VGIV serotype B isolates recovered from patients. Geographically, the most likely place to find the VGV clinical isolates appear to be in sub-Saharan Africa since the current panel of isolates were found in the Zambian environment within an ecoregion that includes Tanzania, Burundi, Democratic Republic of the Congo, Angola, and Malawi.
Previous work has shown that most isolates of the C. gattii species complex cause pulmonary infection in a murine model with low neurotropism (20, 40, 41). The four VGV isolates tested here were less neurotropic than the MF46 VGIV isolate that was collected from the same Zambian environment, and all the examined Zambian environmental isolates were significantly less virulent than a VGIV control strain, WM779. It remains to be shown whether the differences in neurotropism are due to lineage-specific genes or to alleles in VGV. As previous work has shown in C. neoformans (42), capsule size difference manifested by clades A and B in vitro was unrelated to virulence in mice.
Although serotypes have not yet been conclusively linked to virulence in Cryptococcus, they remain important for strain identification. The majority of C. gattii isolates tested to date have been of serotype B, except for a subset of VGIII and the majority of VGIV isolates, which are serotype C. The six VGV isolates are also all serotype B—but, due to the lower growth rate on CGB agar, the CGB reaction was weaker than that seen with other isolates of serotype B or serotype C. It took 24 h longer for VGV than for the other VG isolates (VGI to VGIV) to turn the medium dark blue. As the six VGV isolates were all serotype B whereas the majority of VGIV isolates (the most closely related lineage) reported thus far have been serotype C, it is possible that VGV may also occur in serotype C. Additional environmental sampling of VGV is therefore necessary to establish the dominant serotype, since the current sample size of six is insufficient for definitive conclusions.
Surprisingly, five of the six VGV isolates and the two control VGIV isolates were highly resistant to fluconazole (MIC of >64 μg/ml), a commonly used antifungal drug. The three isolates of VGV clade A were more resistant to FLC than the isolates of VGV clade B. Although the C. gattii species complex was previously known to be more resistant to FLC on average than C. neoformans (43), such high resistance to FLC in environmental isolates is notable and was not previously reported (44). All of the VGV isolates had identical nucleotide sequences for ERG11 and AFR1, demonstrating that the resistant isolates did represent not a result of genetic differences in the target or transporter of FLC. However, innate fungal resistance to FLC can be due to multiple factors besides the ERG11 gene or efflux pumps and the mechanism(s) of FLC resistance in VGV remains a subject for future investigation. Why environmental VGV isolates should have such high resistance to azoles is unclear, as, given the relatively pristine environments from which they were recovered, it is unlikely that they had come into contact with agrichemicals. More likely, fluconazole resistance is a pleiotrophic effect that has evolved as a consequence of exposure to xenobiotics other than azoles. Further investigations into the evolution of FLC resistance in VGV may take on additional importance as clinical cases due to VGV are a distinct possibility in the Sub-Saharan regions where 12% of the members of the Zambian population are living with the HIV (https://www.unaids.org/en/regionscountries/countries/zambia).
In this paper, we present a nearly complete genome assembly for VGV type strain MF34. The MF34 genome allowed us to conclusively establish that VGV is a lineage of C. gattii that is separate and distinct from any previously identified and that it is not the result of hybridization, as has been seen for other divergent isolates (31). Indeed, while both Cryptococcus species complexes appear to have a conserved chromosome number of 14 based on the current panels of assembled and annotated genomes available, intrachromosomal and interchromosomal rearrangements as well as large CNVs appear to be common. This chromosomal variation may provide the genetic basis for phenotypic variation and may act as a genetic barrier to recombination between more highly divergent isolates such as those from separate lineages. At the within-lineage level, there are also a number of unique and uniquely lost “lineage-specific” genes which may contribute to phenotypic differences between lineages. However, it should be noted that many of the main phenotypes routinely measured, including virulence in animal models, growth rates, and ability to cause pulmonary versus central nervous system (CNS) infections, appear to differ as much within as between lineages.
One line of future inquiry that may help to explain this phenotypic diversity may come from the characterization of further transcriptional differences. For example, VGII has been shown previously to upregulate many of the ergosterol genes during coincubation with bone-marrow derived macrophages (45) and it will be important to determine whether other traits exist which differentiate the lineages of Cryptococcus. Further, it will be important to examine whether similar lineage-specific differences underpin VGV’s increased FLC resistance and whether clinically relevant traits such as drug resistance are linked to the environment within which these isolates have evolved. Ultimately, our report testifies to the deep reservoir of diversity that exists within Cryptococcus, which, despite decades of research into this genus, still harbors abundant surprises.
MATERIALS AND METHODS
Library preparation and sequencing of Zambian isolates.Environmental sampling took place in January and September of 2013. Samples were collected using Transwab Amies swabs (MWE; code MW170) and sterilized 30-ml screw-cap glass bottles. Amies liquid transport swabs were taken from tree bark (n = 20), soil (n = 19), and cracks in granite kopjes or droppings from rock Hyrax middens (n = 16). Samples were collected and processed according to previously established protocols (46, 47), and the samples were kept at 4°C before being processed on niger seed agar. All samples were collected under license from the Zambian Wildlife Authority (ZAWA).
Single colonies purified from the original isolation media were maintained under cryopreservation conditions at –80°C at Imperial College in London since 2013. The isolates were has been changed to revived on YPD agar (yeast extract 1%, peptone 1%, dextrose 2%) and incubated at 30°C before use. Genomic DNA was isolated with the cetyltrimethylammonium bromide (CTAB) extraction method as described previously (48) with modifications. Paired-end libraries (150 bp) were prepared and sequenced using an Illumina HiSeq 4000 platform by Novogene (Davis, CA). Two Oxford Nanopore libraries of isolate MF34 were constructed from genomic DNA using a one-dimensional (1D) library construction kit (model no. SQK-LSK109). A total of 243,660 reads with an N50 of 9,827 were generated on a FLO-MIN106 flow cell using a Minion. Reads were base called using Albacore v2.3.1. This resulted in 923,997,900 total bases (∼46× coverage).
Genome assembly and annotation.For isolate MF34, a hybrid assembly of Oxford Nanopore long reads and Illumina short reads was generated. An initial assembly of the Oxford reads was generated using Canu v1.5 (49) with parameter genomeSize = 20,000,000. The assembly was inspected for the presence of telomeric repeat (TTAGGG) at contig ends; for two contig ends missing a telomeric repeat, contigs were extended by aligning unassembled Canu contigs to these ends using NUCmer v3.1 (50). Base-called reads were then aligned to the contigs using the Burrows-Wheeler transform algorithm (BWA mem) (51) with flag “-x ont2d” and the alignments used for polishing with nanopolish (52). Two rounds of Pilon v1.13 (53) correction were performed using Illumina BWA read alignments (51). Paired Illumina sequences of C. decagattii (VGVI) were assembled and scaffolded using SPAdes v3.1.1 (54) with k-mer lengths of 21, 33, 55, and 77. A summary of statistics for the assembly is provided in Table 2. Reads were aligned to the assembly with BWA v0.7.4-r385 mem (51), and Pilon v1.12 (53) was further used to improve the assembly. Scaffolds that were less than 1 kb were removed. The genome assembly has been submitted to NCBI (see below for project accession number).
The C. gattii VGV MF34 and VGVI WM1802 genomes were annotated using Genemark (55), BLASTx against Swiss-Prot (56) and KEGG (57), and HMMER hmmscan (58) against PFAM (59). We ran tRNAscan (60) and RNAmmer (61) to identify non-protein-coding genes. Gene predictions were checked for a variety of issues, including overlapping of noncoding genes, overlapping of coding genes, and the presence of in-frame stops. Genes were named according to evidence from BLASTx and HMMER in the following order of precedence: (i) Swiss-Prot (56), (ii) TIGRfam (62), and (iii) KEGG (57) (where BLASTx hits must meet the 70% identity and 70% overlap criteria to be considered a good hit and for the name to be applied). Otherwise, genes were classified as hypothetical proteins.
Genes were functionally annotated by assigning PFAM domains (59), GO terms, KEGG assignment, and ortholog mapping to genes of known function. HMMER3 (58) was used to identify PFAM (release 27) domains, and BLASTx was used against the KEGG v65 database (57) (E value, <1 × 10−10). GO terms were assigned using Blast2GO version2.3.5 (63), with a minimum E value of 1 × 10−10. SignalP 4.0 (64) and TMHMM (65) were used to identify secreted proteins and transmembrane proteins, respectively (see Fig. S5 in the supplemental material). Gene sets were aligned to the 248 core eukaryotic genes (CEGs) and BUSCO basidiomycota_odb9 set to evaluate completeness (Fig. S6).
Read alignment and variant identification.The 36 newly sequenced isolates from this study were compared to an additional 65 isolates that had been sequenced and described in previous studies (20, 26, 38, 66, 67). These additional isolates were obtained from the NCBI Sequence read archive (SRA) and converted from SRA format to FASTQ using SRA Toolkit version 2.3.3-4. Illumina reads were aligned to the C. gattii VGII R265 reference genome assembly using Burrows-Wheeler Aligner (BWA) v0.7.4-r385 mem (51) with default parameters and were converted to sorted BAM format using SAMtools v0.1.9 (r783) (68).
Genome Analysis Toolkit (GATK) v2.7-4-g6f46d11 (69) was used to call both variant and reference nucleotides from the 101 alignments (as previously described [24]). Briefly, the Picard tools AddOrReplaceReadGroups, MarkDuplicates, CreateSequenceDictionary, and ReorderSam were used to preprocess the alignments (http://broadinstitute.github.io/picard/). GATK RealignerTarget-Creator and IndelRealigner were then used to resolve misaligned reads close to indels. Next, GATK Unified Genotyper (with the haploid Genotyper ploidy setting) was run with both SNP and indel genotype likelihood models (GLM). We also ran Base Recalibrator and PrintReads for base quality score recalibration on those initial sites for GLM SNP. We then recalled variants with Unified Genotyper with the parameter “—output_mode EMIT_ALL_SITES.” We merged and sorted all of the calls and then ran Variant Filtration with the parameters “QD < 2.0, FS > 60.0, MQ < 40.0.” Next, we removed any base that had a minimum genotype quality of below 50, a minimum proportion of alternative alleles (AD) of 80%, or a minimum depth of 10. Finally, we removed any positions that were called by both GLMs (i.e., incompatible indels and SNPs), any marked “LowQual” by GATK, any nested indels, and any sites that did not include a PASS flag.
Phylogenetic and population genetic analysis.The variants identified from the 101 alignments were filtered for positions that were homozygous (reference or SNP) and polymorphic in at least one isolate (Fig. 2), resulting in alignment of 1,517,353 nuclear sites and 970 mitochondrial sites. A FASTA file of these positions was created and converted into PHYLIP format, and a phylogenetic tree was generated using RAxML v7.7.8 (70) with 1,000 bootstrap replications. RAxML was run with the generalized time-reversible (GTR) and category (CAT) rate approximation, with final evaluation of the tree performed using GTR plus gamma-distributed rates. The same sites were analyzed using the NeighborNet Network of SplitsTree v4.14.6 (71).
A multisample variant call format (VCF) corresponding to all 101 genomes was made with VCFtools (72) and converted to ped and map file formats for use in PLINK v1.90 (73). Unsupervised ADMIXTURE (74) was run on a moderately linkage disequilibrium (LD)-pruned alignment for values of K between 1 and 15. A value of K = 9 provided the lowest cross-validation error (Fig. 3d and e; see also Fig. S1). To explore finer patterns of population structure among our sampled lineages, we applied a technique designed to characterize patterns of haplotype sharing between panels of “donor” and “recipient” haplotypes within a recombining population. We ran Chromopainter v2 (23) to infer, at each position in a recipient isolate’s genome, the donor to which it is most closely ancestrally related relative to all others in the data set. To do this, we assumed a uniform recombination rate of 1.5 M (morgans)/megabase based on the genome-wide recombination rate previously estimated in C. neoformans (75) and with Chromopainter’s switch and mutation rate parameters estimated using 10 runs of expectation-maximization (−n 190.29; −M 0.0011). We then ran Chromopainter in linked mode using the haploid switch (-j) under an “all-versus-all” framework, painting all samples using all others to produce a pairwise coancestry matrix describing the amount of DNA matches of each isolate to every other under the copying model.
Haplotype-based clustering was then implemented in fineSTRUCTURE (23) with an estimated normalization parameter of c = 0.51, sampling cluster assignments every 10,000 iterations for 1 × 106 Markov chain Monte Carlo (MCMC) iterations after 1 × 106 initial burn-in steps. We then performed an additional 1 × 105 hill-climbing iterations beginning with the MCMC sample with the highest posterior probability. This classified our data into 34 clusters (Fig. S2).
For the C. neoformans VNI H99-rooted C. gattii tree, we identified 1:1 orthologs for each of the nine isolates with Orthofinder v2.1.2 (76) using the Synima pipeline (77). We aligned orthologs with MUSCLE v3.8.31 (78), extracted the coding DNA sequences (CDSs) in a codon context, trimmed to the smallest contiguous sequence, and then concatenated alignments. In total, we aligned 2.16 Mb of transcripts for each genome. Prottest v3.4 (79) was used to determine the best-fitting amino acid transition model (JTT) according to Bayesian information criterion. The final tree was produced using RAxML v7.7.8 (70) and the CAT rate approximation and WAG amino acid replacement matrix with 1,000 bootstrap replicates. Synima (77) was used to visualize synteny between the genomes. The same pipeline was used to compare the previous and updated R265 genomes.
Phenotypic analysis.To determine the growth rate of C. gattii VGV, cells of all six VGV isolates were inoculated in YEPD broth and incubated at 30°C on a shaker (200 rpm) for 18 h. Cells were washed with sterile phosphate-buffered saline (PBS), and 2 × 105 cells/ml were resuspended in PBS. Three-microliter aliquots of 10-fold serial dilutions were spotted onto YEPD agar and incubated at 30°C and 37°C. For biological confirmation of the species, isolates were inoculated on canavanine glycine bromothymol blue (CGB) agar (80) for species-specific CGB reactions and on Christensen’s urea agar (Sigma) and norepinephrine agar (81) for urease and melanin production reactions, respectively, and were incubated at 30°C for 48 h. India ink staining of the cells grown on YEPD broth for 24 h at 30°C was used for microscopic observation of the cell and polysaccharide capsule size. The reference strains used were WM148 or H99 (serotype A, VNI), WM626 (serotype A, VNII), WM179 (serotype B, VGI), WM178, R265 and R272 (serotype B, VGII), WM161 (serotype B, VGIII), and WM779 (serotype C, VGIV) (14). The mating type of each isolate was determined by PCR using primers specific to STE12α and STE20a (82).
Determination of MIC for antifungal antibiotics.MICs for fluconazole (FLC), 5-fluorocytosine (5FC), and amphotericin B were determined using Etest strips according to the Etest technical guide (AB Biodisk, Solna, Sweden), with slight modification. Fungal cells were grown in 5 ml of YEPD at 30°C for 18 h. Harvested cells were diluted in sterile saline solution to an optical density at 600 nm (OD600) of 0.05 and plated on yeast nitrogen base (YNB) agar plates. Etest strips were placed at the center of the plates and incubated at 30°C for 72 h. The susceptibility endpoint was read at the first growth inhibition ellipse. The concentration ranges tested were as follows: for FLC, 0.016 to 256 μg/ml; for both 5-FC and amphotericin B, 0.002 to 32 μg/ml.
URA5 gene RFLP.The URA5 gene of each isolate was amplified from genomic DNA by PCR to identify the molecular type using 50 ng of two primers: URA5 (5′-ATGTCCTCCCAAGCCCTCGACTCCG-3′) and SJ01 (5′-TTAAG ACCTCTGAACACCGTACTC-3′). Reactions were carried out in a total volume of 50 μl as previously described (14). PCR was performed for 40 cycles at 94°C for 2 min of initial denaturation, 30 s of denaturation at 94°C, 30 s of annealing at 55°C, and 2 min of extension at 72°C. The reactions were completed by performing a final extension step for 10 min at 72°C. PCR products were analyzed by 1% agarose gel electrophoresis, and 5 μl of PCR products was double digested with Sau96I (10 U/μl) and HhaI (20 U/μl) for 3 h at 37°C. Then, digested samples were separated by 3% agarose gel electrophoresis at 80 V for 5 h. The RFLP patterns of URA5 gene were analyzed using well-defined bands in the gel images by comparing them with the patterns obtained from the standard reference strains.
Restriction enzyme analysis of the URA5 gene to distinguish VGV from VGIV.We found that the URA5 RFLP banding patterns (14) of VGV and VGIV were not clearly distinguishable using Sau96I and HhaI (Fig. S9A). We compared the DNA sequences of the URA5 gene from MF34 (VGV) and WM779 (VGIV) and found two restriction enzymes, StuI and EarI, that can be used to distinguish the two molecular types based on URA5 RFLP. Three microliters of URA5 PCR products was digested with StuI (10 U/μl) or EarI (20 U/μl) (New England BioLabs Inc.) at 37°C for 4 h, and restriction fragments were separated by electrophoresis in 3% agarose Tris-acetate-EDTA (TAE) gels at 80 V for 5 h. Standard reference strains for molecular typing were used as controls.
Virulence in mice.The virulence of four VGV isolates, two from clade A and two from clade B, was assessed using 7-to-8-week-old female C57BL/6 mice (Taconic Farms). Isolates to be tested in mice were inoculated in YEPD broth and incubated overnight, washed twice, and diluted in PBS to 2.5 × 105 cells/ml. Mice (14 per isolate) were inoculated with 20 μl of cell suspension (5 × 103 cells/mouse) by pharyngeal aspiration. Eight mice were used for each isolate for determination of the survival rate, and six mice each were used for the analyses of fungal burden and histopathology at the indicated time points. Mice were monitored twice per day, and differences in survival were determined using GraphPad Prism, version 7 (GraphPad Software, San Diego, CA).
To assess the organ fungal burden, lungs and brains of four mice from each infected group were inspected. The mice infected with WM779 started to die on day 25 postinfection, and the lungs and brains were harvested immediately from the dead mice on day 25. Mice infected with other isolates were euthanized on day 60, and organs were harvested. Harvested lungs and brains were homogenized in 7 ml and 2 ml of sterile water, respectively, and 5-μl aliquots of 10-fold serial dilutions were plated on YEPD agar and incubated at 30°C for 48 h. Fungal colonies were counted and the tissue fungal burden was analyzed using GraphPad Prism, version 7 (GraphPad Software, San Diego, CA).
Histopathological analysis.For histopathological analysis, organs of infected mice from each group were harvested at 10 and 20 days postinoculation and fixed in 3.7% buffered formalin and embedded in paraffin. Sections were stained with hematoxylin and eosin (H&E) or alcian blue/periodic acid-Schiff (AB/PAS) at Histoserv Inc.
Ethics statement.The Institutional Animal Care and Use Committee of the National Institute of Allergy and Infectious Diseases approved all animal studies (approval no. LCIM-5E). Studies were performed in accordance with recommendations of the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health.
Data availability.The raw sequence and genome assembly of VGV MF34 are available in NCBI under BioProject accession no. PRJNA487802, and the culture has been deposited at the American Type Culture Collection (accession number pending). Raw sequence data were submitted to the NCBI Sequence Read Archive under BioProject identifiers (ID) PRJNA476154 (all C. gattii non-VGV isolates) and PRJNA480403 (all C. gattii VGV isolates).
ACKNOWLEDGMENTS
We thank Elizabeth Geigner and Scott Steelman for assistance with Oxford Nanopore sequencing of MF34 and José Muñoz for advice running SplitsTree. Michael Fisher, Mark and Mel Harvey, Jo and Charlie Harvey, Ian Bruce-Miller, and Matthew Jones assisted with fieldwork. Shannon Moyer assisted with experimental animal work.
M.C.F. was supported by a United Kingdom MRC (MR/R015600/1) and a United Kingdom NERC studentship to M.V. C.A.C. was supported by NIAID grant U19AI110818 to the Broad Institute. M.C.F. and C.A.C. are CIFAR fellows in the Fungal Kingdom program. M.C., M.J.D., D.-H.Y., Y.C.C., and K.J.K.-C. were supported by the Division of Intramural Research (DIR), NIAID, NIH. L.V.D. and F.B. acknowledge financial support from the Newton Fund UK-China NSFC initiative (grants MR/P007597/1 and 8166113800).
R.A.F., L.V.D., and T.R.S. performed the genomic analyses. R.A.F., M.C., C.A.C., M.C.F., K.J.K.-C., H.M.E., L.V.D., and F.B. wrote the manuscript. M.C., M.J.D., and D.-H.Y. performed the phenotypic assays. K.J.K.-C., Y.C.C., W.M., and C.A.C. sequenced the isolates. T.S. and C.A.C. assembled the VGV genome. M.V., D.C., G.K., and M.C.F. conducted the field work.
FOOTNOTES
- Received 4 September 2019
- Accepted 9 October 2019
- Published 12 November 2019
This is a work of the U.S. Government and is not subject to copyright protection in the United States. Foreign copyrights may apply.
REFERENCES
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.↵
- 10.↵
- 11.↵
- 12.↵
- 13.↵
- 14.↵
- 15.↵
- 16.↵
- 17.↵
- 18.↵
- 19.↵
- 20.↵
- 21.↵
- 22.↵
- 23.↵
- 24.↵
- 25.↵
- 26.↵
- 27.↵
- 28.↵
- 29.↵
- 30.↵
- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.↵
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.↵
- 55.↵
- 56.↵
- 57.↵
- 58.↵
- 59.↵
- 60.↵
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.↵
- 78.↵
- 79.↵
- 80.↵
- 81.↵
- 82.↵