Bats, Primates, and the Evolutionary Origins and Diversification of Mammalian Gammaherpesviruses

ABSTRACT Gammaherpesviruses (γHVs) are generally considered host specific and to have codiverged with their hosts over millions of years. This tenet is challenged here by broad-scale phylogenetic analysis of two viral genes using the largest sample of mammalian γHVs to date, integrating for the first time bat γHV sequences available from public repositories and newly generated viral sequences from two vampire bat species (Desmodus rotundus and Diphylla ecaudata). Bat and primate viruses frequently represented deep branches within the supported phylogenies and clustered among viruses from distantly related mammalian taxa. Following evolutionary scenario testing, we determined the number of host-switching and cospeciation events. Cross-species transmissions have occurred much more frequently than previously estimated, and most of the transmissions were attributable to bats and primates. We conclude that the evolution of the Gammaherpesvirinae subfamily has been driven by both cross-species transmissions and subsequent cospeciation within specific viral lineages and that the bat and primate orders may have potentially acted as superspreaders to other mammalian taxa throughout evolutionary history.

subsequent viral adaptation and coevolution within the recipient mammalian hosts.

RESULTS
Detection of vampire bat ␥HVs by serology and PCR. Fourteen of 21 D. rotundus and 2 of 3 D. ecaudata bat individuals from Veracruz (Soledad Doblado locality) were positive for herpesviruses, determined by a panherpesvirus PCR targeting a 150to 200-bp region in the viral DNA polymerase gene (dpol) (Table 1; see also Table S1 in the supplemental material). In contrast, only two of six D. rotundus bats from Morelos and none of the bats from Estado de Mexico were positive. However, such low prevalence may be a result of the limited sampling size. Many of the viral sequences identified from vampire bats matched by BLASTN a previously described ␥HV from Pteropus giganteus (PgHV-5) (host, Indian flying fox; GenBank accession number AGW27609.1) (5) with a sequence identity of Ͼ90% (Table 1). Surprisingly, the viral sequences detected in the samples from the D. ecaudata individuals SD16 and SD12 matched those of a Macaca fuscata rhadinovirus isolate, 12E2 (host, Japanese macaque; GenBank accession number JN885137.1), and a Babyrousa babyrussa rhadinovirus 1 isolate (host, golden babirusa hog; GenBank accession number AY177146.2). BLASTX consistently revealed that many of the viral coding sequences were most similar to the PgHV-5 DNA polymerase protein (Pol). However, the viral sequences from D. rotundus individual MOR4 and D. ecaudata SD16 were most similar to the bovine herpesvirus 4 (BoHV-4) Pol (host, cattle; GenBank accession number AIA82756.1). The sequence from D. ecaudata SD12 was highly similar to that of the Myotis ricketti herpesvirus 2 Pol (host, Rickett's big-footed bat; GenBank accession number JN692430.1) and the sequence from D. rotundus individual SD3 to the phascolarctid herpesvirus 1 Pol (host, koala; GenBank accession number AEX15649). An additional PCR targeting 500 bp of the ␥HV glycoprotein B gene (gB) (2) yielded products for two (D. ecaudata SD12 and D. rotundus MOR4) of the 32 samples tested (Table 1). D. ecaudata SD12 matched the Macaca fuscata rhadinovirus isolate 12E2 with a 70% nucleotide sequence identity, while D. rotundus MOR4 yielded a moderate (66% nucleotide identity) match to Saimiriine herpesvirus 2 (host, common squirrel monkey; GenBank accession number AAA46164). BLASTX showed comparable results, supporting similarity to the primate ␥HV gB protein in both cases (Table 1). To determine whether the gB and dpol sequences in these two samples belonged to the same virus, we attempted to amplify a syntenic block containing gB and dpol by long-range PCR (LR-PCR) (2) but failed to obtain any products.
Given the distant genetic relatedness of some of the vampire bat viruses to BoHV-4, we used a BoHV-4-diagnostic enzymelinked immunosorbent assay (ELISA) kit to determine the antigenic similarities and seroprevalences of ␥HVs within the bat populations studied. Serology showed that the sera of four bat individuals from the Soledad Doblado locality (two of which, D. ecaudata SD12 and SD16, were also positive by the dpol PCR) cross-reacted with BoHV-4, suggesting an antigenic relatedness between the vampire bat ␥HVs and BoHV-4 (see Fig. S1 in the supplemental material). There was no cross-reactivity observed to the other vampire bat or to the equid serum controls tested (data not shown).
Confirmation of vampire bat ␥HV sequences by highthroughput sequencing. To provide additional evidence for the presence of ␥HVs in vampire bats, high-throughput sequencing (HTS) was performed on five selected samples that were previously determined to be PCR positive for ␥HVs. Approximately 400 million raw reads with a size distribution of 100 to 300 bp were obtained (48 to 92 million reads per library) and were sequentially filtered to obtain verifiable high-quality reads (see Table S2 in the supplemental material). For D. rotundus MOR4, 32 reads matched 15 different ␥HV genes, with 3 reads matching the gB gene and 1 read matching the dpol gene (see Table S3). For D. rotundus individual SD2, 5 reads were assigned to 4 different viral genes, while for D. rotundus SD3, 10 reads were assigned to 7 different genes, although no reads matched gB or dpol. In the case of D. ecaudata SD12, 31 reads were assigned to more than 15 different ␥HV genes, with 2 reads matching dpol and 1 read matching gB. Finally, for D. ecaudata SD16, 33 reads were assigned to more than 15 viral genes, with two of them matching dpol but none matching gB (see Table S3). In all cases, the viral sequences matched mostly bat, bovid, and primate ␥HVs. Given that the vast majority of sequences obtained were expected to match the host genome, contig assembly was not performed with the raw data. However, assembly from the filtered reads generated extended contigs for three samples (D. rotundus MOR4 and D. ecaudata SD12 and SD16), yielding sequences of up to 735 bp matching, again, bovid and primate ␥HVs (see Table S4). Such results supported the conclusion that vampire bats carry bovine and primate ␥HV-like viruses. Wide distribution of bat ␥HV viruses among mammalian ␥HV lineages. Ten main viral lineages have been described for the Gammaherpesvirinae subfamily: Lymphocryptovirus, Macavirus, Mus musculus rhadinovirus 1 (MmusRHV-1)-like, bat gammaherpesvirus 1 (BatGHV-1)-like, Percavirus, Rhadinovirus Tapirus terrestris gammaherpesvirus 1 (TterGHV-1)-like, Rhadinovirus Herpesvirus saimiri (HVS), Rhadinovirus Human herpesvirus 8 (HHV-8)-like, Rhadinovirus murid herpesvirus 4 (MuHV-4)like, and Rhadinovirus BoHV-4 (2,4). For the gB tree, all previously described lineages were detected, showing a comparable resolution to previously published topologies ( Fig. 1) (2, 4). However, in addition to the BatGHV-1-like group, bat ␥HVs were found to be widely distributed among 6 mammalian viral lineages  Table S5 in the supplemental material. The scale bar denotes amino acid substitutions per site. previously thought to be order specific (2,4). The most important differences observed between our gB tree and the previously published phylogenetic trees (2, 4) were as follows: (i) the identification of a new bat virus cluster (designated here "bat lymphotropic viruses") diverging from the basal lymphocryptoviruses; (ii) the presence of bat viral sequences forming a sister group to the bovine lymphotropic viruses within the Macavirus lineage; (iii) a bat-derived viral sequence basal to the MmusRHV-1-like viruses; (iv) the Percavirus lineage splitting into three subclusters isolated from mustelids/felids, bats, and equids; and (v) the grouping of vampire bat viral sequences between the Rhadinovirus HHV-8like and the Rhadinovirus BoHV-4-like groups. The viral sequences from bats often represented deep branches within the tree, such as for the bat lymphotropic virus group and the MmusRHV-1-like and BatGHV-1-like clusters (Fig. 1). We further compared the gB topology obtained to three different plausible evolutionary scenarios within a maximum-likelihood (ML) inference framework: (i) strict virus-host cospeciation, (ii) a strict bat origin for all ␥HVs, and (iii) monophyly for bat ␥HVs. Our results revealed that the alternative tree topologies were not supported by the data (SH test, P Յ 0.01; expected-likelihood weight [ELW] of best ML tree, posterior probability [PP] ϭ 1.0), indicating that the phylogenetic pattern we observed most likely reflects the evolutionary history of ␥HVs.
Given the short lengths of many of the Pol sequences and the few variable sites for phylogenetic inference by standard approaches, we used the Evolutionary Placement Algorithm (EPA), in which the short bat viral sequences were treated as short reads and assigned to nodes within a reference tree based on their likelihood weight ratios (LWR). If a given sequence has a single high value for the LWR (see Fig. S2 in the supplemental material, red circles), then its placement within a particular branch or node of the tree is supported. If a sequence has many possibilities for placement, then it can have many low LWR values. The overall confidence for sequence placement is expressed by the entropy value of each sequence, where low entropy indicates good confidence for placement. In the absence of a threshold, we considered a placement to be confident only for sequences with a single LWR value of Ն0.4 within a branch or node or with cumulative LWR values of Ն0.3 within a same cluster (Table 2). Although many of the bat sequences could not be placed on the tree with high confidence, an overall pattern similar to that of the gB tree was observed, including a basal position for some of the bat viral sequences and a wide distribution among different mammalian viral lineages (see Fig. S2). Sequences were assigned with confidence to the following viral clusters: Lymphocryptovirus, Macavirus, MmusRHV-1-like, BatGHV-1-like, Percavirus, Rhadinovirus HVS, and Rhadinovirus HHV-8-like groups ( Table 2). The resulting topology is publicly available as an interactive project on the Interactive Tree of Life (iToL) version 3 webserver (http://itol.embl.de/tree/21616595883251465841813).
Multiple bat and primate transmissions to other mammals. The virus and host phylogenies were compared to estimate the numbers of primary and secondary host switches (HS) and cospeciation (CS) events. The resulting tanglegram revealed multiple HS within the gB phylogeny, most of them attributable to the order Chiroptera (Fig. 2). Ten primary HS occurring at the order level were detected, 3 of which were attributed to bat ␥HVs (bat lymphotropic viruses to Elephas maximus gammaherpesvirus 1 [ELMA_GHV1], BatGHV-1-like to mustelid/felid Percavirus, and bat to equid Percavirus), and 2 were attributed to primates (lymphocryptovirus to bat lymphotropic viruses and Rhadinovirus HHV-8-like to the Rhadinovirus MuHV-4-like and BoHV-4-like groups). The remaining 5 HS were single events attributable to different taxonomic groups: ELMA_GHV1 to Macavirus (Afrotheria to Artiodactyla), Macavirus to the MmusRHV-1-like group (Artiodactyla to Rodentia), MmusRHV-1-like to Percavirus (Rodentia to multiple hosts), Percaviruses to the Rhadinovirus supercluster (multiple hosts to multiple hosts), and Tupaia belangeri gammaherpesvirus 1 (TUBEL_GHV1) to the HHV-8-like rhadinoviruses (Scadentia to Primates). Secondary HS events occurring at a species level revealed a total of 6 HS, 3 of which involved bat viruses; these included bovine lymphotropic herpesviruses and the bat viruses Scotophilus kuhlii ␥HV 11HZ76 (SCKUH_76), In agreement with our results, the optimal solution obtained by the cophylogeny analysis revealed that duplications and hostswitching events outnumber the cospeciation events, while this Rhadinovirus HVS a Names of viruses represented by abbreviations here are available in Table S5 in the supplemental material. b Sequence was not assigned to a particular branch, due to a low LWR, but had a cumulative LWR supporting its placement within the given viral cluster.
reconciliation was statistically supported (P Ͻ 0.05). Further supporting our previous observations, most duplication/HS events were detected within the chiropterans, with 34 duplications and 5 HS, followed by 15 duplications in primates, 10 duplications and 2 HS in artiodactyls, 4 duplications and 1 HS in carnivores, and finally, 3 duplications in both rodents and perissodactyls. Within the parsimony framework of minor costs, only 2 cospeciation events were detected (Fig. 2). Limited homology between viral and host proteins. It is possible that some of the accessory ␥HV open reading frames (ORFs) known to have cellular homologs would share a significant sequence identity to the host protein counterparts, if cospeciation had occurred (13). Thus, we examined the amino acid sequence similarity between the viral and host FLICE-inhibitory-like protein (FLIP), B-cell lymphoma-2 apoptosis (BCL-2) mediator protein, and OX-2 membrane glycoprotein. No significant identity to mammalian proteins was detected for the viral BCL-2 (vBCL-2) and vOX-2 proteins. However, vFLIP resembled mammalian CASP8 and FADD-like apoptosis regulator protein (cFLAR) death effector domain 1 and 2 (DED1/DED2; amino acids [aa] 1 to 172). Our results revealed that while most of the ␥HV FLIP proteins shared significant identity to cFLARs of diverse mammalian species (mostly rodents, bats, and primates), only vFLIP from MYVE_HV8 shared identity with the cFLAR protein of the Myotis genus, suggesting cospeciation (see Table S6 in the supplemental material). Nonetheless, such results should be interpreted with caution, as cFLAR is highly conserved among all mammals (Ͼ70% identity in amino acids) and only shares a weak similarity to vFLIP (Ͻ50% identity in amino acids).

DISCUSSION
The genetic and antigenic characterization of the vampire bat viral sequences revealed that these viruses are distantly related to other bat, bovid, and primate ␥HVs. However, the genetic distance among sequences suggests that the vampire bat viruses are divergent and may have become established in the vampire bat population long ago. We detected most of the ␥HV sequences in the spleen, which is consistent with both viral replication tissue tropism and latency occurring in germinal center B cells, as has been described for a number of other mammalian ␥HVs (14). Gilbert, unpublished data), there is no evidence for integration of ␥HVs into the vampire bat genome. Therefore, the novel virus sequences described in this work are unlikely to emanate from endogenized ␥HVs. The relatively small amount of HTS reads obtained suggests that the vampire bat viral sequences stem from latent viruses. However, consistent with the possibility of viral reactivation from splenic B cells, a higher concentration of reads was detected in two vampire bat samples that were also ␥HV positive by PCR and serology (D. ecaudata SD12 and SD16). Vampire bats are the only mammals that feed exclusively on the blood of other animals, and at least in the case of D. rotundus, they have a preference for domestic swine and bovids. Vampire bats have been selectively feeding on the blood of cattle since their introduction in the Americas, as they represent an easily accessible food source (15). Thus, some of the BoHV-4-related ␥HVs in vampire bats might have been introduced into these bat species as a conse-   (16). Therefore, feeding ecology may not be a critical factor in cross-species transmission. It has been recently suggested that the process of host switching is strongly influenced by the opportunity to encounter a new host presented to the parasite and by the compatibility of a parasite for colonizing a new host, given that the host selective pressure may not be strong enough to eliminate the parasite (17). Moreover, parasites can persist for extended periods in suboptimal hosts until reaching a new niche through a stepping-stone process, circulating in different hosts that can be divergent from each other but in relative physical proximity (17). Thus, we speculate that batspecific traits, such as flight, large population sizes, and a wide geographical range, might have been important in enabling or enhancing ␥HV spillover from bats to other taxa.
We used the largest collection of mammalian ␥HV sequences to date, representing all lineages of the Gammaherpesvirinae subfamily within 34 different taxa and including the ␥HV sequences available from 19 different bat species. Because of the lack of viral sequences for many mammalian orders and a sampling bias in primates, ungulates, and rodents, it is likely that the diversity and evolution of ␥HVs is still not fully represented with the current data. This could explain the long branches observed between many viral isolates within different viral phylogenetic clusters. However, only further sampling within a larger diversity of hosts will help determine the full scale of viral diversity, and this will likely reveal additional cross-species transmission events not detected here. Although the availability of different ␥HV groups and genes within GenBank is limited, we used a comprehensive data set to include the largest possible number of viruses within the widest range of hosts, by employing the best-represented viral ORFs (dpol and gB) that are least likely to have reached mutational saturation over the long evolutionary time scale examined (data not shown). Although phylogenetic analyses have been carried out in previous studies (2,4), reduced data sets of 12 to 45 viral sequences were used, with bat ␥HVs being underrepresented. Bat herpesvirus discovery and characterization has relied mostly on sequences obtained by PCR, which often represent short amplicons because of DNA quality and sample limitation issues (2,8,12). Using short sequences for phylogenetic analysis has caveats, but analyzing different viral genes independently, including sufficient full-length sequences, and using alternative phylogenetic approaches, such as the placement of shorter sequences on a reference tree, can increase confidence in phylogenetic inference.
Our results revealed that the overall phylogenetic pattern for ␥HVs observed from two independent viral genes is not congruent with a strict virus-host codivergence scenario. Our data strongly support cross-species transmissions within viral clusters that were thought to be order specific (Macavirus, MmusRHV-1like, and Percavirus). Moreover, several primate and bat viral lineages represent deep branches within the ␥HV phylogeny, such as the Lymphocryptovirus group that is basal to the bat lymphotropic viruses, and may thus represent the oldest viral lineages. Our results further suggest that primates and bats may carry the highest diversity of ␥HVs, while the close phylogenetic relationship between some of the bat and primate viral groups provides evidence for ancient spillover events, as has been observed for other herpesviruses (3,18). Furthermore, the similarity between viruses present in distantly related bat species suggests that some bat ␥HVs are likely to be very old and to have emerged shortly after the divergence of chiropterans at least 60 million years ago (MYA) (19,20). However, these viruses may have maintained the ability to jump between different mammalian species, as observed for the bat ␥HVs that are closely related to the BoHV-6 isolates within the Macavirus clade (12). An origin for ␥HV emergence was estimated at approximately 64 MYA by extrapolating the divergence dates of swine and ruminant hosts to the viruses within the Macavirus genus (2,4). Although this assumption may be valid for the viruses found within artiodactyl hosts, it is likely that ␥HVs in general are much older, possibly coinciding with the origin of placental mammals at least 84 MYA (21). Nonetheless, given the limited length of many of the ␥HV sequences, estimating a chronology for the diversification of the overall viral group and for more-shallow clusters would likely yield inaccurate dates (22).
The evolution of specific ␥HV lineages not being compatible with a strict virus-host cospeciation had been previously noted (2). The ratio of cospeciation versus duplication and hostswitching events, which we detected both manually and by cophylogeny analysis, suggests that although cospeciation might have occurred for particular lineages, it was often preceded by duplication and/or a host-switching event. Host switching was also detected within viral groups previously thought to be order specific. Together, these observations suggest that cross-species transmission followed by lineage-specific cospeciation have been the main evolutionary drivers within the Gammaherpesvirinae subfamily. Furthermore, alternative topology testing revealed that strict cospeciation is not supported by the data, congruent with a polyphyletic origin for most ␥HVs. A strict bat origin for ␥HVs was also not supported, suggesting that many species have played a role in the sequential spread of ␥HVs throughout evolutionary history (8,11,12). Hence, we propose that the Gammaherpesvirinae subfamily has evolved by many interspecies transfers, with specific host codivergence playing a role in ␥HV evolution only after adaptation to a new host. Our data indicate that chiropterans and primates may have played an important role in ␥HV transmission, as has been observed for other viral groups (23). However, future analyses using other viral genomic regions and a greater sampling of viral diversity should help to clarify the full extent and timing of viral cross-species transmission at different evolutionary timescales.

MATERIALS AND METHODS
Nucleic acid extraction and PCR. Bat sample collection was approved by the Internal Committee for Ethics and Animal Welfare (approval no. 2012-09-05) and was carried out in compliance with Mexican regulations (collection permit NUM/SGPA/DGVS/03173/14; export certificate SAG-ARPA 241111524599811488A467371). Twenty-nine D. rotundus and three D. ecaudata bats (n ϭ 32) were captured using mist nets in three different localities in Mexico (San Pablo, Tlaltizapán Morelos, Mexico; Soledad Doblado, Veracruz, Mexico; and La Cabecera, Estado de Mexico, Mexico) (see Table S1 in the supplemental material). Because sampling was dependent on bat seasonality, we were only able to obtain a limited number of individuals for each species and from each locality. Spleen Escalera-Zamudio et al.
tissue from 32 sacrificed animals was used for nucleic acid extraction (QIAamp MinElute virus spin kit; Qiagen) as previously described. A universal nested PCR for the detection of herpesviruses targeting a short fragment (150 to 200 bp) of the viral DNA polymerase gene (dpol) was used to screen each bat tissue sample (2,24). Further PCRs using virusspecific primers targeting a 500-bp region of the ␥HV glycoprotein B gene (gB) and to cover the genetic distance between gB and dpol using longdistance PCR (LD-PCR) were carried out as previously described (2). PCR products were visualized on 1.5% (wt/vol) agarose gels stained with Midori green (Nippon Genetics) and Sanger sequenced using BigDye version 3 chemistry on an ABI 3730xl DNA analyzer (Life Technologies) at LCG Genomics (Berlin, Germany). To determine sequence identity, sequences were analyzed by BLASTN and by BLASTX (https://blast.ncbi .nlm.nih.gov/Blast.cgi).
Serology. Fresh blood from each bat was obtained using the Microvette CB 300-l system (Sarstedt) and centrifuged for 5 min at 10,000 ϫ g at 20°C for serum separation. All sera were stored at Ϫ20°C for further use. Given the lack of standardized enzyme-linked immunosorbent assay (ELISA) kits for wildlife, a commercial kit available for bovine herpesvirus 4 (BoHV-4) diagnostics in cattle (BIO K 263; Bio-X Diagnostics, Belgium) was used. This assay uses whole virus for detection, and thus, cross-reactivity with related ␥HVs is likely. Additionally, it uses a protein G-horseradish peroxidase (HRP) conjugate that is able to detect immunoglobulins from most mammalian species, including bats. ELISA was performed following the manufacturers' instructions, using 5 serial dilutions of each bat's serum (1:10, 1:25, 1:50, 1:100, and 1:200) and including the diluted negative and positive cattle serum controls provided with the kit. An optimal serum dilution of 1:50 was standardized for the bat samples, while a cutoff value of 30% compared to the positive control (value ϭ ⌬OD sample ϫ 100/⌬OD positive-control serum, where OD is optical density) was used to determine positive sera, following the manufacturer's instructions. Further external controls were added to test for cross-reactivity against other mammalian alphaherpesviruses and ␥HVs. For this purpose, 3 equid sera determined to be positive for different ␥HVs by PCR and one serum positive for equine herpesvirus 1 by ELISA were tested under the conditions described above. Given the limited amount of bat samples available, a single test with duplicate reactions was carried out.
High-throughput DNA sequencing. DNA samples from five bat individuals positive for ␥HVs by PCR (D. rotundus MOR4, D. ecaudata SD16, D. ecaudata SD12, D. rotundus SD2, and D. rotundus SD3) were used to prepare double-indexed Illumina libraries (25). Prior viral enrichment steps were not possible given the field collection conditions. Individual genomic libraries were pooled for 2 ϫ 150-bp paired-end sequencing on the Illumina NextSeq 500 platform with the NextSeq version 2 kit on high-output mode at the Berlin Center for Genomics in Biodiversity Research (BeGenDiv). Sequence reads were quality filtered and adapters removed, followed by host DNA filtering and viral taxonomic assignment (26). High-quality reads were filtered to remove bacterial, human, and chiropteran sequences by mapping with SMALT version 0.7.6 (http:// sanger.ac.uk/resources/software/smalt) under a stringency of 50 to 70% against custom-built genomic libraries retrieved from the Reference genomic sequence (refseq_genomic) NCBI database (http://www.ncbi .nlm.nih.gov/refseq/about/) and against the D. rotundus genomic data (Zepeda-Mendoza et al., unpublished data). Viral assignment was performed using BLASTX version 2.2.29 (http://blast.ncbi.nlm.nih.gov/ Blast.cgi) against the GenBank nonredundant protein database and mapped with SMALT against a custom-built herpesviral database under a stringency of 60%. The ␥HV-matching reads were further selected by reciprocal BLASTX analysis using the following criteria: length of Ն100 bp, pairwise identity of Ͼ50%, E-value of Ͻ10 Ϫ6 , and independent hits to two different ␥HV proteins or at least two different regions of the same protein. Although this last step may significantly reduce the final number of reads, it is important in order to obtain verifiable as opposed to sporadic hits. It has been proposed that for metagenomic approaches using wildlife samples, only reads above Ն150 bp in coding sequences and yielding identity to different viral protein targets can be considered accurate for pathogen identification (27). From the filtered reads, contigs were assembled to obtain longer sequences using SAMtools version 1.3.1 (28) and SMALT to map against the consensus sequence at a stringency of 60%.
Sequence alignment and estimation of variable sites. For gB, the 92 available protein-coding sequences from viruses isolated from diverse mammalian species (including most of the bat and reference viruses) were retrieved from the GenBank nonredundant nucleotide database as of May 2016. After collapsing identical sequences and pruning to eliminate redundancy and short/low-quality sequences, a total number of 81 sequences were retained for the analysis (see Table S5 and Data Set S1 in the supplemental material). From the 81 sequences used, only 21 corresponded to full-length protein sequences, while the remaining 60 were partial sequences with an average length of 290 to 163 aa. Saturation within the nucleotide sequences was estimated to discard the possible effects of long-branch attraction (LBA) (data not shown). Translated amino acid sequences were aligned through sequential profile alignments for divergent sequences using MUSCLE, as implemented in SeaView (29,30). The alignment was manually edited to remove highly divergent regions, resulting in a final length of 564 aa, comparable to the data sets used in previous studies (636 aa) (4). For Pol, the same procedure as for gB was followed, resulting in an alignment of 97 OTUs with a length of 894 aa, comparable to data sets used in previous studies (909 aa) (see Table S5 and Data Set S2) (4). From the sequences characterized in this work, only two DR-␥HV sequences, with a length of Ͼ100 aa (D. rotundus MOR4 and D. ecaudata SD12), were included in both gB and Pol alignments for phylogenetic analysis. In order to assess the number of variable sites attributed to the bat sequences, the original gB and Pol data sets were modified to shortened versions trimmed to the average length of the bat sequences. For gB, an alignment of 189 aa (minimum length of 140 aa) was obtained, while for Pol, an alignment of 74 aa (minimum length of 55 aa) was retrieved, excluding outgroup sequences.
Phylogenetic analysis. The best-fit amino acid substitution model for gB was identified using jModelTest2 (31) (LG and empirical residue frequencies ϩF, with among-site rate heterogeneity modeled by the ⌫ distribution with four rate categories) (32,33), while phylogenetic analysis was performed under maximum likelihood (ML) using RAxML version 8.2.8 (34). Ten searches starting from stepwise-addition maximum-parsimony trees were run, while node robustness was assessed by the Shimodaira-Hasegawa [SH]-like (35) approximate-likelihood ratio test (aLRT). Given the short length of the bat viral sequences and the reduced number of variable sites for Pol, we used the Evolutionary Placement Algorithm (EPA) for the assignment of sequence fragments to a reference tree using the maximum-likelihood optimality criterion in RAxML (34,36) with the aforementioned model parameters (LGϩ⌫ 4 ϩF). All viral sequences of Ͻ250 amino acids were treated as short reads and assigned within a reference sequence alignment and ML tree based on their likelihood weight ratios (LWR). To obtain the reference tree, bat viral sequences were pruned from the original full-length alignment, leaving only the 60 longer reference viral sequences (4). The phylogenetic mapping of the short sequences was visualized using the Interactive Tree of Life (iToL) version 3 online tool (http://itol.embl.de) (37).
Alternative evolutionary scenario testing. Phylogenetic testing was performed for three different gB evolutionary scenarios: (i) strict hostvirus cospeciation, (ii) a strict bat origin for all ␥HVs (bat sequences are monophyletic at a basal position on the tree), and (iii) a single origin for bat ␥HVs (bat sequences are monophyletic within the BatGHV-1-like viral cluster). The different evolutionary scenarios were tested in RAxML using (i) the Shimodaira-Hasegawa (SH) test (35) for contrasting the best ML tree and alternative topologies and (ii) the expected-likelihood weight (ELW) procedure (38) to establish a confidence tree set using 100 bootstrap samples.

Comparison of the host-virus phylogeny.
For the host tree, the UCSC 100-way vertebrate genome phylogenetic tree based on the 100-way BLAST search to obtain orthologs of the opsin gene ONP5 (neuropsin) (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phyloP100way) (39) was manually edited to display an even representation of 39 species belonging to the euarchontoglires and laurasiatherian mammalian superorders. For the host-virus phylogeny comparison, the gB tree was contrasted with the host tree using the tanglegram algorithm for rooted phylogenies implemented in Dendroscope version 3 (40). As bat viruses within the gB tree represent 38% of all sequences used, to minimize the effects of sampling bias (e.g., a larger number of viral sequences available for particular taxonomic groups) in the interpretation of the results, only the number of viral lineages represented for each mammalian order, and not the number of viral sequences available for each taxonomic group, was taken into account.
Cophylogeny analysis. The numbers of primary and secondary hostswitching (HS) events versus cospeciation (CS) events within the gB tree were manually counted. Primary host-switching events were defined on an ordinal level as a viral lineage derived from a host (order) diverging from another viral lineage from another host (order). Secondary HS events were defined on a species level as a viral sublineage derived from a host (species) grouping basally or next to another viral sublineage from a different host (species). CS events were observed as order-or speciesspecific viral lineages that demonstrate a strict viral host codivergence. Under these criteria, only nodes with a support value of Ն80% were considered. Furthermore, Jane4 (41) was used to test for significant congruence between the virus and host trees, searching for evidence for coevolution. Jane4 is suitable for assessing unbalanced numbers of hosts and parasites and multihost parasitism. It uses a heuristic approach based on maximum parsimony to search for tree reconciliation solutions between associated phylogenies by minimizing the overall costs given by individual evolutionary events between host and parasite, as follows: (i) cospeciation, (ii) duplication (a parasite speciates but remains on the same host), (iii) host switching (a parasite speciates and shifts onto a different host), (iv) loss (a host speciates but the parasite remains only on one of the new hosts), and (v) failure to diverge (a host speciates and the parasite remains on both old and new host) (41). The cost regimes tested were as follows: default cost settings within the range of [0, 3]. Generation times of 10, 50, and 100 were run with population sizes set to 10, 30, and 50 with 100 replicates. The optimal solutions were examined, and the probability of each cophylogeny having arisen by chance was calculated. The lower-cost optimal solution was compared within the corresponding simulated empirical distribution.
Accession number(s). GenBank accession numbers for the viral sequences used in the phylogenetic analysis are listed in Table S5 in the supplemental material. Vampire bat viral sequences were deposited in GenBank under the following accession numbers: Desmodus rotundus MOR4 Pol (KU942401), Diphylla ecaudata SD16 Pol (KU942402), Diphylla ecaudata SD12 Pol (KU942403), Desmodus rotundus MOR4 gB (KU942404), and Diphylla ecaudata SD12 gB (KU942405). Given the short length (Յ200 bp) of some of the sequences determined in this study, not all vampire bat viral sequences could be deposited in GenBank, but these are available upon request. HTS data are available from the Dryad Digital Repository (http://dx.doi.org/10.5061/dryad.sg0k6). The HTS reads were deposited on the NCBI Sequence Read Archive (SRA) under BioProject number PRJNA348455.