How, When, and Where Relic DNA Affects Microbial Diversity

ABSTRACT Extracellular or “relic” DNA is one of the largest pools of nucleic acids in the biosphere. Relic DNA can influence a number of important ecological and evolutionary processes, but it may also affect estimates of microbial abundance and diversity, which has implications for understanding environmental, engineered, and host-associated ecosystems. We developed models capturing the fundamental processes that regulate the size and composition of the relic DNA pools to identify scenarios leading to biased estimates of biodiversity. Our models predict that bias increases with relic DNA pool size, but only when the species abundance distributions (SADs) of relic and intact DNA are distinct from one another. We evaluated our model predictions by quantifying relic DNA and assessing its contribution to bacterial diversity using 16S rRNA gene sequences collected from different ecosystem types, including soil, sediment, water, and the mammalian gut. On average, relic DNA made up 33% of the total bacterial DNA pool but exceeded 80% in some samples. Despite its abundance, relic DNA had a minimal effect on estimates of taxonomic and phylogenetic diversity, even in ecosystems where processes such as the physical protection of relic DNA are common and predicted by our models to generate bias. Our findings are consistent with the expectation that relic DNA from different taxa degrades at a constant and equal rate, suggesting that it may not fundamentally alter estimates of microbial diversity.

W hen microorganisms die, their DNA leaks into the surrounding environment. The fate of this relic DNA has important implications for evolutionary and ecological processes. For example, relic DNA can be taken up and incorporated into the genomes of some microorganisms via transformation, thereby serving as a reservoir of genetic information that can confer new traits and fitness benefits to distantly related taxa (1). In addition, relic DNA is a high-quality resource containing carbon, nitrogen, and phosphorus that is consumed by a diverse array of bacteria with consequences for microbial community structure and ecosystem processes (2,3).
Relic DNA also has the potential to alter cultivation-independent estimates of diversity, which are widely used for addressing questions concerning the assembly, biogeography, and functioning of microbial communities. Microbial DNA extracted from environmental and host-associated samples is not derived solely from metabolically active organisms (4). A large portion of the individuals in a microbial community is dormant or dead (5,6). Although nucleic acids can be temporarily retained in nonviable cells, DNA is ultimately released into the environment when individuals die from autolysis, senescence, viral infection, or predation (7). Together, these sources of mortality can create large pools of relic DNA (8,9). For example, there is an estimated 0.45 Pg of relic DNA in global ocean sediments, which is 70-fold greater than the amount of DNA contained in intact cells from the same environments (2). If included in estimates of biodiversity, relic DNA could distort our understanding of the ecological and evolutionary processes that regulate the distribution, abundance, and function of microbial taxa. The processes that regulate relic DNA dynamics vary among ecosystems. In wellmixed microbial habitats, the residence time of relic DNA can be short owing to high rates of hydrolysis, oxidation, and UV-mediated damage (9). For example, in surface waters of freshwater and marine environments, the extracellular DNA pool turns over in less than 1 day (10,11) while plasmid DNA begins to degrade in just minutes (12). As a result, the size distribution of relic DNA is skewed toward small fragments ranging between 100 and 500 bp (13,14). In more-structured microbial habitats, other factors contribute to the size and turnover of the relic DNA pool. For example, binding with inorganic and organic substances in soils and sediments reduces the rate of relic DNA degradation (8). Likewise, biofilms, aggregates, and outer membrane vesicles can protect relic DNA from hydrolytic enzymes (7,15,16). Similarly, the heterogeneous distribution of microorganisms in structured habitats can create "hot spots" of metabolic activity (17) that increase the microscale turnover of relic DNA. Collectively, these processes may help explain the abundance and diversity of relic DNA sequences in a range of ecosystems, including ocean sediments (2), the built environment (18), and the mammalian gastrointestinal tract (19).
To date, the documented effects of relic DNA on estimates of diversity are idiosyncratic. Even in samples with large amounts of relic DNA, bias can be nonexistent or substantial and can lead to the overestimation or underestimation of diversity (20,21). Such observations reflect the need for a more mechanistic understanding of relic DNA dynamics so that microbial communities can be consistently and accurately characterized across ecosystems. Here, we develop a theoretical framework that considers the processes regulating the size and turnover of the relic DNA pool. We emphasize that the direction and magnitude of bias are influenced by sampling from a joint species abundance distribution (SAD) consisting of sequences from the relic and intact DNA pools. We evaluated our models by quantifying the contribution of relic DNA to the abundance and diversity of bacterial communities in a set of ecosystem types that have contrasting relic DNA dynamics.

RESULTS
Theoretical framework for relic DNA dynamics and diversity. We developed a set of interrelated models to identify conditions that we hypothesized would affect relic DNA dynamics and lead to biased estimates of microbial diversity. We began with a conceptual model representing the dynamics of intact and relic DNA (Fig. 1a). The size and composition of the intact DNA pool reflect sequences contained in viable bacteria belonging to different species, which are influenced by births, deaths, and immigration. For the relic DNA pool, the size and composition are determined by the input of sequences associated with the death of bacteria from the intact DNA pool and losses associated with the degradation of relic DNA sequences. With this framework established, we then developed a sampling model to explore how diversity estimates are affected when draws come from a joint distribution of sequences derived from the intact and relic pools. Finally, we integrated our sampling-based model with a processbased model to describe relic DNA dynamics and their effects on diversity estimation.
(i) Sample-based model. Simulations from our sampling-based model indicate that a sufficiently large relic DNA pool is required but not sufficient to create bias in estimates of species richness, the number of taxa in a sample (Fig. 1b). Critically, the simulations reveal that there must also be differences between the species abundance distributions (SADs) of the intact and relic DNA pools for bias to arise. When the relic DNA pool has a more even SAD, sampling from the total pool (intact ϩ relic) leads to overestimation of true richness (Fig. 1b). Conversely, when the relic DNA pool has a less even SAD, sampling from the total pool leads to underestimation of true richness (Fig. 1b). The amount of relic DNA in a microbial environment is determined by inputs associated with the mortality of viable individuals with intact DNA and by losses associated with the degradation of relic DNA. If the diversity of sequences contained in the relic DNA pool is sufficiently different from that in the intact DNA pool, then relic DNA may bias estimates of microbial biodiversity (as indicated by different colored boxes) when sampling from the total (intact ϩ relic) DNA pool. (b) We developed a sampling-based simulation model to explore the effects of mixing intact and relic DNA on estimates of diversity. We populated intact and relic communities with individuals from a lognormal species abundance distribution (SAD). We altered the diversity (i.e., evenness, E) of the relic community by changing the scale parameter of the lognormal distribution describing the SAD. We then sampled and mixed the intact and relic communities so that the relic contribution to total community ranged from 0.01 to 0.96. (c) To gain mechanistic insight into how bias arises, we created a stochastic process-based model that captures features that influence relic DNA dynamics, including immigration, birth, death, and degradation (a). We simulated a range of degradation rates to achieve relic DNA pool sizes with proportions ranging between 0.05 and 0.95. To explore how degradation alters the SAD of the relic community, we explored three scenarios. First, we simulated a neutral scenario where relic DNA sequences produced by different species degrade at the same rate. Second, we simulated conditions under which physical, chemical, or biological processes reduce the degradation rate of relic DNA belonging to some species via protection. Third, we simulated "hot spots" where more abundant relic DNA sequences experience higher rates of relic DNA degradation, a condition that may arise in structured habitats where there are patchy distributions of individuals and their metabolic products (i.e., enzymes). We ran simulations for 10,000 time steps and then sampled the intact and relic communities. To quantify bias in diversity (b and c), we calculated "richness ratios" which reflect the number of species in the total DNA pool (intact ϩ relic) divided by the number of species in the intact DNA pool in a simulation. When values for richness ratios equal 1, relic DNA has no effect on estimates of diversity; when richness ratios are Ͼ1, relic DNA overestimates true diversity; and when richness ratios are Ͻ1, relic DNA underestimates true diversity.
Relic DNA and Microbial Diversity ® (ii) Process-based model. Simulations from our process-based model identify conditions that lead to relic DNA bias. We determined that relic DNA pool size reaches a stable equilibrium when R ϭ m · I/d, where R is the size (number of sequences) of the relic DNA pool, m is the per capita mortality rate of bacteria in the intact DNA pool, I is the size (number of sequences) of the intact DNA pool, and d is the per capita degradation rate of the relic DNA sequences. From this, it can be shown that R increases with the residence time ( ϭ R/d) of the relic DNA pool (see Fig. S1 in the supplemental material). When we considered the neutral scenario where the rates at which relic DNA sequences degrade are equivalent among species, we found that the shapes of relic SAD and intact SAD are nearly identical. No bias arises under these conditions, regardless of the relic DNA pool size (Fig. 1c). Next, we simulated physical protection by reducing the degradation rates of relic DNA sequences belonging to randomly selected species (see Materials and Methods). This created a less even relic SAD, resulting in the underestimation of richness in the total DNA pool (Fig. 1c). Finally, we considered the scenario where there are "hot spots" of relic DNA degradation. When simulating accelerated degradation rates for abundant relic DNA sequences, there was a more even relic SAD that resulted in the overestimation of richness in the total DNA pool (Fig. 1c).
Empirical assessment of relic DNA abundance and effects on diversity estimation. Guided by the insight from our model, we obtained samples from replicate sites in four different ecosystem types (soil, sediment, water, and the mammalian gut). Each sample was treated with DNase I in a paired fashion to quantify the abundance of relic DNA and its effect on taxonomic and phylogenetic measures of diversity. We tested the hypothesis that the importance of relic DNA would vary among ecosystems owing to features that might retard or enhance the species-specific degradation of relic DNA sequences. We also tested the prediction that bias should increase with increasing relic DNA pool size.
(i) Variation in relic DNA pool size. Relic DNA accounted for a substantial but variable portion of the total bacterial DNA pool. The proportion of relic DNA estimated using quantitative PCR (qPCR) was normally distributed (mean Ϯ standard deviation [SD], 0.33 Ϯ 0.218) and ranged from 0 to 0.83 across 34 samples obtained from different ecosystem types (Fig. 2). Even though host and environmental features associated with different habitats are thought to influence relic DNA dynamics (9,20), there was only a marginally significant effect of ecosystem type on the proportion of relic DNA (one-way analysis of variance [ANOVA], F 3, 30 ϭ 2.43, P ϭ 0.084), which likely reflected the contrast between soil and water samples (Tukey's honestly significant difference [HSD], adjusted P ϭ 0.058). Nevertheless, the large average pool size and range in relic We quantified the amount of intact DNA in a sample after removing relic DNA with a DNase treatment. We then estimated the proportion of relic DNA as 1 Ϫ (intact DNA/total DNA), where the total DNA concentration was quantified without DNase treatment. Relic DNA constituted an appreciable portion of the total DNA pool but was not affected by the ecosystem type from which the sample was collected (gut, soil, sediment, and water). Gray symbols are the observed data, and black symbols represent means Ϯ 95% confidence intervals.
DNA across samples provided us with the opportunity to explore features of our model, specifically the magnitude and direction of bias that relic DNA should have on estimates of microbial diversity.
(ii) Magnitude and direction of bias on microbial diversity. Despite accounting for a substantial portion of the total DNA (Fig. 2), relic DNA had a minimal effect on estimates of richness, evenness, phylogenetic diversity (PD), or the species abundance distribution (SAD) based on 16S rRNA gene sequencing (Table S1). We standardized the effect size of these alpha-diversity metrics on a per-sample basis as a ratio (total/intact), where values of Ͼ1 represent overestimation bias and values of Ͻ1 represent underestimation bias. Relic DNA had no effect on the diversity ratios based on the observation that the 95% confidence intervals overlapped with 1.0. Furthermore, the 95% confidence intervals of the diversity ratios overlapped across ecosystem types, indicating that the contribution of relic DNA to all measures of diversity was generally low irrespective of the microbial habitat (Fig. 3). We used Kolmogorov-Smirnov tests to determine whether relic DNA altered the SAD. For each paired sample, we tested if the FIG 3 Effect of relic DNA on within-sample (alpha) bacterial diversity in different ecosystem types. We tested for the effects of bias caused by relic DNA by calculating diversity ratios for richness (a), evenness (b), and phylogenetic diversity (c). The ratios reflect the diversity of the total DNA pool (intact ϩ relic) divided by the diversity of the intact DNA pool. Relic DNA did not bias any measures of diversity in any of the ecosystem types. Richness was calculated as the number of operational taxonomic units (97% sequence similarity of the 16S rRNA gene), evenness was calculated using Simpson's evenness index, and phylogenetic diversity was calculated using Faith's D index. Gray symbols are the observed data, and black symbols represent means Ϯ 95% confidence intervals. distribution of abundances for taxa in the total DNA pool was different from distribution of abundances for taxa in the intact DNA pool. From these analyses, we conclude that the abundance distributions of taxa are similar and not affected by ecosystem type (P Ն 0.3), which suggests that patterns of commonness and rarity are not influenced by relic DNA. Finally, using indicator variable multiple regression, we evaluated whether the magnitude of bias increased with relic DNA pool size, a predicted outcome from some of our sampling-and process-based simulations ( Fig. 1b and c). For all diversity ratios (richness, evenness, and PD), the slopes were not different from zero (P Ն 0.65), the intercepts were not different from 1.0 (P Ն 0.65), and the estimates were not affected by ecosystem type (P Ն 0.26) (Fig. 4). Moreover, only a very small amount of variation (R 2 Յ 0.01) was explained by relic DNA and ecosystem type in the regression models.
(iii) Contribution of relic DNA to community composition. Based on 16S rRNA gene sequencing of the intact and total DNA pools, relic DNA had a minimal effect on the compositional (beta) diversity of bacterial communities within and across ecosystem types. First, the intact and total DNA pools were significantly and very highly correlated with one another when we performed a taxonomically based (Bray-Curtis distance) Mantel test (P ϭ 0.001, r ϭ 0.959) and a phylogenetically based (UniFrac distance) Mantel test (P ϭ 0.001, r ϭ 0.996). Second, we tested for the effect of relic DNA on bacterial community composition for the intact and total DNA pools using a modified beta-dispersion metric (22). Specifically, we calculated centroid distance ratios, which directly compared the dispersion between pairs of DNase-treated and DNase-control subsamples (see Fig. S6). With this approach, if the centroid distance ratio was Ͼ1, we concluded that relic DNA inflated beta-diversity; if the distance ratio was Ͻ1, we concluded that relic DNA homogenized beta-diversity. We found that relic DNA had no effect on the centroid distance ratios based on the observation that the 95% confidence intervals overlapped with 1.0 ( Fig. 5 and S6). Furthermore, the 95% confidence intervals for the centroid distance ratios overlapped across ecosystem types, indicating that the contribution of relic DNA to taxonomic and phylogenetic betadiversity was low irrespective of the microbial habitat. Last, we used indicator variable multiple regression to test the prediction that bias in estimates of community composition would increase with increasing relic DNA pool size (see Fig. S1). The proportion of relic DNA in a sample had no statistical effect on the slopes for centroid distance ratios regardless of whether they were calculated using taxonomic (P ϭ 0.63) or phylogenetic (P ϭ 0.59) distance matrices. Moreover, the intercepts for these regression relationships were not different from 1.0 (P Ն 0.74), and the amount of explained variation was very small (R 2 Յ 0.09), suggesting that the overall effect of relic DNA on beta-diversity was negligible (Fig. 6). The parameter estimates from the regression analyses were not affected by ecosystem type (P Ն 0.29).

DISCUSSION
Rarely, if ever, are biological communities completely censused. As a result, estimates of diversity are often based on incomplete sampling, which introduces uncertainty and potential bias (23). For example, microbiologists using cultivation-independent approaches commonly estimate the diversity of a community using hundreds of thousands of sequences (10 5 ) from samples that contain in excess of a billion individuals (10 9 ). Diversity estimation of microbial communities is further complicated (a and b) We tested for the effects of bias caused by relic DNA by calculating a beta-diversity ratio based on centroid distances. Centroid distances were estimated after performing principal-coordinate analyses (PCoAs) using taxonomic (a) and phylogenetic (b) distance metrics (Bray-Curtis and UniFrac, respectively). The centroid distance ratio was calculated on each sample within an ecosystem type and reflects the composition of the total DNA pool (intact ϩ relic) relative to the intact DNA pool. Relic DNA had no effect on beta-diversity for any of the ecosystem types sampled. Gray symbols are the observed data, and black symbols represent means Ϯ 95% confidence intervals.
Relic DNA and Microbial Diversity ® by the accumulation and persistence of relic DNA, a pool of sequences that may not accurately reflect the composition of viable microorganisms. We developed a set of models that capture the fundamental processes regulating relic DNA dynamics along with the effects of sampling from a joint species abundance distribution (SAD) that contains sequences from both the relic and intact DNA pools. Our models reveal that two criteria are required in order for relic DNA bias to emerge. First, the relic DNA pool must be large enough so that it diminishes the probability of sampling sequences from the intact DNA pool. Second, the relic SAD must be distinct from the intact SAD. If these conditions are not satisfied, relic DNA should have a minimal effect on estimates of biodiversity. We then explored the model's expectations by analyzing the intact and total (relic ϩ intact) DNA in a range of ecosystems where relic DNA dynamics are known to vary. Despite its making up a large portion of the total DNA, we found that relic DNA had minimal effect on estimates of taxonomic and phylogenetic diversity, consistent with model scenarios where relic DNA degradation rates are equivalent among species.
Despite sampling a range of habitats with attributes that affect relic DNA turnover, our data suggest that estimates of taxonomic and phylogenetic diversity remained relatively unbiased by relic DNA (Fig. 2 and 3) even when it accounted for Ͼ80% of the total DNA ( Fig. 4 and 6). Similar findings have been reported elsewhere. From a survey of North American soils, no bias was detected for taxonomic richness in 30% of bacterial samples (7/31) and 55% of fungal samples (17/31), even when relic DNA made up a substantial portion of the total DNA pool (21). Likewise, more than 90% of the bacterial DNA recovered from porcine lung tissues was sensitive to DNase, yet diversity values in the enzymatically treated (i.e., DNase) and untreated samples were the same, indicating that large relic DNA pools do not necessarily obscure estimates of microbial diversity  Fig. S6 for a description of how centroid distance ratios were calculated. When included in the regression model as an indicator variable, ecosystem type had no effect on the slopes or intercepts of the regressions (P Ն 0.29).

Lennon et al.
® (20). Such findings are supported by our modeling results: relic DNA has no effect on estimates of microbial diversity when the SADs of the relic and intact DNA pools are equivalent (Fig. 1b). When this assumption is met, the abundance distributions of relic and intact DNA pools should be identical. Therefore, sampling from the relic DNA pool should not bias estimates of diversity in the intact DNA pool, even if relic DNA makes up a large portion of the total DNA pool (Fig. 1b).
Since bias has been documented in some studies, the important question becomes under what conditions are we likely to see differences in the SADs of the relic and intact DNA pools? One leading hypothesis is that relic DNA can be protected from degradation inside aggregates, biofilms, or other complexes that reduce the contact rate of nucleic acids and extracellular enzymes. Importantly, our simulations reveal that protection can increase bias only if it alters relic DNA degradation in a species-specific manner. While most studies have emphasized the potential for protection to overestimate diversity, our model predicts the opposite (Fig. 1b). Protection creates a less even relic SAD that results in the increased dominance of relic sequences associated with a few species. Assuming a large relic DNA pool, preferential sampling of these protected sequences will lead to the underestimation of richness, which has been observed but only in a few samples (21).
In contrast, we identified at least one scenario where relic DNA can create overestimation bias. Microscale heterogeneity in structured habitats can lead to nonuniform distributions of microorganisms and their metabolic activities. Our simulations indicate that the resulting "hot spots" can degrade relic DNA sequences in a density-dependent manner, resulting in a more even relic SAD. Although not statistically significant, sediment samples trended toward positive bias for richness and PD (Fig. 3), which could potentially reflect "hot spots" of relic DNA degradation. There are other situations where inflation bias could potentially arise, including when species in the regional pool disperse into habitats for which they are poorly adapted. For example, our simulations indicate that when poorly adapted immigrants immediately die, they can enrich the relic DNA pool with allochthonous sequences that are dissimilar to sequences found in the local community and thereby influence estimates of diversity (see Fig. S7 in the supplemental material). A similar effect could arise when dead bacteria are transported across ecosystem boundaries, a phenomenon that occurs, for example, when marine snow is exported from surface waters to marine sediments (24). More generally, bias may arise under nonequilibrium conditions where community turnover of the intact DNA pool is faster than turnover of the relic DNA pool. Any abiotic or biotic perturbation that removes a substantial amount of living biomass could result in transient divergence in the composition of sequences in the intact and relic DNA pools. For example, virulent phage can rapidly reduce the abundance of bacterial prey and in the process release a large pulse of dissolved DNA into the environment (25). In this way, relic DNA might distort estimates of temporal stability in microbial communities. Last, although not fully explored here, shifts in rank abundance of taxa that are independent of parameters describing the SAD could also result in biased estimates of microbial diversity. In sum, there are ways to deviate from neutral expectations about the degradation of relic DNA, which should lead to biased estimates of microbial diversity. And, of course, many of the processes described above can operate simultaneously, which may amplify or dampen the effects of relic DNA bias. Data from our study suggest that such conditions may not be prevalent, but additional sampling across an even broader range of ecosystems is needed to determine whether or not relic DNA bias is a major concern for estimating biodiversity.
Our understanding of the microbial biosphere has been transformed by the development and application of molecular biology-based cultivation-independent techniques. The ability to rapidly obtain millions of gene sequences and transcripts from a range of environments has yielded valuable insight into the processes that regulate community assembly and function (26,27) and has also paved the way for the discovery of new metabolisms (28), tests for unifying patterns of biodiversity (29), and an updated tree of life (30). There are limitations, however, associated with culture-independent techniques, which include inefficient nucleic acid extraction methods (31) and "universal" primers that overrepresent some taxonomic groups while overlooking others (32). Sequencing of relic DNA is another important concern, which can potentially lead to the overestimation or underestimation of microbial diversity. However, this bias requires the decoupling of processes that regulate the compositional turnover of the relic and intact DNA pools. While some recent evidence suggests that this can arise (21), our findings suggest that at least in some ecosystems, relic DNA appears to contribute minimally to the characterization of microbial community structure.

MATERIALS AND METHODS
Theoretical framework for relic DNA dynamics and diversity. (i) Sampling model. We used a set of sampling-based simulations to explore how the mixing of intact and relic DNA influences estimates of diversity. For each simulation, we defined a regional species pool consisting of 10,000 taxa with a lognormal abundance distribution (33). The intact community consisted of 1,000,000 individuals randomly sampled from the regional species pool. To evaluate the hypothesis that the species abundance distributions (SADs) must be different between the intact and relic DNA in order for bias to arise, we altered the evenness of the regional pool from which the relic community was sampled by changing the scale parameter of the lognormal distribution. To decrease the evenness of the relic DNA pool, we increased the scale parameter from 0.98 to 1.8. To increase the evenness of the relic DNA pool, we decreased the parameter from 0.98 to 0.25. After mixing the intact and relic communities at the defined proportions ranging from 0.01 to 0.96, we rarified the total community to 10,000 observations before estimating richness and compositional dissimilarity of DNA pools in a simulated sample. We represent the effects of relic DNA bias for a given sample using diversity ratios (total DNA/intact DNA) for richness and Bray-Curtis distances. For example, if relic DNA results in no bias, then the diversity ratio for a sample will equal 1.0. However, if relic DNA inflates diversity, then the diversity ratio will be greater than 1. If diversity ratios are less than 1, then relic DNA should underestimate the true diversity of a sample.
(ii) Process-based model. We developed a set of stochastic simulations to explore how the processes regulating relic DNA dynamics can give rise to bias in the estimation of microbial diversity. For each simulation, we defined a regional species pool consisting of 4,000 taxa with a lognormal abundance distribution (33). We initiated an intact community by randomly sampling 20,000 individuals from the regional species pool. Subsequent dynamics were controlled by the processes depicted in our conceptual model (Fig. 1a). Specifically, we simulated immigration by sampling j individuals from the regional species pool and adding them to the intact community. We simulated birth by randomly selecting I ϫ r individuals from the intact community and adding them to the intact community again, where I is the size of the intact community and r is the per capita birthrate. We simulated death by selecting (I ϫ m) ϩ j individuals from the intact community and moving them to the relic community (R), where m is the per capita mortality rate. Finally, we simulated degradation by selecting (R ϫ d) ϩ j individuals from the relic community and removing them, where d is the per capita decay rate of relic DNA.
With the process-based model in place, we investigated three scenarios that we hypothesized could affect relic DNA dynamics and its effects on diversity estimates. First, we considered a neutral scenario. At each time step, we simulated degradation by randomly selecting relic DNA sequences and removing them. Second, we considered the scenario where some relic DNA sequences have lower rates of degradation owing to processes such as physical protection. Each species was randomly assigned a degradation susceptibility probability from a beta distribution. The beta distribution is a continuous probability distribution bound by 0 and 1 with two shape parameters. We selected shape parameters (␣ ϭ 0.7, ␤ ϭ 0.7) that resulted in a U-shaped distribution of susceptibilities such that values close to 0 had low probabilities of degradation and values close to 1 had high probabilities of degradation. Susceptibility probabilities were randomly assigned to species for each model iteration and were used to weight the probability of selecting relic DNA sequences for degradation at each time step. Last, we considered the scenario where there are "hot spots" of relic DNA degradation, reflecting the clumped distribution of microbial taxa and their associated metabolic activities. At each time step, we calculated density-dependent selection probabilities by weighting the probability of selecting an individual from the relic community by its species abundance. Therefore, relic DNA sequences belonging to more abundant species would have a higher probability of degrading. For each of the three scenarios, we ran each simulation for 10,000 time steps. We used a constant immigration rate (j ϭ 2,000) to maintain intact community diversity and used equal and constant birth and mortality rates (b and m ϭ 0.1) to maintain intact community size. Likewise, we accounted for immigration during death and degradation to prevent uncontrolled growth. For each set of simulations, we used a range of decay rates (d), which yielded relic DNA proportions between 0.05 and 0.95 according to the following equation: d ϭ m/p Ϫ m, where p is the target proportion and m is the fixed mortality rate (see the supplemental material for derivation). At the end of each simulation, we determined the effect of relic DNA on diversity estimates by comparing the total community (intact ϩ relic) with the intact community. All simulations and estimations were performed in the R statistical environment (v 3.3.2) (34) using the "vegan" package as well as custom functions.
Empirical assessment of relic DNA abundance and effects on diversity estimation. (i) Sample collection and DNA pools. We collected samples from a diverse set of environmental and hostassociated ecosystems. First, we sampled sediments and surface water from lakes near the Michigan State University W. K. Kellogg Biological Station (KBS) in Hickory Corners, MI. Soils were sampled from the main sites and surrounding areas at the KBS Long-Term Ecological Research site (35). We also collected fresh feces as representative gut samples from cows, dogs, horses, rabbits, and humans. In each of the ecosystem types, we obtained samples from 6 to 8 independent sites. In the laboratory, we applied a DNase treatment to half of a well-homogenized sample to remove relic DNA (Fig. S2). This enzymatic removal procedure is based on methods that have previously been used to quantify relic DNA in marine sediments (36,37), host tissue (36,37), and drinking-water biofilms (36,37). DNase is effective at removing not only extracellular DNA but also DNA contained inside dead cells while not compromising the integrity of living cells (36,37). After assessing the optimal DNase concentration (Fig. S3), we determined that the DNase method was extremely efficient (98%) at removing a spike of 16S rRNA into a soil sample (Fig. S4). Thus, we refer to the DNA remaining following enzymatic treatment as "intact DNA" and assume that it was derived from viable cells. "Total DNA" refers to the DNA in the other half of the sample that was not treated with DNase (negative control), which reflects the sum of intact DNA and relic DNA.
For aquatic samples, we filtered 250 ml lake water through a 47-mm 0.2-m Supor polyethersulfone (PES) membrane filter using a 10-mm Hg vacuum. We cut the filter in half and randomly assigned one half of the filter to a DNase treatment and used the other half as the control. We then rolled up each piece of filter using sterile forceps and inserted it into a 2-ml centrifuge tube containing 1.5 ml of pH 7.3 phosphate-buffered saline (PBS). After vortexing at room temperature for 5 min, we removed the filter from the tube and centrifuged for 5 min at 10,000 ϫ g. We then discarded the supernatant and resuspended the pellet in 375 l of PBS, which was transferred to a 2-ml centrifuge tube. For nonaqueous materials (soil, sediments, and feces), we directly added 0.25 g of well-homogenized sample to a 2-ml centrifuge tube. At this stage in the procedure, aquatic and nonaquatic samples were identically processed. For each sample in its 2-ml centrifuge tube, we added DNase digestion buffer, which consisted of 382.5 l of Nanopure water, 5 l of 1 M MgCl 2 , 2.5 l of bovine serum albumin (10 mg/ml), and 50 l of 0.5 M Tris-HCl (pH 7.5). For subsamples treated with DNase, we added 40 l of a 10-U/l stock of DNase I (Roche catalog no. 04536282001) and 20 l of Nanopure water, resulting in a 500-l final working volume with an 0.8-U/l DNase concentration. For untreated subsamples, we substituted 40 l of Nanopure water for DNase solution. For each sample, we measured pH using a micro-pH probe (Orion 9110DJWP; Thermo Scientific) and adjusted the pH to 7.3 to 7.7, which is in the optimum range for DNase. We then incubated the samples horizontally on a shaker table at 37°C for 60 min. Following this, we transferred each sample to a 15-ml Falcon tube containing 1 ml of 1ϫ hexadecyltrimethylammonium bromide (CTAB) buffer, which consisted of one part solution 1 (1 g CTAB plus 0.58 g NaCl in 10 ml of Nanopure water) and one part solution 2 (0.58 g NaCl plus 0.82 K 2 HPO 4 plus 0.04 KH 2 PO 4 in 10 ml of Nanopure water). We then added 25 l of 0.5 M EDTA to each tube and vortexed. We stopped the DNase reaction by incubating the tubes at 75°C for 10 min. We began DNA extraction by adding 1 ml of phenol-chloroform-isoamyl alcohol (25:24:1) to the 2-ml tube, which was then vortexed for 10 min. This was followed by centrifugation at 7,100 ϫ g for 10 min. We transferred the top aqueous layer to a 15-ml Falcon tube, combined it with an equal volume of chloroform-isoamyl alcohol (24:1), and then vortexed the mixture. After centrifuging the Falcon tubes at 7,100 ϫ g for 5 min, we transferred 400 l of the top aqueous layer to a new 2-ml centrifuge tube. For cleanup, we used the Mo Bio PowerLyzer PowerSoil DNA isolation kit starting from step 9. At step 21 of the DNA isolation kit, we eluted the DNA in 50 l of solution C6 and centrifuged it for 1 min at 10,000 ϫ g. We stored extracted DNA at Ϫ20°C for short-term storage or at Ϫ80°C for long-term storage.
(ii) Contribution of relic DNA to bacterial abundance. We used 16S rRNA gene copy numbers generated from quantitative PCR (qPCR) assays to estimate the proportion of relic DNA in a sample as 1 Ϫ (intact DNA/total DNA). Following procedures described elsewhere (38), we performed qPCR assays where 30-l reaction mixtures contained 1 l of DNA template, 0.5 l of each primer (10 M), 14.5 l of nuclease-free H 2 O, and 13.5 l of 5 Prime 2.5ϫ RealMasterMix SYBR ROX. We amplified a 200-bp fragment of the 16S rRNA gene with Eub338 (forward) and Eub518 (reverse) primers (39). PCR assays were performed with an Eppendorf Mastercycler Realplex 2 system using previously reported thermal cycle conditions (39). We generated qPCR standards from bacterial genomic DNA (Micrococcus sp.) using the TOPO TA cloning kit (Invitrogen). We extracted plasmids from transformed cells (40) and used the M13 forward and reverse primers to generate PCR products. The PCR products were quantified using a noise-band threshold and then used to generate a standard curve capturing a range of 10 2 to 10 7 gene copies/l. The coefficients of determination (r 2 ) for our assays ranged from 0.96 to 0.99, while amplification efficiencies fell between 0.93 and 0.99. Based on melting curve analyses, we found no evidence for primer dimers. All unknown samples, no-template controls, and standards were run in triplicate on every plate. The mean coefficient of variation (standard deviation/mean) for our 16S rRNA qPCR assay was 0.16.
(iii) Contribution of relic DNA to bacterial diversity. We estimated the contribution of relic DNA to bacterial diversity using high-throughput sequencing of the 16S rRNA gene. Specifically, we amplified the V4 region of the 16S rRNA gene from the intact and total DNA pools of each sample using bar-coded primers (515F and 806R) designed to work with the Illumina MiSeq platform (41). We cleaned the sequence libraries using the AMPure XP purification kit, quantified the resulting products using the QuantIt PicoGreen kit (Invitrogen), and pooled libraries at equal molar ratios (final concentration, 20 ng per library). We then sequenced the pooled libraries with the Illumina MiSeq platform using paired-end reads (Illumina Reagent kit v2, 500-reaction kit) at the Indiana University Center for Genomics and Bioinformatics Sequencing Facility. Paired-end raw 16S rRNA sequence reads were assembled into contigs using the Needleman algorithm (42). We obtained a total of 12,916,632 16S rRNA sequences. After quality trimming with a moving average quality score (window, 50 bp; minimum quality score, 35), we aligned the sequences to the Silva Database (version 123) using the Needleman algorithm. Chimeric sequences were removed using the UCHIME algorithm (43). After this filtering, there was an average (Ϯ standard error of the mean [SEM]) of 222,701 Ϯ 9,560 sequences per site. We created operational taxonomic units (OTUs) by first splitting the sequences based on taxonomic class (using the RDP taxonomy) and then binning sequences into OTUs based on 97% sequence similarity. Our depth of sequencing led to a high degree of coverage across samples (minimum Good's coverage ϭ 0.98). For phylogenetic analysis, we picked representative sequences for each OTU by using the most abundant unique sequence. We used FastTree (44) to generate a phylogenetic tree from the representative sequences using the generalized time-reversible model of nucleotide evolution. We calculated phylogenetic distances using weighted UniFrac distances (45). All initial sequence processing was completed using the software package mothur (version 1.38.1) (46).
We estimated the effects of relic DNA on alpha-diversity by calculating richness, evenness, and phylogenetic diversity for the intact and total DNA pools within a sample. To estimate the number of OTUs (richness), we used a resampling approach that subsampled each sample to an equal number of sequences per sample and summed the number of OTUs that were represented (47). Briefly, we subsampled to 30,000 observations, resampled 999 additional times, and then calculated the average richness estimates (Ϯ SEM) for each sample. To estimate the equitability in abundance among taxa in a sample (evenness), we used the same resampling approach and calculated average evenness estimates (Ϯ SEM) using Simpson's evenness index (48). To test whether relic DNA affected the phylogenetic diversity within a sample, we subsampled communities to 30,000 observations and then calculated Faith's D statistic, which sums the branch lengths for each species found in a sample from the root to the tip of the phylogenetic tree (49). In addition, we used two-sample Kolmogorov-Smirnov tests to determine whether or not the abundance distribution of taxa in the intact pool was different from that in the total pool for paired samples. We then calculated the average D statistics and P values from the Kolmogorov-Smirnov tests to evaluate the effect of relic DNA across ecosystem types. All estimations were performed in the R statistical environment (v 3.3.2) (34) using the "vegan," "ape," "ade4," "picante," and "plyr" packages, along with custom functions.
We estimated the effects of relic DNA on beta-diversity by comparing the taxonomic and phylogenetic diversity of bacterial communities in the intact and total DNA pools. First, we conducted a principal-coordinate analysis (PCoA) on log 10 -transformed relative abundance data to visualize the effects of relic DNA removal (via DNase treatment) on bacterial community composition within and among ecosystem types. The PCoA was performed with Bray-Curtis and UniFrac distances to assess taxonomic and phylogenetic effects, respectively. In addition, we used permutational multivariate analysis of variance (PERMANOVA) to test for differences in taxonomic and phylogenetic composition based on ecosystem type for the total DNA pool. Second, we conducted a Mantel test to assess the correlation between the community resemblance matrices (either Bray-Curtis or UniFrac) represented by the intact and total DNA pools. Last, we tested whether relic DNA altered beta-diversity within an ecosystem type by comparing centroid distances. To calculate this metric of sample dispersion, we determined the centroid from a PCoA with either Bray-Curtis or UniFrac distances for the total DNA pool for all sites within a given ecosystem type. We then measured the Euclidean distances between the centroid and all samples (total and intact) to determine the centroid distances (see Fig. S6 for more detail).
Data and software availability. All code and data used in this study can be found in a public GitHub repository (https://www.github.com/LennonLab/relicDNA) and the NCBI SRA (BioProject PRJNA464404).