Novel Autotrophic Organisms Contribute Significantly to the Internal Carbon Cycling Potential of a Boreal Lake

ABSTRACT Oxygen-stratified lakes are typical for the boreal zone and also a major source of greenhouse gas emissions in the region. Due to shallow light penetration, restricting the growth of phototrophic organisms, and large allochthonous organic carbon inputs from the catchment area, the lake metabolism is expected to be dominated by heterotrophic organisms. In this study, we test this assumption and show that the potential for autotrophic carbon fixation and internal carbon cycling is high throughout the water column. Further, we show that during the summer stratification carbon fixation can exceed respiration in a boreal lake even below the euphotic zone. Metagenome-assembled genomes and 16S profiling of a vertical transect of the lake revealed multiple organisms in an oxygen-depleted compartment belonging to novel or poorly characterized phyla. Many of these organisms were chemolithotrophic, potentially deriving their energy from reactions related to sulfur, iron, and nitrogen transformations. The community, as well as the functions, was stratified along the redox gradient. The autotrophic potential in the lake metagenome below the oxygenic zone was high, pointing toward a need for revising our concepts of internal carbon cycling in boreal lakes. Further, the importance of chemolithoautotrophy for the internal carbon cycling suggests that many predicted climate change-associated fluctuations in the physical properties of the lake, such as altered mixing patterns, likely have consequences for the whole-lake metabolism even beyond the impact to the phototrophic community.

A ll life on Earth depends on carbon fixation, where phototrophic organisms convert inorganic carbon dioxide into organic compounds and living biomass. Currently, oxygenic phototrophs, deriving their energy from light, are regarded as the most important carbon fixers. However, on planetary time scales anoxygenic phototrophs and chemotrophs have been more prevalent (1). Even today, chemolithoautotrophy is a major strategy in many environments, such as deep-sea vents and sediments (2)(3)(4). In these environments, carbon fixation is driven by a redox gradient (i.e., a biogeochemical gradient of reductants and oxidants sorted by their redox potentials) often located in the border zone between oxic and anoxic conditions. These redox transition zones constitute a major share of the Earth's biosphere and have a significant impact on surrounding entities, such as elemental cycles and food webs (5)(6)(7).
Chemolithotrophy may have a large impact on the carbon cycle in many environments. For example, in boreal lakes there is a steep redox gradient at the oxic-anoxic border, which could facilitate carbon fixation via chemolithotrophy. Still, our knowledge on the chemolithotrophic energy generation in these habitats is poor. Some first studies highlighted the importance of chemolithotrophy for carbon assimilation in freshwater lakes such as Kivu, a freshwater lake in central Africa with a high methane (CH 4 ) concentration (2). Furthermore, two recent studies confirmed the genetic potential for autotrophy in microbial communities in the anoxic water masses of boreal lakes (8,9). This suggests that chemolithotrophy could be a common process fueling assimilation of inorganic carbon in boreal lakes. Thus, dark carbon fixation may play a significant role in internal carbon cycling and, thus, could modulate lake food webs and ultimately whole-lake carbon balances.
Small boreal lakes are important drivers of global greenhouse gas (GHG) emissions (10,11), and their importance was highlighted in the recent report of the Intergovernmental Panel on Climate Change (IPCC) (12). Globally, water bodies smaller than 0.001 km 2 contribute 40% of all methane emissions from inland waters (13). Typically, these small lakes and ponds are characterized by high concentrations of dissolved organic carbon (DOC) and shallow light penetration depth, which leads to steep stratification of oxygen and other electron acceptors and donors through most of the year. The stratification coincides with a distinct set of bacterial and archaeal phyla organized according to a vertical redox gradient (9,14,15). The microbial communities in these lakes may harbor organisms that have the potential for photoautotrophy under low light intensity (14,16,17) and for chemoautotrophy throughout the water column (9). These predictions are based on taxonomic information derived from 16S rRNA genes combined with functional gene inventories and genomic data of related cultivated representatives. Therefore, they are not a true reconstruction of the genetic makeup of individual organisms or whole communities along the redox tower. This lack of a detailed metabolic picture limits our understanding of the functional potential of microbes in boreal lakes.
We studied the potential of the microbial community for chemoautotrophy in the lake Alinen Mustajärvi, a well-characterized boreal lake located in southern Finland (15,18). This lake exhibits the typical characteristic features of boreal lakes, including (i) a high load of terrestrial organic carbon resulting in net heterotrophy of the system (18); (ii) a gradient of oxygen, temperature, and light; and (iii) stratification of the microbial community (9,14,15). We combined dark carbon fixation measurements with a survey of the functional potential of the microbial community (shotgun sequencing of the total DNA) from a vertical transect of the lake water column. Our aim was to link lake chemistry to the prevalence of genes related to energy generation via redox reactions and inorganic carbon assimilation. Moreover, we used annotated metagenomeassembled genomes (MAGs) to obtain metabolic reconstructions of uncharacterized and novel lake microbes and to identify the key chemoautotrophs in the lake. Our hypotheses were that (i) the chemical stratification of the lake was linked to microbial activity and would be reflected in the functional potential and structure of the communities, (ii) the lake harbors an abundant and diverse chemoautotrophic com-munity, and (iii) chemoautotrophic pathways would be enriched below the euphotic zone of the lake, leading to a high potential for internal carbon cycling.

RESULTS AND DISCUSSION
Microbial community structure and CO 2 incorporation in the water column. The metabolic pathways and organisms that could be involved in autotrophic processes in the water column of the lake Alinen Mustajärvi were studied based on molecular (metagenomic shotgun sequencing and 16S rRNA gene amplicons) and geochemical analyses from samples taken in 2013. This vertical transect of the lake covered 13 depths. Additionally, the community composition was surveyed in 2008 using a clone library covering 4 depths and 303 clones of 16S rRNA genes. We also measured inorganic carbon dynamics biweekly in 2008, monthly in 2009, and on 3 occasions in 2010. The carbon measurements indicated that at the 3-m depth the incorporation of inorganic carbon dominated over respiration during parts of the ice-free season ( Fig. 1; see also Fig. S1 in the supplemental material). Time of net autotrophy coincided with stratification of the lake and vertical structuring of the microbial community along the vertical gradient ( Fig. 2A). The maximum value for dark carbon assimilation at the 3-m depth was 10% of the average net primary production (carbon assimilation in the light) within the euphotic zone ( Fig. 1) (18). It should be taken into account that these are point measurements and could be highly time and depth specific. However, they are well in line with dark carbon fixation values measured from lake sediments (19), which share many of the same physicochemical characteristics with the overlying water, such as a gradient of electron donors and acceptors suitable for dark carbon fixation.
The analysis of 16S rRNA genes indicated that the communities were dominated by similar phyla in 2008 and 2013 ( Fig. 2A and B) and that community composition was consistent with previously published profiles from the same lake (15). The water column exhibited a physicochemical stratification with water temperature, oxygen (O 2 ), and sulfate (SO 4 ) concentrations decreasing with depth, whereas the concentrations of CH 4 , DOC, ammonia (NH 4 ), nitrite (NO 2 ), and phosphate (PO 4 ) were highest at the lake bottom ( Fig. 3A to D). The microbial community was stratified into distinct layers with positive correlation between community composition in samples taken within a 1-m Autumn mix Stratification distance of each other and negative correlations among samples taken more than 2.6 m apart (Fig. 3E). While the bacterial community was homogenous throughout the epilimnion, below there was a sharp change coinciding with the decrease in oxygen concentration in the metalimnion and a subsequent succession of microbial taxa in the hypolimnion.
Stratification of microbial functional potential follows the redox gradient. The functional potential of the microbial community was estimated based on shotgun sequencing of the total DNA. Sequence coverage among samples (i.e., the proportion of reads from each of the samples that could be mapped to contigs) varied from 40.2% at the bottom to 75.5% in the surface ( Table 1). The variation in the coverage is a result of an increase in diversity toward the lake bottom; thus, the sequences in the hypolimnion samples were divided across a higher number of distinct genomes, resulting in only the most abundant ones having enough sequencing coverage to be assembled. Thus, our functional analysis may be missing some of the pathways present in the lake but is expected to reflect the dominant metabolic potential in the water column. The abundances of different genes were normalized sample-wise using the abundance of 139 single-copy genes (20). With this approach, the abundance of the markers is expressed as occurrences per genome equivalent (OGE), e.g., a value of 1 suggests the presence of the gene in every genome of the sample. The limitations of the approach are that (i) in the deeper layers a high proportion of the community was composed of members of candidate phyla that are known to lack some of the single-copy genes (21) and (ii) some of the single-copy genes might not be recognized in our screening. Thus, especially in the deeper layers we might underestimate the true genome abundances, leading to overestimation of OGE. It should also be noted that the values suggesting abundances of 1 and above could be explained by the presence of multiple gene copies in the genomes harboring these markers or, as stated above, reflect the impact of underestimated total number of genomes.
Markers indicative of phototrophy could be found throughout the water column (Fig. 4). The genes encoding oxygen-evolving photosystem II and aerobic anoxygenic photosynthesis (AAP) decreased rapidly with depth, with combined abundance of these pathways at the surface being 1 OGE. Despite the shallow light penetration depth (Fig. S2), a second peak in photosynthetic potential was observed in the hypolimnion, where the abundance of the marker for anaerobic anoxygenic photosynthesis (AnAP; bacterial photosynthesis) was up to 0.32 OGE (Fig. S3).
In the hypolimnion, there were multiple potential pathways for energy generation, some of which were even more abundant than AnAP. One of these was hydrogen oxidation, for which the potential increased toward the lake bottom with the nickel- dependent hydrogenase being more abundant right below the metalimnion and the iron-dependent hydrogenase being more abundant at the lake bottom ( Fig. 4), where the abundance of hydrogenases was close to 1 OGE. The relative importance of these different pathways for the actual energy metabolism of the community depends on the activity of the organisms, which we cannot judge based on their genetic profiles. An iron gradient in the water column suggested that redox reactions related to iron could be important Functional Stratification in a Boreal Lake ® pathways for energy acquisition (Fig. 3D), and putative iron oxidizers (order Ferrovales) were abundant in the metalimnion. However, there were no specific markers for this pathway; thus, while we have support for the potential for using iron as an energy source, we cannot visualize the trend for this pathway in the water column.
Based on their concentrations, the most important inorganic electron donors in the water column appeared to be CH 4 in the upper hypolimnion and sulfide-sulfur, which spanned almost throughout the hypolimnion. The peak of the marker indicative for sulfide-sulfur oxidation coincided with the maxima in phototrophic sulfur-oxidizing  PF00124  TIGR01157  TIGR01115  TIGR03052  TIGR03049  TIGR03047  TIGR03045  TIGR03041  TIGR03039  TIGR03053  TIGR03043  TIGR01153  TIGR01335  TIGR01336  TIGR02866  TIGR02891  TIGR02745  TIGR04244  TIGR02376  TIGR00351  PF14710  PF02300  PF02313  PF09242  PF01077  TIGR02064  TIGR02066  PF00374  PF14720  PF02906  PF02256  PF10657  PF02327  PF01558  PF01855  PF02249  PF02745  TIGR03256  TIGR03257   6 Chlorobium, which were major community members in the hypolimnion. At this depth, the abundance of the markers for sulfide oxidation was 0.39 OGE. The marker specific for sulfate reduction suggested highest potential in the lower hypolimnion. Markers for bacteria using anaerobic ammonia oxidation (anammox) were not detectable in the data set. We could find markers for ammonia monooxygenase, but a manual inspection of the hits to these hidden Markov models (HMMs) indicated that these were in fact to its paralog methane monooxygenase. Thus, these HMMs were used as markers for methanotrophy instead of nitrification.
The potential for aerobic respiration was stable from epilimnion to upper hypolimnion, whereas the highest potential for microaerophilic respiration was right above the depth where oxygen concentration dropped below detection limit, and the potential for anaerobic respiration was highest in the upper hypolimnion. Potentials for using alternative electron acceptors for respiratory reactions followed the classic redox tower, being nitrous oxide (N 2 O), NO 2 , NO 3 , Fe 3ϩ , SO 4 , and CO 2 from metalimnion to the lake bottom. The profile suggested that the reactions in the denitrification pathway (reduction of nitrate to N 2 O or further to nitrogen gas N 2 ) were divided between multiple organisms inhabiting different redox zones in the oxic-anoxic boundary layer, as the highest potentials for different parts of the pathway were found at different depths (Fig. 4). Previous results on stratified lakes suggest that N 2 O commonly accumulates at the oxycline (22), as it is produced through both anaerobic nitrate reduction in the anoxic hypolimnion and aerobic nitrification in the oxic water layers. Furthermore, the potential for nitrite reduction was higher than the potential for N 2 O reduction, potentially explaining N 2 O accumulation (9,22). This is due to the expression of the N 2 O reductase gene (nosZ) being more sensitive to oxygen than expression of the nitrite reductase gene (nirK) (23). In our Pfam/TIGRFAM profiles, the potential for nitrate reduction occurred in the upper hypolimnion and that for N 2 O reduction occurred in the lower metalimnion, while markers for nitrification were sparse, suggesting that N 2 O dynamics is driven by anaerobic processes, potentially using NO 3 as electron acceptor. The general patterns for Fe 3ϩ and SO 4 reduction could not be assessed due to lack of pathway-specific HMMs, but these were investigated using metagenome-assembled genomes (MAGs) ( Table S2).
Unrecognized metabolic and chemoautotrophic potential in novel bacterial taxa. In general, the microbial community had potential for three different pathways of autotrophic carbon fixation (Fig. 5). In the epilimnion, the Calvin-Benson-Bassham cycle (CBB; also called the reductive pentose phosphate pathway) was highly abundant, while the reductive citric acid cycle (rTCA) and the Wood-Ljungdahl pathway (WL; also called the reductive acetyl coenzyme A [acetyl-CoA] cycle) were mainly found in the hypolimnion. For the fourth carbon assimilation pathway present in the communities, the 3-hydroxypropionate cycle (3HP), no specific HMMs could be found. However, protein annotations from Prokka (24) suggested that multiple organisms possessed an almost full 3HP pathway, but certain genes in the pathway, such as mcr (malonyl-CoA reductase), were not present in the data set. The data suggested increasing autotrophic potential toward the bottom of the lake with a peak right below the oxycline and the highest potential at the lake bottom with the total abundance of pathways related to carbon fixation being close to 1 OGE (Fig. 5).
We were able to construct a total of 270 metagenome-assembled genomes (MAGs). Of these, we analyzed all the genomes with Ͼ40% completeness and Ͻ4% contamination, resulting in 93 MAGs characterized through functional profiling. The cutoff for completeness was set lower than recommended for high-quality MAGs (25) as our focus was on putative chemotrophs and autotrophs, rather than doing a complete metabolic mapping of all the MAGs. Thus, some of these MAGs likely are missing some pathways present in their genomes. However, as the contamination threshold was set to the level of high-quality MAGs (25), we are confident that the pathways that we found truly are present in the MAG in question. These 93 MAGs represented a diverse group of organisms (as determined through PhyloPhlAn [ Fig. S4] [26]) and had the potential to use a wide range of different electron donors and acceptors and a full range of metabolic pathways from chemolithoautotrophy to photoheterotrophy (Table S2). Here, we concentrate on the most abundant organisms with chemotrophic potential.
As stated above, the chemotrophic organisms appeared to be deriving their energy mainly from oxidation of sulfur, iron, and hydrogen compounds. The most abundant MAGs with potential for sulfur oxidation (sox operon typically including genes sox-ACXYZ) included organisms related to Chlorobium (bin 4), Polynucleobacter (bins 15 and 120), Comamonadaceae (bin 108), Ferrovales (bin 139a), Rhizobiales (bin 139b), and Acetobacteraceae (bins 72 and 168). However, only the Chlorobium and Ferrovales MAGs had potential for autotrophy. Chlorobium was located in the lake hypolimnion and had a near-complete complex for oxidation of reduced sulfur compounds and also a near-complete CBB pathway for autotrophic carbon assimilation. Further, two other Chlorobium MAGs (bins 52 and 92) had a dsr operon, which is used in reverse in Chlorobium to oxidize sulfur (27). Chlorobium can grow both autotrophically and mixotrophically (28), and they derive the energy for autotrophy from sunlight using sulfur oxidization to produce reducing equivalents for growth. Thus, they are photoautotrophic organisms rather than chemoautotrophs.
The other MAG with capacity for sulfur oxidation and inorganic carbon assimilation was a MAG closely related to the betaproteobacterial Ferrovales order (bin 139a). This MAG was 4.6 Mb in size (91.3% complete with 0.67% contamination) and included all of the genes in the CBB and rTCA pathways, suggesting the potential for autotrophy. While the MAG had potential for sulfur oxidation, it also included genes for iron oxidation: cytochrome cyc1, cytochrome c oxidase (ctaCDE), ubiquinol-cytochrome c ® reductase (petABC), and NADH:quinone oxidoreductase (nuoABCDEFGHIJKLMN). Further, the genome included an iron oxidase, which was located on the same contig as cytochrome cyc1, completing the pathway. This is consistent with the closest cultivated organism, iron-oxidizing Ferrovum myxofaciens (29), which thrives in acidic mine drainage (29). The MAG also included markers indicative of aerobic, anaerobic, and microaerophilic respiration. Thus, the electron acceptor could be O 2 when it is available. However, we could not identify any other electron acceptors for this organism. Thus, the genomic features together with the abundance distribution in the lake suggest that this organism is a chemoautotroph inhabiting suboxic to anoxic environments, using iron or sulfur oxidation as an energy source. In our data, this organism was rather abundant in both 16S rRNA gene amplicon data and the metagenomes but has not been previously found in boreal lakes. This particulate taxon was abundant in a narrow zone (between 2.5 and 3.6 m); thus, a possible reason why it has been previously missed is the sparse sampling schemes of many experiments. Moreover, it represents a recently established bacterial order (29), and this taxon may have been classified as "uncultured Betaproteobacteria" in previous studies. MAGs with autotrophic potential also included organisms that would appear to acquire energy by combining oxidation of hydrogen with sulfate or nitrate reduction. Hydrogenases found in the data represented FeFe-and NiFe-type hydrogenases as described in reference 30, with the latter type being more prevalent among the MAGs. For example, a MAG closely related to Gallionellaceae (bin 129) carried a 1e-type hydrogenase, which is specifically used for electron input to sulfur respiration, and the MAG did have potential for sulfur reduction. It also had a full CBB pathway. Bin 129 was closely related to betaproteobacterial Sideroxydans lithotrophicus, which thrives in the same environment as Ferrovales (31). Similar to bin 129, this organism has the potential for the CBB cycle; however, it has been suggested to derive its energy from iron rather than hydrogen oxidation (28). Another MAG, closely related to Desulfobulbaceae (bin 93), had a 1c-type hydrogenase, also typically related to sulfate respiration, and an almost complete pathway for sulfate reduction. It also appeared to have the potential for CO 2 fixation through the reductive tricarboxylic acid (rTCA) cycle. The closest relative to this bin, Desulfotalea psychrophila, is also a sulfate reducer but has been reported to be heterotrophic (32). However, D. psychrophila inhabits cold environments, which is consistent with bin 93, which was most abundant in the deep layer of the lake, where the water temperature is around 4°C throughout the year.
In accordance with the redox potentials in the water column and with the literature (33,34), the reduction of nitrate to N 2 was dispersed among multiple organisms. "Candidatus Methyloumidiphilus alinensis" (bin 10) (35) and a MAG closely affiliated with Crenotrix (bin 149) both had a complete narGHIJ operon for nitrate reduction and also genes for methane oxidation, as has been previously described for a member of the Methylobacter family (36). Gene nosZ, coding for N 2 O reductase, was present in two high-quality MAGs, which were taxonomically assigned to Myxococcales (bin 233) and Bacteroidetes (bin 64). However, these did not appear to be autotrophic organisms. The norCB operon, encoding nitric oxide (NO) reductase, was complete in two MAGs, in "Candidatus Methyloumidiphilus alinensis" and in a MAG affiliated with Comamonadaceae (bin 239). The latter was also carrying the potential for the CBB cycle. The possible electron acceptors for the organism were sulfur and hydrogen. At the very bottom of the lake, we could identify three archaeal MAGs. Two of these were hydrogen oxidizing (hydrogenotrophic methanogenesis; bins 133 and 155, closely related to Methanolinea and Methanoregula, respectively), while the third one was using acetate (acetogenic methanogenesis; bin 74, closely related to Methanosaeta).
Conclusions. Consistent with our expectations, the microbial community included a variety of different chemotrophic pathways. Further, the microbial community driving these processes contained abundant novel organisms with the potential for autotrophy. The assembly of the functional potentials was in accordance with redox potentials of the electron acceptors in the lake (Fig. 6). Further, consistent with our hypothesis, there was a diverse set of abundant chemoautotrophic organisms below the euphotic zone of the lake. The autotrophic community had an unexpected major community member closely related to Ferrovales, presumably thriving via iron oxidation. We also identified other novel autotrophs, such as an organism related to recently identified autotrophic Sideroxydans lithotrophicus.
The fact that many of the potentially autotrophic microbes were among the most abundant microbes in the lake and the abundance of photoautotrophic Chlorobium emphasize the potential role of internal carbon cycling as a process that mitigates the flow of CO 2 from boreal lakes to the atmosphere. Our results suggest that autotrophic iron-, sulfur-, and hydrogen-oxidizing microbes have a high potential to significantly contribute to inorganic carbon fixation in the lake. In fact, our measurements of inorganic carbon incorporation suggested that a significant amount of CO 2 originating from degradation of autochthonous and terrestrial carbon can be reincorporated into biomass in the poorly illuminated, anoxic layer of the lake. This is also well in line with results showing that chemolithoautotrophy significantly contributes to carbon and energy flow in meromictic Lake Kivu (2) and with recent results regarding the autotrophic potential in boreal lakes (8). These processes are strongly dependent on the prevalent environmental conditions, which have been predicted to change following the warming of the climate. Thus, our results suggest that if the predicted alterations in lake environment, such as changes in mixing patterns, should happen, we may expect reorganization of the metabolic processes in the lake, which would have unknown implications for the carbon flow in the water column.

MATERIALS AND METHODS
Site description and sampling. The study lake, Alinen Mustajärvi, is situated in southern Finland (61°12=N, 25°06=E). It is a 0.007-km 2 headwater lake with a maximum depth of 6.5 m and an estimated volume of 31 ϫ 10 3 m 3 . The catchment area is Ͻ0.5 km 2 , and it consists of Ͼ90% coniferous forest and Ͻ10% peatland. The lake is characterized by steep oxygen stratification during summer and also during the ice cover period, which lasts from late November until late April. The stratification is disrupted by regular autumn and irregular spring mixings. As such, Alinen Mustajärvi is a representative for the millions of lakes and ponds in the arctic and boreal zones.
The metagenome sampling was conducted in the beginning of September 2013, at the end of the stratification period. The lake was sampled at 13 depths: the oxic epilimnion was sampled at 0.1, 1.1, 1.6, 2.1, and 2.3 m; the metalimnion was sampled at 2.5 and 2.9 m; and the hypolimnion was sampled at 3.6, 4.1, 4.6, 5.1, 5.6, and 6.1 m. Water samples were taken with a 20-cm-long acrylic tube sampler (Limnos; volume, 1.1 liter) and subsequently analyzed for nutrients (NO 2 /NO 3 , PO 4 , NH 4 , total N, total P, and SO 4 ), gases (CH 4 and CO 2 ), and dissolved organic carbon (DOC) concentration. Nutrient analyses were conducted using standard methods (http://www.sfs.fi/). Gas and DOC analyses were done as described in reference 37, and iron analysis was performed as described in reference 38. The dark carbon fixation was measured in 2008 as an increase or decrease of dissolved inorganic carbon (DIC) concentration during a 24-h incubation at the depth of 3 m in two foil-covered 50-ml glass-stoppered biological oxygen demand (BOD) bottles. The measurements were conducted every second week from the beginning of May until the end of October, and the changes in DIC concentration were analyzed according to reference 39. In August 2008, a clone library was created from the 0.5-, 2.5-, 4.5-, and 5.5-m depths, consisting of a total of 303 sequences (40).
Amplicon and metagenome analysis. The samples for metagenomic analysis of lake microbiota were taken by filtering water through 0.2-m polycarbonate filters which were then frozen at Ϫ78°C until further analysis. The DNA was extracted from the filters using the MoBio PowerSoil DNA extraction kit (MoBio Laboratories). Sample preparation for 16S rRNA gene analysis and the following sequence processing were conducted as previously described (41).
Shotgun metagenomic libraries were prepared from 10 ng of genomic DNA. First, the genomic DNA was sheared using a focused ultrasonicator (Covaris E220), and subsequently, sequencing libraries were prepared with the Thruplex FD Prep kit from Rubicon Genomics according to the manufacturer's protocol (R40048-08, QAM-094-002). Library size selection was made with AMPure XP beads (Beckman Coulter) in a 1:1 ratio. The prepared sample libraries were quantified using the KAPA Biosystems next-generation sequencing library quantitative PCR (qPCR) kit and run on a StepOnePlus (Life Technologies) real-time PCR instrument. The quantified libraries were then prepared for sequencing on the Illumina HiSeq sequencing platform with a TruSeq paired-end cluster kit, v3, and Illumina's cBot instrument to generate a clustered flow cell for sequencing. Sequencing of the flow cell was performed on the Illumina HiSeq2500 sequencer using Illumina TruSeq SBS sequencing kits, v3, following a 2 ϫ 100 indexed high-output run protocol.
The sequencing produced a total of 120.5 Gb of sequence data. Reads were filtered based on their quality scores using Sickle (version v1.33) (42) and subsequently assembled with Ray (version v2.3.1) (43). Assembled contigs from kmer sizes of 51, 61, 71, and 81 were cut into 1,000-bp pieces and scaffolded with Newbler (454 Life Sciences, Roche Diagnostics). Mapping of the original reads to the Newbler assembly was done using Bowtie 2 (version v2.15.0) (44), while duplicates were removed using Picardtools (version 1.101; https://github.com/broadinstitute/picard), and for computing coverage, BEDTools (45) was used. Details on the assembly results are presented in Table 1. The data were then normalized using the counts of 139 single-copy genes as previously described (20). Assembled contigs were binned with MetaBAT (version v0.26.3) (46) to reconstruct genomes of the most abundant lake microbes (metagenome-assembled genomes [MAGs]). The quality of the MAGs was evaluated using CheckM (version v1.0.6) (47). Cutoffs for high-quality MAGs were set to Ն40% for completeness and Յ4% for contamination. The placement of the MAGs in the microbial tree of life was estimated using PhyloPhlAn (version v1.1.0) (26).
The functional potential of the metagenomes was assessed from assembled data using the hidden Markov models (HMMs) of the Pfam and TIGRFAM databases (48,49) and the HMMER3 software (version v3.1b2) (50). Special attention was paid to pathways linked to energy metabolism and carbon cycle. To ensure pathway specificity, marker HMMs were chosen to be unique to specific pathways (see Table S1 in the supplemental material). Normalized coverage information on the contigs combined with HMMs of specific marker genes (Table S1) was used to predict protein domains related to energy metabolism and carbon incorporation to biomass. Only marker genes that were found to exhibit a significantly different distribution between layers are reported (P values in Table S1). All of the MAGs were also annotated using Prokka (version v1.11) (24). The metabolic potentials of all high-quality MAGs were evaluated based on Prokka annotations (Table S2). In the case of novel functional combinations, such as combination of AAP with CBB, the contigs were blasted against the NCBI nr database to verify that the closest relatives of the overall genes in these contigs matched the PhyloPhlAn annotation. MAGs with special functional properties were also visualized with nonmetric multidimensional scaling to check the placement of the contigs with markers within all the contigs comprising the MAG in question. All statistical analyses were done using R software (http://www.R-project.org [51]) and packages vegan (52) and mpmcorrelogram (53). Differences in the abundance of marker HMMs between layers were tested using a permutation test (1,000 permutations) on the t statistics with the package MetagenomeSeq (54).
Accession number(s). The raw data have been deposited in the NCBI Sequence Read Archive under accession no. SRP076290.