Functional Metagenomics Reveals an Overlooked Diversity and Novel Features of Soil-Derived Bacterial Phosphatases and Phytases

Phosphorus (P) is a key element involved in numerous cellular processes and essential to meet global food demand. Phosphatases play a major role in cell metabolism and contribute to control the release of P from phosphorylated organic compounds, including phytate. Apart from the relationship with pathogenesis and the enormous economic relevance, phosphatases/phytases are also important for reduction of phosphorus pollution. Almost all known functional phosphatases/phytases are derived from cultured individual microorganisms. We demonstrate here for the first time the potential of functional metagenomics to exploit the phosphatase/phytase pools hidden in environmental soil samples. The recovered diversity of phosphatases/phytases comprises new types and proteins exhibiting largely unknown characteristics, demonstrating the potential of the screening method for retrieving novel target enzymes. The insights gained into the unknown diversity of genes involved in the P cycle highlight the power of function-based metagenomic screening strategies to study Earth’s phosphatase pools.

This is especially the case for phosphatases. Phosphatases have evolved across all living organisms and contribute to the regulation of diverse cellular functions (5,6). A specific group of phosphatases named phytases can release phosphorus from phytic acid, which is one of the most important phosphorus reserves in plants and soils (7,8).
Phosphorus (P) reserves are globally important, due to the enormous growth of the world population, and the ensuing demand for this macroelement. Large amounts of P are and will be required in order to fulfill the increasing world agroalimentary needs (9). However, global rock phosphorus reservoirs are currently being rapidly depleted, and the supplementation of P to animal feed and plant fertilizers has become more expensive during the last decades (10). Plant-based animal feeds often contain large amounts of phytate, which cannot be utilized by monogastric animals due to the lack of phytases (7,11). As a consequence, P levels in soils and water bodies increase. This eutrophication causes for instance algal blooms in aquatic ecosystems, leading to deoxygenated areas disturbing the life of many species (12). To meet future requirements, minimize losses of P, and reduce the environmental impact, it is necessary to use P compounds more efficiently and develop economical recycling technologies. In this context, phosphatases/phytases have proved to be remarkably useful (13). These enzymes are currently used in agroindustry to minimize P losses and to improve the levels of bioavailable P (14). A more recently described role of the phytases is their involvement in pathogenicity causing tissue damage in humans, coordination of the virulence program in Dickeya dadantii, and mediation of plant infection by Candida albicans and Xanthomonas, respectively (5,15,16).
The diversity and potential of environmental phytases remain largely unexplored as so far almost all reported functionally characterized phytases were derived from cultured organisms, including plants, fungi, and bacteria. Based on their catalytic characteristics, four classes of phytases have been described: histidine acid phytase (HAPhy), β-propeller phytase (BPPhy), purple acid phytase (PAPhy), and protein tyrosine phytase (PTPhy). These enzymes are structurally and catalytically dissimilar (14,17).
In this study, we use a function-based screening approach (18) to identify environmental phosphatases/phytases. By using soil metagenomes as a source, we were able to recover novel genes encoding phosphatases with phytase activity. Some of the recovered genes encode protein domains that were not associated with phosphatase activity before, and others represent new types or subtypes of phytases.

RESULTS
Phosphatase detection strategy. The metagenomic libraries contained approximately 38,122 to 166,040 clones and were screened for candidates exhibiting phosphatase activity using plates with phytate as phosphorus source and BCIP as indicator (see Fig. S1 in the supplemental material). The quality of the libraries was controlled by determining the average insert sizes and the percentage of insert-bearing Escherichia coli clones. The average insert sizes of metagenomic DNA-containing plasmids ranged from 2.8 to 6.7 kb, and the frequency of clones carrying plasmid inserts was at least 89% (Table 1).
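The library figures above can be turned into a rough estimate of how much environmental DNA was screened per library. The following is a minimal sketch, not a calculation reported by the authors: the function name is hypothetical, the pairing of the smallest clone count with the smallest insert size (and the largest with the largest) is an assumption, and 89% is the reported lower bound for insert-bearing clones.

```python
def cloned_dna_mb(n_clones: int, avg_insert_kb: float, insert_fraction: float) -> float:
    """Estimate the cloned environmental DNA in one library, in megabases (Mb)."""
    # Only insert-bearing clones contribute; kb -> Mb requires division by 1000.
    return n_clones * insert_fraction * avg_insert_kb / 1000.0


# Assumed pairings of the reported extremes (not taken from Table 1).
smallest_library_mb = cloned_dna_mb(38_122, 2.8, 0.89)
largest_library_mb = cloned_dna_mb(166_040, 6.7, 0.89)

print(f"{smallest_library_mb:.0f} Mb to {largest_library_mb:.0f} Mb per library")
```

Under these assumptions the libraries would span roughly 95 Mb to 990 Mb of cloned DNA each; the actual per-library values depend on the pairings given in Table 1.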
We recovered 21 positive E. coli clones from functional screens carrying plasmids harboring one or more ORFs associated with known phosphatase genes and domains (designation of plasmids is given in Table 1). The entire inserts of the positive clones were sequenced and taxonomically classified, showing that in all cases the cloned environmental DNA is of bacterial origin. Most inserts of the positive clones were affiliated with Terrabacteria, Proteobacteria, and the PVC superphylum with seven, six, and four representatives, respectively. Within the Terrabacteria group, most of the inserts (4) were affiliated with Actinobacteria (Table S1).
Thirty-one ORFs encoding putative gene products with similarity to known phosphatase enzymes were identified. Signal peptides were detected for 12 of them. The deduced gene products comprised 214 to 819 amino acids with calculated molecular masses ranging from 12 to 65.5 kDa and amino acid sequence identities to the closest known phosphatases ranging from 25% (Pho14B) to 83% (Pho13) over the full-length protein (Table 2).
Of the 21 positive clones, seven harbored more than one putative phosphatase-related gene (Table 2). Thus, if two or more potential phosphatase activity-related genes were present in a positive clone, individual heterologous expression and subsequent phosphatase activity verification were performed. The analysis of colonies showed that the individual heterologous expression of 24 out of 31 genes led to phosphatase activity and the corresponding positive phenotype of the respective recombinant E. coli strains (Table 2).
High phosphatase diversity recovered from soil metagenomes. Phosphatases can be classified according to the structural fold of the catalytic domains and subclassified into families and subfamilies based on sequence similarities of the phosphatase domains, as well as by conserved amino acid motifs not belonging to the catalytic domain (6,19). However, some are still classified based on their biochemical properties and biological functions (20).
Among the putative gene products encoded by the 31 candidate genes, alkaline phosphatases were identified as the most abundant group (five representatives), followed by histidine phosphatases and phospholipases with four representatives each. Phosphoserine-phosphatases and protein-tyrosine phosphatases were represented by three putative genes each. Acid phosphatases were encoded by two genes, while the plasmid pLP10 harbored an ORF with a deduced gene product showing similarity to a mismatch repair ATPase (Table 2).
The amino acid sequence analysis revealed the presence of 10 different domains in the 31 deduced proteins. We detected the alkaline phosphatase and sulfatase superfamily domain (ALP-like cl23718) as the most frequent domain, represented in eight sequences. The second most abundant was the haloacid dehalogenase domain (HAD cl21460), which was identified in six protein sequences. Three out of four classical phosphatase/phytase domains were detected in this study: the histidine phosphatase domain (HP with five protein sequences), the tyrosine phosphatase domain (PTPc with two protein sequences), and the acid phosphatase domain (PAP with two protein sequences) (Fig. 1). The phylogenetic analyses of the enzyme sequences and those harboring the above-mentioned domains revealed different clustering patterns in relation to reference phosphatase sequences for the different groups. Within the analyzed groups, the clustering of the metagenome-derived enzymes ranged from clear separation to integrated clustering (Fig. S2).
The HP superfamily (cl11399) is represented by a diverse group of proteins divided into two branches exhibiting numerous functions (21). Classical members of the HAPhy share a conserved motif, RHGXRXP, characteristic for this enzyme class. The HAPhy catalytic reactions are based on the conserved histidine residue in the RHGXRXP motif (21,22). In this study, all five phosphatases belonging to the HP superfamily harbored this histidine residue (Fig. 2A). Three out of five HPs in this survey were encoded by plasmid pLP08. The analysis of the plasmid sequence revealed a tandem organization of these genes with slight individual sequence variations (Fig. 2A; Fig. S3).
PTPs are well-studied proteins with a characteristic motif (HCX5R) (23,24). In this study, two new PTPs (Pho14A and Pho16B) harboring the typical catalytic signature of the group (Fig. 2B) were detected. Interestingly, Pho16B showed the specific signature of the MptpB-like phosphatases characterized by the presence of the unique active site P-loop submotif HCXXXKDRT. This type of protein has been predicted in several microorganisms, including pathogens, but never in environmental samples.
For the remaining group of classic phytases detected in this study (PAP), the literature describes two branches, the PAP1 enzymes, which are Mg2+-dependent enzymes, and the PAP2 enzymes, which are Mg2+ independent, but in all cases the active forms of PAP phytases were derived from plants (25). We detected the PAPs Pho18 and Pho24, which are affiliated with bacteria and belong to the Mg2+-independent branch (PAP2 cl00474) (Fig. 1 and 2C).
Alpha/beta hydrolases (abhydrolases) represent a group of proteins with a high number of substrates and catalytic functions (26). Two gene products (Pho03B and Pho25C) contained an abhydrolase domain (Fig. 1). However, only Pho25C showed phosphatase/phytase activity after individual heterologous expression of the corresponding gene. Abhydrolases exhibit broad substrate specificity, and some members have been reported with phospholipase activity (27).
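The signature motifs quoted above (RHGXRXP, HCX5R, and HCXXXKDRT) can be expressed as regular expressions to check a deduced protein sequence for them. This is a minimal sketch under stated assumptions, not the authors' annotation workflow: the dictionary keys, the function name, and the two toy peptides are hypothetical, and X is read as "any residue". Because HCXXXKDRT is itself an instance of HCX5R, a sequence with the MptpB-like submotif also matches the general PTP pattern.

```python
import re

# Motifs from the text, with X (any residue) written as "." in regex syntax.
MOTIFS = {
    "HAPhy (RHGXRXP)": re.compile(r"RHG.R.P"),
    "PTP (HCX5R)": re.compile(r"HC.{5}R"),
    "MptpB-like P-loop (HCXXXKDRT)": re.compile(r"HC.{3}KDRT"),
}


def find_motifs(protein: str) -> dict[str, list[int]]:
    """Return 1-based start positions of every motif found in a protein sequence."""
    seq = protein.upper()
    hits: dict[str, list[int]] = {}
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead also reports overlapping occurrences.
        lookahead = re.compile(f"(?={pattern.pattern})")
        positions = [m.start() + 1 for m in lookahead.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits


# Toy peptides, invented for illustration only (not Pho sequences).
toy_hap = find_motifs("MKRHGTRAPLL")
toy_ptp = find_motifs("AAHCAGGKDRTGG")
print(toy_hap)
print(toy_ptp)
```

A motif hit alone does not establish activity; in the study, activity was verified by individual heterologous expression.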
Other ORFs such as Pho16A carry the EAL domain, which is present in diverse bacterial signaling proteins and encodes a phosphodiesterase function (28). Analysis of Pho10 and Pho14D amino acid sequences indicates the presence of the P-loop_NTPase superfamily domain (Fig. 1). Enzymes harboring this domain hydrolyze the beta-gamma phosphate bond of, e.g., ATP and GTP (29). In this study, Pho10 showed phosphatase activity, while Pho14D as part of the clone harboring plasmid pLP14 showed none. Pho14C showed no phosphatase activity after individual heterologous expression of the corresponding gene. The pho14C gene product harbors the phospholipase D catalytic domain (PLDC_SF domain) (30).
The domains described above have previously been associated with catalytic activity of phosphatases. In contrast, the phosphatase-related genes of plasmids pLP04 and pLP15 did not encode known catalytic domains or signatures directly or indirectly associated with phosphatases. Clones carrying these plasmids showed significant phosphatase activity, and the products Pho04 and Pho15 showed sequence similarity to other previously reported proteins carrying the SNARE domain. However, both proteins shared overall sequence identity to previously reported phosphatases (Table 2). After individual heterologous expression of pho04 and pho15, phosphatase activity was confirmed for both gene products. Pho04 and Pho15 hold the SNARE-associated domain DedA. SNARE-associated proteins are classified as structural proteins that function as a protein-protein interaction module (31). To our knowledge, no proteins with SNARE domains have been previously discovered to possess phosphatase activity.
We performed an alignment based on the pho04 and pho15 gene products, which revealed a shared conserved region (Fig. 3). Next, we analyzed all 56,539 sequences associated with the SNARE-associated Golgi proteins InterPro entry (IPR032816) with respect to motifs that were similar to those found in Pho04 and Pho15. A total of 905 sequences showed the conserved sequence pattern or a similar form. The sequence analysis revealed that Pho04 and Pho15 and the other 905 SNARE-associated (IPR032816) sequences share the particular amino acid arrangement ESSF(F/L/I/V)P. Notably, with respect to all analyzed proteins the identified motif was mostly from bacteria and detected outside the SNARE domain (cl00429) (examples are depicted in Fig. 3). Pho04 harbors the SNARE domain but shows 48% sequence identity to a putative membrane-associated alkaline phosphatase from Acetobacter tropicalis, while the closest phosphatase-related hit for Pho15 was an alkaline phosphatase from an Acidobacteria representative (43% identity) (Table 2).
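A scan of a sequence collection for the ESSF(F/L/I/V)P arrangement can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the FASTA parser, the function names, and the three toy records are hypothetical, and only the exact pattern is matched, whereas the text also counts "similar forms" that a strict regular expression would miss.

```python
import re
from typing import Iterator

# ESSF followed by one of F/L/I/V, then P, as given in the text.
SNARE_MOTIF = re.compile(r"ESSF[FLIV]P")


def read_fasta(text: str) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def motif_hits(fasta_text: str) -> dict[str, int]:
    """Map each record carrying the motif to the 1-based position of its first hit."""
    hits = {}
    for name, seq in read_fasta(fasta_text):
        match = SNARE_MOTIF.search(seq.upper())
        if match:
            hits[name] = match.start() + 1
    return hits


# Toy records, invented for illustration; toy_c is wrapped across two lines.
TOY_FASTA = """>toy_a
MKTLESSFLPAG
>toy_b
MKTLESSFAPAG
>toy_c
MAAESSF
VPKK
"""
toy_hits = motif_hits(TOY_FASTA)
print(toy_hits)
```

Comparing the reported hit position with domain boundaries from an annotation source would be needed to decide whether the motif lies inside or outside the SNARE domain.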
ALP-like superfamily and non-plant-derived PAP representatives showing phytase activity. We selected the gene products of pho07 and pho18 for comprehensive biochemical characterization. The gene product of pho07 does not contain any of the currently known catalytic domains associated with phytase activity. The only detected match of Pho07 was a nonspecific hit for the ALP-like superfamily (cl23718). In the case of pho18, the corresponding gene product comprises a domain of the purple acid phosphatases (PAP-like), which represents a type of phytase reported to be present in many organisms but significantly expressed only in a very limited number of plant species (17,32).
We successfully detected phytase activity of both purified enzymes, Pho07 and Pho18. Thus, to our knowledge Pho07 represents a new type of phytase and Pho18 represents the first PAP2 bacterial phytase. Furthermore, these two enzymes represent two out of the three reported environmental phytases derived from functional metagenomics. Both enzymes are putatively secreted by the natural bacterial host (Table S1) as the protein sequences harbor potential signal peptides of 30 (Pho07) and 22 (Pho18) amino acids at the N terminus. Pho07 shows the presence of an ALP-like superfamily domain (cl23718) (Fig. 1) and highest similarity to a phosphoesterase from a Pseudonocardiales representative (51% identity) (Table 2). Pho18 was most similar (50% identity) to an acid phosphatase from the Verrucomicrobiaceae member GAS474 (Table 2).
Pho07 and Pho18 exhibited optimal activity at 30 and 50°C, respectively (Fig. 4). After incubation of Pho07 for 4 h at 30°C, the enzyme retained more than 80% activity (Fig. S4). Incubation for 3 h at 45 and 60°C resulted in a substantial reduction (approximately 50%) and complete loss of enzyme activity, respectively. Pho18 retained approximately 80% activity after incubation for 6 h at 40°C but lost more than 50% of its activity at temperatures ≥50°C (Fig. S4).
FIG 4 Effect of temperature on the relative activity of Pho07 and Pho18. All measurements were performed following the phytase standard assay at temperatures between 10 and 70°C. A 100% relative activity represented 2.9 and 1.04 U/mg for Pho07 and Pho18, respectively.
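The relative activities reported here can be converted to and from specific activities using the 100% reference values given in the FIG 4 legend (2.9 U/mg for Pho07 and 1.04 U/mg for Pho18). The sketch below only illustrates that normalization; the function names are hypothetical, and applying the FIG 4 reference to the thermal stability data is an assumption, since the reference used for Fig. S4 is not stated in this section.

```python
# 100% reference values in U/mg, taken from the FIG 4 legend.
MAX_SPECIFIC_ACTIVITY = {"Pho07": 2.9, "Pho18": 1.04}


def relative_activity(enzyme: str, specific_activity: float) -> float:
    """Express a specific activity (U/mg) as a percentage of the enzyme's maximum."""
    return 100.0 * specific_activity / MAX_SPECIFIC_ACTIVITY[enzyme]


def specific_activity(enzyme: str, relative_percent: float) -> float:
    """Convert a relative activity (%) back to a specific activity (U/mg)."""
    return MAX_SPECIFIC_ACTIVITY[enzyme] * relative_percent / 100.0


# Example: 80% residual activity of Pho07, assuming the FIG 4 reference applies.
pho07_at_80_percent = specific_activity("Pho07", 80.0)
print(f"{pho07_at_80_percent:.2f} U/mg")
```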
We evaluated the optimal pH range using different buffer systems at 30°C for Pho07 and at 50°C for Pho18. Pho07 exhibited the highest activity at pH 4.0 (Fig. 5) and retained more than 80% of its activity between pH 5.0 and 7.0. Low or no enzymatic activity was detected at pH values lower than 2.0 and higher than 8.0. Pho18 showed the highest activity at pH 6.0 and retained more than 70% of its activity at pH 5.0 and 7.0 (Fig. 5).
To determine the substrate specificity of Pho07 and Pho18, we tested several phosphorylated compounds as the substrates (Fig. 6). Pho07 released phosphate from all tested compounds with the highest activity toward phytate and lowest activity toward pyrophosphate. Pho18 showed the highest relative activity with pyrophosphate as the substrate and no significant activity with pyridoxal phosphate and NADP. As Pho07 and Pho18 exhibited the highest activity with phytate and pyrophosphate, respectively, we used these substrates for calculation of kinetic constants (Table 3).
Finally, we measured the effect of various metal ions and potential enzyme inhibitors on the activity of Pho07 and Pho18 with phytate as the substrate (Fig. 7). The metal ions showed different effects on the activity of the analyzed proteins. Al3+, Mn2+, and Zn2+ increased the activity of Pho07, while the activity of Pho18 decreased in the presence of Zn2+. Fe2+ had a strong inhibitory effect on the activity of both enzymes. With respect to potential inhibitors, the strongest inhibitory effects were observed at concentrations of 1 mM. Pho07 and Pho18 activities were reduced by most of the tested inhibitors. Oxalate was the strongest inhibitor for Pho07, while the activity of Pho18 was completely depleted in the presence of SDS (Fig. 7).

DISCUSSION
Apart from the relationship with pathogenesis and the economic relevance, phosphatases/phytases are also important for reduction of phosphorus pollution and its impact on diverse environments (8,11,13). However, only a few phosphatases, most of them from cultivable organisms, have been comprehensively analyzed. The discovery of new phosphatases from environmental samples as well as engineering of available representatives of this enzyme group is considered a major research challenge (33). So far, few studies have attempted to discover phosphatases/phytases encoded by metagenomes using a function-based approach. Within these studies, only three genes and one of the corresponding proteins which exhibited phytase activity were recovered and described (34-36). We found 31 candidate genes, and 24 of them encoded phosphatase activity after individual heterologous expression (Table 2). For the remaining seven genes, activity was not detected at individual gene level. The corresponding gene products might be part of larger phosphatase units or require other components encoded by the insert to show phosphatase activity.
Approximately 55% of the gene products described in this study showed low protein sequence identity to known phosphatases (50% or less) (Table 2), which demonstrates the capacity of our screening method to identify novel enzymes with phosphatase activity from environmental samples. It has been shown previously that the absence of free phosphate and the addition of phytate to the medium induce the expression of phytases (37). This suggests that many of the detected genes encode new enzymes with phytase activity, as observed for Pho07 and Pho18.
ALP phosphomonoesterases widely occur in nature. They preferably hydrolyze phosphate esters at pH levels higher than 7.0 (38). The ALP-like superfamily (cl23718) was the most abundant domain we detected in the recovered hits derived from our soil metagenomic libraries. The pH of the soil samples used ranged from 3.1 to 4.5 (39). Nevertheless, acid phosphatase genes are considered to be more abundant than alkaline phosphatase genes in low-pH soils. This might be due to the fact that most studies on the prevalence of alkaline and acid phosphatase genes are based on PCR-based gene amplification using specific known genes from cultured individual species as starting point for primer design (40). This approach covers only a small fraction of the existent functional phosphatase genes. Here, we revealed the existence of so-far-unknown functional ALPs with low identity toward known phosphatases, evidencing the potential of our functional metagenomic approach for the discovery of new ALP-phosphatases from environmental samples.
To our knowledge, enzymes from the ALP-like superfamily entry (cl23718) exhibiting phytase activity have not been described or comprehensively characterized yet. Nevertheless, numerous proteins are mentioned in literature or annotated in databases as alkaline phosphatases with phytase activity, but their molecular signatures and domains are associated mostly with the classic phytases (14). The analysis by Lim et al. (41) focusing on the distribution and diversity of phytate-mineralizing bacteria considers alkaline phosphatases to be ubiquitous in living organisms and shows that they dephosphorylate a wide range of P compounds, but not phytate. Thus, the functional proteins carrying the ALP-like superfamily domain reported in this study (7) represent a new group of phytase enzymes. The phylogenetic analysis of the ALP-like members revealed that most of our metagenome-derived enzymes cluster separately from previously reported alkaline phosphatases/phytases (see Fig. S2 in the supplemental material).
The biochemical analysis of a selected ALP-like member, Pho07, showed that its temperature optimum is similar to the metagenome-derived alkaline phosphatase (mAP). This enzyme is one of the few reported phosphatases derived from environmental samples and not associated with cultures (42). Furthermore, the optimal pH range of Pho07 (4.0 to 5.0) is similar to that of other soil bacterial phytases (43). Among the tested substrates, Pho07 showed the highest activity toward phytate, indicating that its primary activity is related to the degradation of this compound. Several studies report an enhancing effect of Ca2+ and Mn2+ on phytase activity (43). Nevertheless, the activity of Pho07 increased in the presence of Mn2+, but it was not affected by Ca2+. Among the potential inhibitors, wolframate and oxalate did not show significant effects on the activity of a phytate-degrading enzyme from Pantoea agglomerans (44) but reduced the relative activity of Pho07 to values lower than 20%. Since Pho07 is the first reported phytase carrying an ALP-like domain, it is not possible to compare its kinetic parameters (Table 3) with those from phytases of the same type.
The enzyme Pho18 belongs to the known PAPhy group of phytases. Only a few examples of characterized PAP proteins with phytase activity have been previously reported, and all of them were derived from plants (25). However, the presence of PAP-related genes in mammals, fungi, and bacteria has been indicated based on annotated genome sequences. The taxonomic analysis of pho18 and the complete insert harboring it revealed a bacterial origin and a phylogenetic association with the genus Terrimicrobium of the Verrucomicrobia phylum (Table S1). In addition, biochemical analysis confirmed phytase activity of Pho18. Therefore, we report here for the first time a PAP2 phosphatase with phytase activity, which is of nonplant origin and metagenome derived. Moreover, the phylogenetic analysis showed that Pho18 clusters separately from other previously reported PAPs with phytase activity. The reason for this is most likely the plant origin of the previously reported PAP phytases (Fig. S2). To our knowledge, the study of Ghorbani Nasrabadi et al. (45) is the only attempt to identify PAP phytases derived from bacteria. In their study, an indirect association between phytase activity and the amplification of a putative PAP gene in the bacterial host was established (45).
The optimal temperature of Pho18 (50°C) is similar to optimal temperatures of other PAPs derived from wheat (45°C) and soybeans (58°C) (14). Furthermore, the behavior of Pho18 at temperatures higher than 55°C (Fig. 4) is similar to that reported for soybean phytases (46). An increase of phytase activity mediated by the addition of Mn2+ was reported for PAP phytases (32,43). We did not register significant increases in the activity of Pho18 in the presence of any cation. However, the enzyme was strongly inhibited by Zn2+, which is in contrast to other PAP phytases showing higher activity in the presence of this ion. Although Pho18 exhibits higher affinity to pyrophosphate, the kinetic parameters using phytate as the substrate are similar to PAP phytases from Arabidopsis (Table 3) (47).
We found the HAD (cl21460) domain as the second most abundant domain in our survey. The HAD domain is present in proteins of diverse organisms, including bacteria, archaea, and eukaryotes (48). This domain is carried by proteins able to catalyze a variety of biological functions and act on a wide range of substrates (19). Numerous members of the HAD superfamily can transfer phosphoryl groups or act as phosphoanhydride hydrolase P-type ATPases (49). Since proteins harboring this domain are involved in a variety of cellular processes, it is not surprising that they can be isolated through functional metagenomic screening for phosphatases.
One of the most remarkable findings in this study was the detection of the SNARE-associated domain (DedA, InterPro entry IPR032816) of Pho04 and Pho15. So far, the role of the SNARE-associated domain (DedA) has not been deeply studied. Bacterial DedA family mutants display phenotypes evidencing cell division defects, temperature sensitivity, and altered membrane phospholipid composition among others (50). DedA-SNAREs have been reported to promote or block membrane fusion, particularly during bacterial pathogenic processes (51). To our knowledge no phosphatase activity has been reported for proteins harboring SNARE-associated domains. Moreover, the particular signature ESSF(F/L/I/V)P has been overlooked until now.
In conclusion, we demonstrate here for the first time the potential of functional metagenomics to exploit the phosphatase pools hidden in environmental samples. Our study revealed new phosphatases/phytases with diverse and, so far, largely unknown characteristics. Furthermore, we discovered the existence of a new type of phytases [...]
[...] flow rate at room temperature was utilized. Pho18 was purified through ion-exchange chromatography, by using a cation exchanger (SOURCE15S) in a prepacked Tricorn column (4.6/100 PE) (GE Healthcare, Little Chalfont, United Kingdom) with a gel bed volume of 1.7 ml at a 1-ml/min flow rate and room temperature. The purity of the resulting protein preparations was analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), and the detection of V5 epitope-carrying proteins was achieved by Western blot hybridization, as described by Waschkowitz et al. (62).
Enzyme assays. Phosphatase activity was determined at 355 nm by detecting the release of inorganic phosphorus according to the ammonium molybdate method developed by Heinonen and Lahti with modifications (44,63) as follows: the enzyme solutions (10 μl) were preincubated for 3 min at 40°C in 380 μl of 50 mM sodium acetate buffer (pH 5). Subsequently, 10 μl of 100 mM phytic acid dipotassium salt (Sigma-Aldrich, Munich, Germany) was added, and the mixture was incubated for 30 min at 40°C. To stop the reaction, 1.5 ml of freshly prepared AAM solution (acetone-5 N H2SO4-10 mM ammonium molybdate) and 100 μl of 1 M citric acid were added. Blanks were prepared by adding AAM solution prior to the addition of enzyme. The absorbance (355 nm) was measured using the Ultrospec 3300 Pro (Amersham plc, Little Chalfont, United Kingdom).
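The arithmetic behind the standard assay can be sketched as follows. The volumes, stock concentration, and incubation time come from the protocol above; everything else is an assumption made for illustration: one unit (U) is taken as 1 μmol of phosphate released per minute, which is not defined in this section, the phosphate standard-curve slope is a hypothetical parameter, and the example absorbance and protein values are invented.

```python
# Volumes (microliters), stock concentration, and time from the assay description.
ENZYME_UL = 10.0
BUFFER_UL = 380.0
SUBSTRATE_UL = 10.0
SUBSTRATE_STOCK_MM = 100.0
INCUBATION_MIN = 30.0


def final_substrate_mm() -> float:
    """Phytate concentration in the reaction mixture after dilution (mM)."""
    total_ul = ENZYME_UL + BUFFER_UL + SUBSTRATE_UL
    return SUBSTRATE_STOCK_MM * SUBSTRATE_UL / total_ul


def specific_activity(a_sample: float, a_blank: float,
                      slope_abs_per_umol: float, protein_mg: float) -> float:
    """Specific activity in U/mg, ASSUMING 1 U = 1 umol phosphate per minute.

    slope_abs_per_umol is a hypothetical standard-curve slope
    (absorbance at 355 nm per umol phosphate in the measured volume).
    """
    phosphate_umol = (a_sample - a_blank) / slope_abs_per_umol
    return phosphate_umol / (INCUBATION_MIN * protein_mg)


phytate_mm = final_substrate_mm()
# Invented example values, for illustration only.
example_u_per_mg = specific_activity(0.65, 0.05, 2.0, 0.01)
print(f"final phytate: {phytate_mm} mM")
print(f"example specific activity: {example_u_per_mg:.2f} U/mg")
```

With the stated volumes, the 100 mM stock is diluted 40-fold to 2.5 mM phytate in the 400-μl reaction.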
To assess the influence of pH on purified enzymes, the activity was measured at 40°C in a pH range from 1 to 9. The following overlapping buffer systems were used: 50 mM glycine-HCl (pH 1.0 to 3.5), 50 mM sodium acetate (pH 3.5 to 6.0), 50 mM Tris-maleate (pH 6.0 to 8.0), and 50 mM glycine-NaOH (pH 7.0 to 9.0). After the optimal pH was determined for Pho07 and Pho18, the influence of temperature on enzymatic activity was analyzed. The thermal stability was checked after incubation of the purified enzymes at different temperatures.
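The overlapping buffer systems listed above can be written as a small lookup that shows which buffers cover a given pH and where the overlaps fall. This is only a sketch of the stated ranges; the function name is hypothetical, and the inclusive treatment of the range boundaries is an assumption.

```python
# (buffer, lowest pH, highest pH) as listed in the text.
BUFFERS = [
    ("50 mM glycine-HCl", 1.0, 3.5),
    ("50 mM sodium acetate", 3.5, 6.0),
    ("50 mM Tris-maleate", 6.0, 8.0),
    ("50 mM glycine-NaOH", 7.0, 9.0),
]


def buffers_for(ph: float) -> list[str]:
    """Return every buffer system whose stated range includes the given pH."""
    return [name for name, low, high in BUFFERS if low <= ph <= high]


print(buffers_for(4.0))
print(buffers_for(6.0))
print(buffers_for(7.5))
```

Measuring the same pH in two buffers at the overlap points (pH 3.5, 6.0, and 7.0 to 8.0) makes it possible to separate pH effects from buffer-specific effects.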
The substrate specificity of the phosphatases was determined using the standard assay described above under the optimal temperature and pH for each enzyme (substrate concentration, 10 mM). Furthermore, the effects of cations (Al3+, Ca2+, Co2+, Fe2+, Fe3+, Mn2+, Ni2+, and Zn2+) and potential inhibitors (EDTA, citrate, tartrate, wolframate, oxalate, sodium dodecyl sulfate [SDS], and dithiothreitol [DTT]) at concentrations of 0.1 and 1 mM were analyzed.
For the kinetic constants, all measurements were performed in triplicate under optimal pH and temperature conditions using phytic acid and pyrophosphate as the substrates. The data were analyzed with the Enzyme Kinetics Module of SigmaPlot 12.0 (Systat Software, Inc., San Jose, CA).
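How kinetic constants can be estimated from rate measurements is sketched below with a Hanes-Woolf linearization of the Michaelis-Menten equation, where [S]/v = [S]/Vmax + Km/Vmax. This is a deliberately simple stand-in and not the method used in the study, which relied on SigmaPlot; the function names are hypothetical, and the data are synthetic and noise-free, generated from assumed values (Vmax = 2.0, Km = 0.5) that are unrelated to Table 3.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate: list[float], rates: list[float]) -> tuple[float, float]:
    """Estimate (Vmax, Km) by least-squares regression of [S]/v against [S].

    In this linearization the slope equals 1/Vmax and the intercept equals Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data from assumed constants (not values from Table 3).
substrate_mm = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [michaelis_menten(s, vmax=2.0, km=0.5) for s in substrate_mm]
fitted_vmax, fitted_km = hanes_woolf_fit(substrate_mm, rates)
print(f"Vmax = {fitted_vmax:.3f}, Km = {fitted_km:.3f}")
```

With real, noisy triplicate data, nonlinear regression on the untransformed rates is generally preferred, because linearizations distort the error structure.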
Sequence accession numbers. The nucleotide sequences of plasmids listed in Table 1