The human oral microbiome is shaped by shared environment rather than genetics: evidence from a large family of closely-related individuals

The human microbiome is affected by multiple factors, including the environment and host genetics. In this study, we analyzed the oral microbiome of an extended family of Ashkenazi Jewish individuals living in several cities and investigated associations with both shared household and host genetic similarities. We found that environmental effects dominated over genetic ones. While there was weak evidence of geographic structuring at the level of cities, we observed a large and significant effect of shared household on microbiome composition, supporting the role of immediate shared environment in dictating the presence or absence of taxa. This effect was also seen when including adults who had grown up in the same household but moved out prior to the time of sampling, suggesting that the establishment of the oral microbiome earlier in life may affect its long-term composition. We found weak associations between host genetic relatedness and microbiome dissimilarity when using family pedigrees as proxies for genetic similarity. However, this association disappeared when using more accurate measures of kinship based on genome-wide genetic markers, indicating that environment rather than host genetics is the dominant factor affecting the composition of the oral microbiome in closely-related individuals. Our results support the concept that there is a consistent core microbiome conserved across global scales, but that small-scale effects due to shared living environment significantly affect microbial community composition.

IMPORTANCE. Previous research shows that relatives have a more similar oral microbiome composition than non-relatives, but it remains difficult to distinguish the effects of relatedness and shared household environment. Furthermore, pedigree measures may not accurately measure host genetic similarity. In this study, we include genetic relatedness based on genome-wide SNPs (rather than pedigree measures) and shared environment in the same analysis. We quantify the relative importance of these factors by studying the oral microbiome in members of a large extended Ashkenazi Jewish family who share a similar diet and lifestyle despite living in different locations. We find that host genetics plays no significant role and that the dominant factor is shared environment at the household level. We also find that this effect appears to persist in individuals who have moved out of the parental household, suggesting that the oral microbiome established earlier in life persists long-term.

[...] microbiome composition attributable to genetics compared with previous studies. Furthermore, due to shared cultural practices we can be reasonably confident that environmental factors such as diet and lifestyle are largely controlled for, compared to other studies where they may be significant confounders (17). For this reason, this cohort represents a unique opportunity to compare the oral microbiome within a large number of individuals living in separate locations but nevertheless sharing a similar diet, lifestyle, and genetic background, and to investigate the long-term effect of shared upbringing on oral microbiome composition.

Description of cohort
We found 271 phylotypes in the total dataset, all of which were present when considering just Family A. Of these, 49 phylotypes were present in >95% of individuals within Family A, with the Firmicutes the most abundant phylum (Figure 1a), as observed in previous oral microbiome studies (15,24). The most abundant genera were Streptococcus (30.4%), Rothia (18.5%), Neisseria (17.1%), and Prevotella (17.1%). Composition of samples was similar between the two families (A and B) and the unrelated controls (Figure 1b). These groupings had a small but significant effect in an analysis of variance (R² = 0.015, p<0.01), but this is typical of comparisons between such large groups that may differ in an unknown number of confounded variables (e.g. diet, genetics, lifestyle). We concluded that Family A was at the very [...]

[...] There was no significant effect of any of the MDS axes, suggesting that host genetics in closely-related individuals does not significantly affect microbiome composition. We investigated the effect of environment using two levels of geography: city and household (Table 1a). A city-only model showed no significant effect of environment (R² = 0.08, p=0.4), whereas a household-only model showed a significant effect (R² = 0.30, p=0.001). This was reproduced in a model containing both geographic variables, with permutations stratified by city, where household was still a significant effect (R² = 0.22, p=0.001), suggesting that differences at the level of household are more important than at larger geographical scales. We confirmed that city-level effects were small by extending our sample to 82 individuals across the four cities who were not necessarily cohabiting with others (I: 48, II: 13, III: 12, IV: 9), and found that city still had a small effect, although it was significant (R² = 0.053, p<0.01).

In this analysis we also found no significant effect of genetics, but age was significant (R² = 0.028, p=0.0101) (Supplementary Table 1).
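The household and city effects reported above come from a permutational analysis of variance. As a minimal sketch of how such an R² and permutation p-value can be obtained, the Python below implements a one-factor PERMANOVA on a distance matrix, with the option of restricting permutations within strata (standing in here for city). The function name `permanova`, its parameters, and the choice of a single factor are illustrative assumptions; the study itself used adonis, whose multi-term models are not reproduced here.

```python
import numpy as np


def permanova(dist, groups, strata=None, n_perm=999, seed=0):
    """One-factor PERMANOVA on a square, symmetric distance matrix.

    Returns (R2, pseudo-F, permutation p-value). If `strata` is given,
    group labels are only shuffled among samples sharing a stratum.
    """
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = dist.shape[0]
    d2 = dist ** 2
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n

    def ss_within(labels):
        total = 0.0
        for g in np.unique(labels):
            idx = np.flatnonzero(labels == g)
            sub = d2[np.ix_(idx, idx)]
            total += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
        return total

    def stats(labels):
        a = len(np.unique(labels))
        ss_w = ss_within(labels)
        ss_a = ss_total - ss_w  # among-group sum of squares
        pseudo_f = (ss_a / (a - 1)) / (ss_w / (n - a))
        return pseudo_f, ss_a / ss_total

    f_obs, r2 = stats(groups)
    rng = np.random.default_rng(seed)
    strata_arr = np.zeros(n, dtype=int) if strata is None else np.asarray(strata)
    hits = 0
    for _ in range(n_perm):
        perm = groups.copy()
        for s in np.unique(strata_arr):
            idx = np.flatnonzero(strata_arr == s)
            perm[idx] = rng.permutation(groups[idx])  # shuffle within stratum
        f_perm, _ = stats(perm)
        hits += int(f_perm >= f_obs)
    p_value = (hits + 1) / (n_perm + 1)
    return r2, f_obs, p_value
```

Under this formula the smallest attainable p-value is 1/(n_perm + 1), and stratifying changes only the null distribution, not R²: household labels are compared against reshufflings that keep every sample in its own city.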

Spouses share taxa at the species level
When the analysis was restricted to married couples within Family A (n=16, eight couples), shared household explained even more of the variance (R² = 0.591, p=0.001). Subtle variations in the relative abundance of phylotypes within the same genus between households were observable, even within the same city location. For [...]

To test whether our conclusions required using kinships estimated from genome-wide SNP data for individuals, or whether pedigree information was sufficient, we also repeated our analyses using pedigree kinships (see Methods). Using pedigree kinships resulted in a small but significant amount of variation in microbiome composition being attributable to host genetics via MDS axis 4 (R² = 0.016, p<0.01, Table 2).
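To make concrete what a pedigree kinship is, the sketch below computes expected kinship coefficients from a pedigree using the standard recursive definition. It is a generic illustration, not the procedure from the Methods (which are not part of this excerpt); the function name `pedigree_kinship`, the dictionary layout, and the toy pedigree in the usage note are hypothetical.

```python
from functools import lru_cache


def pedigree_kinship(parents):
    """Return a function phi(i, j) giving the expected kinship coefficient.

    `parents` maps each non-founder to a (parent, parent) tuple; founders
    are either absent from the dict or mapped to (None, None). Founders
    are assumed to be unrelated to one another.
    """
    depth_cache = {}

    def depth(i):
        # Longest path from i up to a founder; an ancestor always has a
        # strictly smaller depth than any of its descendants.
        if i is None:
            return -1
        if i not in depth_cache:
            p1, p2 = parents.get(i, (None, None))
            depth_cache[i] = 1 + max(depth(p1), depth(p2))
        return depth_cache[i]

    @lru_cache(maxsize=None)
    def phi(i, j):
        if i is None or j is None:
            return 0.0
        if i == j:
            p1, p2 = parents.get(i, (None, None))
            return 0.5 * (1.0 + phi(p1, p2))
        # Expand the deeper individual, which cannot be an ancestor of the other.
        if depth(i) < depth(j):
            i, j = j, i
        p1, p2 = parents.get(i, (None, None))
        return 0.5 * (phi(p1, j) + phi(p2, j))

    return phi
```

For a toy pedigree such as `{"P1": ("GF", "GM"), "P2": ("GF", "GM"), "C1": ("S1", "P1"), "C2": ("S2", "P2")}`, this gives 0.25 for the siblings P1 and P2, 0.0625 for the first cousins C1 and C2, and 0 for the spouses P1 and S1. These are expectations only: the genome actually shared by two relatives varies around the pedigree value, which is one reason SNP-based estimates can differ from pedigree-based ones.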

We have conducted, to our knowledge, the first simultaneous investigation of the role of environment and host genetics in shaping the human saliva microbiome in a cohort of closely-related individuals within a large Ashkenazi Jewish family. We found a weak correlation between host kinship and oral microbiome dissimilarity before taking shared household into account, and an apparent small but significant effect of genetics when using kinships based on the family pedigree as proxies for genetic similarity. However, when using kinship estimates based on genome-wide SNPs between individuals and simultaneously controlling for shared household with a permutational analysis of variance, we find no support for any clear effect of human genetics, suggesting that shared environment has a much larger effect than genetics and is the dominant factor affecting the oral microbiome (R² > 0.18). We also observe that shared household explains more variation for spousal pairs than for children, and [...]

Because all individuals in our main cohort were members of the same extended Ashkenazi Jewish family, the genetic variation in our dataset is therefore much lower than between individuals from a wider population. It is conceivable that host genetics between more distantly-related individuals may play a significant role in affecting oral microbiome composition. Furthermore, our results only looked at overall genetic similarity, assessed using community comparison metrics based on taxa abundances. [...]

In summary, our results incorporating a measure of genetic relatedness using SNPs demonstrate that the overall composition of the human oral microbiome in a large Ashkenazi Jewish family is largely influenced by shared environment rather than host genetics. An apparent significant effect of host genetics using pedigree-based estimates disappears when using genetic markers instead, which recommends caution in future microbiome research using pedigree relatedness as a proxy for host genetic similarity. Geographic structuring occurs to a greater extent at household level within cities than between cities on different continents. Living in the same household is associated with a more similar oral microbiome, and this effect persists after individuals have left the household. This is consistent with the long-term persistence of the oral microbiome composition established earlier in life due to shared upbringing.
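The host-genetics terms in the analyses above enter the models as MDS axes of a kinship matrix. This excerpt does not state how those axes were derived, so the following is only a sketch under stated assumptions: kinship is converted to a distance with the usual kernel-induced formula, and axes are obtained by classical (metric) MDS. The names `kinship_to_distance` and `mds_axes` are hypothetical.

```python
import numpy as np


def kinship_to_distance(kinship):
    """Assumed conversion: d_ij = sqrt(k_ii + k_jj - 2 * k_ij)."""
    k = np.asarray(kinship, dtype=float)
    diag = np.diag(k)
    d2 = diag[:, None] + diag[None, :] - 2.0 * k
    return np.sqrt(np.clip(d2, 0.0, None))  # clip tiny negative round-off


def mds_axes(dist, k=4):
    """Classical MDS: coordinates on the k leading axes of a distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ d2 @ centering  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(b)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale
```

Each column of the returned array could then be supplied as a continuous covariate alongside household, which is how an effect "via MDS axis 4" would arise in a model of this kind.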

Ethics. Ethical and research governance approval was provided by the National Research Ethics Service London Surrey Borders Committee and the UCL Research Ethics Committee. Written informed consent was provided by all participants.

We also resequenced a subset of samples without spikes to verify whether spikes affected our analyses and observed the same qualitative differences (Supplementary Figure 2), implying that the addition of spikes did not have a negative impact on downstream analysis. Paired-end reads were merged with fastq-mergepairs in VSEARCH v1.11.1 (32), discarding reads with an expected error >1. As the expected length of the V5-V7 region was 369 bases, we discarded sequences with <350 or [...]

[...] individuals who are currently living together (filled circles), those who had moved out of their childhood home (empty circles), and those for whom data was missing (faint circles). This clustering could be due to shared environment or also due to shared genetics, as is obvious from (b) the pedigree.

Table 2. Comparison of pedigree-based and genome-wide measures of kinship to take host genetics into account in a permutational analysis of variance (adonis) on oral microbiome dissimilarities of n=111 individuals. Using pedigree information to produce kinship results in a significant association with human genetics via the fourth MDS axis, which is not present using kinships calculated with LDAK based on genome-wide SNPs.
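The read-filtering step in the sequence-processing paragraph above relies on the expected number of errors in a read, E = Σ 10^(−Q/10) summed over its per-base Phred scores Q. The sketch below shows that calculation together with a simple length filter. It is not the VSEARCH implementation: the function names are hypothetical, Phred+33 quality encoding is assumed, and because the upper length bound is cut off in this excerpt, `max_len` is left unset rather than guessed.

```python
def expected_errors(quality, offset=33):
    """Expected number of errors in a read from its FASTQ quality string.

    Assumes Phred+33 encoding: Q = ord(char) - 33, error prob = 10^(-Q/10).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def keep_read(seq, quality, max_ee=1.0, min_len=350, max_len=None):
    """Keep a merged read if it is long enough and has expected error <= max_ee.

    min_len=350 and max_ee=1.0 follow the text; max_len is a placeholder
    because the upper bound is not recoverable from this excerpt.
    """
    if len(seq) < min_len:
        return False
    if max_len is not None and len(seq) > max_len:
        return False
    return expected_errors(quality) <= max_ee
```

A 369-base read with Q40 at every position has an expected error of about 0.037 and passes, whereas two Q2 bases alone contribute roughly 1.26 expected errors and cause the read to be discarded.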