Random forest classification statistics for the sewage bacterial community composition as a predictor of obesity levels in city populations

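| Data set^a | Classification, no. correct/total no. | Classification, no. correct/total no. | Accuracy (% correct) |
|---|---|---|---|
| All samples,^b quartiles^c | 44/54 | 45/54 | 82 |
| All samples, SD^d | 26/38 | 44/46 | 83 |
| Cities,^e quartiles | 16/21 | 18/21 | 81 |
| Cities, SD | 14/17 | 17/18 | 89 |
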
  • a City populations were classified as “lean” or “obese” based on the estimated percentage of obese people in each city.

  • b All samples for a city were included in the model and classified separately.

  • c Samples in the first (lean) and fourth (obese) quartiles of the distribution of city obesity percentages were included in the random forest classification model. A “lean city” versus “obese city” designation corresponds to populations with ≤22.8% obesity versus ≥30.4% obesity, respectively.

  • d Samples >1 standard deviation from the mean city obesity percentage were included in the random forest classification model. A city was considered lean at ≤21.5% obesity and obese at ≥31.3% obesity.

  • e Average bacterial community composition in all samples for a city.
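
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the names `TABLE_ROWS`, `CUTOFFS`, `overall_accuracy`, and `label_city` are hypothetical, the accuracy is assumed to be the pooled fraction of correct classifications across the two per-class counts in each row, and cities falling between the two cutoffs are assumed to be left unlabeled.

```python
from typing import Optional

# Counts copied from the table: one (correct, total) pair per class in each row.
TABLE_ROWS = {
    "All samples, quartiles": ((44, 54), (45, 54)),
    "All samples, SD": ((26, 38), (44, 46)),
    "Cities, quartiles": ((16, 21), (18, 21)),
    "Cities, SD": ((14, 17), (17, 18)),
}

# Cutoffs from footnotes c and d: percentage of a city's population that is obese.
# Each entry is (maximum for "lean", minimum for "obese").
CUTOFFS = {"quartiles": (22.8, 30.4), "SD": (21.5, 31.3)}


def overall_accuracy(class_counts) -> float:
    """Pooled accuracy in percent (assumption: correct / total over both classes)."""
    correct = sum(c for c, _ in class_counts)
    total = sum(t for _, t in class_counts)
    return 100.0 * correct / total


def label_city(obesity_pct: float, scheme: str) -> Optional[str]:
    """Label a city "lean" or "obese" using the footnote cutoffs for a scheme."""
    lean_max, obese_min = CUTOFFS[scheme]
    if obesity_pct <= lean_max:
        return "lean"
    if obesity_pct >= obese_min:
        return "obese"
    return None  # assumption: cities between the cutoffs get no label


for name, counts in TABLE_ROWS.items():
    print(f"{name}: {overall_accuracy(counts):.1f}%")
```

Under the pooled-accuracy assumption, rounding each result to a whole number reproduces the accuracy column (82, 83, 81, and 89).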