The emergence of high-throughput DNA sequencing methods provides unprecedented opportunities to further unravel bacterial biodiversity and its worldwide role in processes ranging from human health to ecosystem functioning. However, despite the abundance of sequencing studies, combining data from multiple individual studies to address macroecological questions of bacterial diversity remains methodologically challenging and plagued with biases. Here, using a machine-learning approach that accounts for differences among studies and complex interactions among taxa, we merge 30 independent bacterial data sets comprising 1,998 soil samples from 21 countries. Whereas previous meta-analysis efforts have focused on bacterial diversity measures or abundances of major taxa, we show that disparate amplicon sequence data can be combined at the taxonomy-based level to assess bacterial community structure. We find that rarer taxa are more important for structuring soil communities than abundant taxa, and that these rarer taxa are better predictors of community structure than environmental factors, which are often confounded across studies. We conclude that combining data from independent studies can be used to explore bacterial community dynamics, identify potential ‘indicator’ taxa with an important role in structuring communities, and propose hypotheses on previously overlooked factors that shape bacterial biogeography.
Bibliographical note
We thank all the people who contributed data and input to this study. This study was conducted at a workshop (May 2015, Manchester, UK) funded by the British Ecological Society’s special interest group Plants-Soils-Ecosystems and organized by F.T.d.V. and K.S.R. This study and participants were funded in part by ERC Advanced Grant 26055290 (K.S.R. and W.H.v.d.P.); BBSRC David Phillips Fellowship (BB/L02456X/1) (F.T.d.V.); ERC Grant Agreements 242658 (BIOCOM) and 647038 (BIODESERT) (F.T.M.); the European Regional Development Fund (Centre of Excellence EcolChange) (J.D.); Yorkshire Agricultural Society, Nafferton Ecological Farming Group, and the Northumbria University Research Development Fund (C.H.O.); BBSRC Training Grant (BB/K501943/1) (C.H.); Wallenberg Academy Fellowship (KAW 2012.0152), Formas (214-2011-788) and Vetenskapsrådet (612-2011-5444) (E.D.); the Glastir Monitoring & Evaluation Programme (contract reference: C147/2010/11) and the full support of the GMEP team on the Glastir project (D.L.J., S.C., and D.A.R.). Computing was facilitated by the University of Manchester Condor pool and the CLIMB infrastructure (http://www.climb.ac.uk).
- microbial ecology