There has been growing interest in the potential of ‘big data’ to enhance our understanding in medicine and public health. Although there is no agreed definition of big data, accepted critical components include greater volume, complexity, coverage and speed of availability. Many of these data are ‘found’ (as opposed to ‘made’), in that they have been collected for non-research purposes but may contain valuable information for research. The aim of this paper is to review the contribution of ‘found’ data to obesity research to date, and to describe the benefits and challenges encountered. A narrative review was conducted to identify and collate peer-reviewed research studies. Database searches conducted up to September 2017 found original studies using a variety of data types and sources. These included retail sales, transport, geospatial and commercial weight management data, as well as data from social media, smartphones and wearable technologies. The narrative review highlights the variety of data uses in the literature: describing the built environment, exploring social networks, estimating nutrient purchases and assessing the impact of interventions. The examples demonstrate four significant ways in which ‘found’ data can complement conventional ‘made’ data: firstly, in moving beyond constraints in scope (coverage, size and temporality); secondly, in providing objective, quantitative measures; thirdly, in reaching hard-to-access population groups; and lastly, in the potential for evaluating real-world interventions. Alongside these opportunities, ‘found’ data come with distinct challenges, such as ethical and legal questions around access and ownership; commercial sensitivities; costs; lack of control over data acquisition; validity; representativeness; finding appropriate comparators; and complexities of data processing, management and linkage. Despite widespread recognition of the opportunities, the impact of ‘found’ data on academic obesity research has been limited.
The merit of such data lies not in their novelty, but in the benefits they could add over and above, or in combination with, conventionally collected data.
The ESRC Strategic Network for Obesity was funded via Economic and Social Research Council grant number ES/N00941X/1. We would like to thank all of the network investigators (www.cdrc.ac.uk/research/obesity/investigators/) and members (www.cdrc.ac.uk/research/obesity/network-members/) for their participation in network meetings and discussion, which contributed to the development of this paper. Additional thanks are owed to Daniel Lewis for his insightful comments on the manuscript.