Improving Subsurface Characterisation with ‘Big Data’ Mining and Machine Learning

Rachel Brackenridge* (Corresponding Author), Vasily Demyanov, Oleg Vashutin, Ruslan Nigmatullin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



Large databases of legacy hydrocarbon reservoir and well data provide an opportunity to use modern data mining techniques to improve our understanding of the subsurface in the presence of uncertainty and to improve the predictability of reservoir properties. A data mining approach provides a way to screen dependencies in reservoir and fluid data and enables subsurface specialists to estimate absent properties in partial or incomplete datasets. This allows uncertainty to be managed and reduced. Machine learning improves reservoir characterisation because its methods can detect and model hidden dependencies in large multivariate datasets with noisy and missing data. This study presents a workflow applied to a large basin-scale reservoir characterisation database. The study aims to understand the dependencies between reservoir attributes so that predictions can be made to improve data coverage. The machine learning workflow comprises the following steps: (i) exploratory data analysis; (ii) detection of outliers and data partitioning into groups showing similar trends using clustering; (iii) identification of dependencies within reservoir data in multivariate feature space with self-organising maps; and (iv) feature selection using supervised learning to identify relevant properties to use for predictions where data are absent. This workflow provides an opportunity to reduce the cost and increase the accuracy of hydrocarbon exploration and production in mature basins.
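As a rough illustration of steps (ii) and (iv) above, the following minimal Python sketch screens outliers with z-scores, partitions records with a plain k-means, and ranks candidate predictors by absolute Pearson correlation with a target property. It is not the authors' implementation: the attribute names (`depth`, `porosity`, `noise`, `log_perm`), the synthetic data, the z-score threshold and the correlation-based ranking are all assumptions made for illustration, and the ranking is only a simple filter-style stand-in for supervised feature selection. Step (iii), the self-organising map, is omitted.

```python
import math
import random

FEATURES = ["depth", "porosity", "noise"]  # hypothetical attribute names
TARGET = "log_perm"                         # hypothetical property to predict

def zscores(values):
    """Standardise a list of numbers to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def drop_outliers(records, keys, limit=3.0):
    """Keep records whose z-score lies within +/- limit for every key."""
    cols = {k: zscores([r[k] for r in records]) for k in keys}
    return [r for i, r in enumerate(records)
            if all(abs(cols[k][i]) <= limit for k in keys)]

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre.
        labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                  for p in points]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    return sum(a * b for a, b in zip(zscores(x), zscores(y))) / len(x)

# Synthetic stand-in data: two reservoir families plus one injected outlier.
rng = random.Random(42)
records = []
for i in range(200):
    group = i % 2
    porosity = rng.gauss(0.25 - 0.10 * group, 0.02)
    records.append({
        "depth": rng.gauss(2000 + 1500 * group, 100),
        "porosity": porosity,
        "noise": rng.gauss(0, 1),  # irrelevant attribute
        "log_perm": 20 * porosity + rng.gauss(0, 0.2),
    })
records.append({"depth": 2000, "porosity": 0.9, "noise": 0.0, "log_perm": 2.0})

# Step (ii): outlier screening, then clustering on standardised attributes.
clean = drop_outliers(records, FEATURES + [TARGET])
points = list(zip(zscores([r["depth"] for r in clean]),
                  zscores([r["porosity"] for r in clean])))
labels = kmeans(points, k=2)

# Step (iv), simplified: rank predictors by |correlation| with the target.
target = [r[TARGET] for r in clean]
ranking = sorted(((f, abs(pearson([r[f] for r in clean], target)))
                  for f in FEATURES), key=lambda t: t[1], reverse=True)

print(len(records) - len(clean), "record(s) dropped as outliers")
for name, score in ranking:
    print(f"{name}: |r| = {score:.2f}")
```

On real data, the top-ranked attributes would be the candidates for a predictive model that fills in the target where it is absent. A correlation filter only captures linear, one-at-a-time dependencies, so it is a weaker tool than the multivariate methods the abstract describes.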
Original language: English
Article number: 1070
Number of pages: 23
Issue number: 3
Early online date: 31 Jan 2022
Publication status: Published - 31 Jan 2022

Bibliographical note

Funding: This research was supported by Wood Mackenzie through funding of a Postdoctoral Research Associate position at Heriot-Watt University, and through access to data from two basins.
Acknowledgments: This work was supported by Wood Mackenzie through funding of a research collaboration with Heriot-Watt University. All the data were anonymised and supplied by Wood Mackenzie, and the authors are thankful for the opportunity to publish the outcomes of this research. The authors also thank Mikhail Kanevski of the University of Lausanne for the peer exchange on feature selection and the opportunities opened during his course on hands-on machine learning applications. The authors acknowledge the use of Orange Data Mining [27] and ML Office for the SOM application [30]. We thank Susan Agar, who reviewed the paper most comprehensively and helped improve it, along with two anonymous reviewers.

Data Availability Statement

The data used in this study are held by Wood Mackenzie.


  • reservoir
  • subsurface characterisation
  • big data
  • unsupervised learning
  • supervised learning
  • multivariant analysis
  • machine learning
  • hydrocarbon exploration


