Machine-learning algorithm for estimating oil-recovery factor using a combination of engineering and stratigraphic dependent parameters

Kachalla Aliyuda*, John Howell

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)


The methods used to estimate recovery factor change through the life cycle of a field. During appraisal, prior to development when there are no production data, we typically rely on analog fields and empirical methods. Given the absence of a perfect analog, these methods are typically associated with a wide range of uncertainty. During plateau, recovery factors are typically associated with simulation and dynamic modeling, whereas in later field life, once the field drops off the plateau, a decline curve analysis is also used. The use of different methods during different stages of the field life leads to uncertainty and potential inconsistencies in recovery estimates. A wide range of interacting, partially related, reservoir and production variables controls the production and recovery factor. Machine learning allows more complex multivariate analysis that can be used to investigate the roles of these variables using a training data set and then to ultimately predict future performance in fields. To investigate this approach, we used a data set consisting of producing reservoirs all of which are at plateau or in decline to train a series of machine-learning algorithms that can potentially predict the recovery factor with minimal percentage error. The database for this study consists of categorical and numerical properties for 93 reservoirs from the Norwegian Continental Shelf. Of these, 75 are from the Norwegian Sea, the Norwegian North Sea, and the Barents Sea, whereas the remaining 18 reservoirs are from the Viking Graben in the UK sector of the North Sea. The data set was divided into training and testing sets: The training set comprised approximately 80% of the total data, and the remaining 20% was the testing set. 
Linear regression models and support vector machine (SVM) models were trained first with all 30 parameters in the data set and then with the 16 most influential parameters; the performance of these models was compared using the results of fivefold cross-validation. The SVM model with a Gaussian kernel function, trained on a combination of 16 geologic and engineering parameters, has a root-mean-square error of 0.12, a mean square error of 0.01, and an R-squared of 0.76. This model was tested on 18 reservoirs from the testing set; the test results are very similar to the cross-validation results from the model training phase, suggesting that this method can potentially be used to predict the future recovery factor.
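
The evaluation protocol described in the abstract (a hold-out test set, fivefold cross-validation on the training set, and the three reported error metrics) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the single predictor is hypothetical, and a one-variable least-squares line stands in for the linear regression and Gaussian-kernel SVM models of the study, whose actual features and settings are not given here. The helper names (mse, r_squared, fit_line, k_fold_indices) are likewise made up for this sketch.

```python
import math
import random


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in model only)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Synthetic stand-in data: 93 "reservoirs", one hypothetical predictor.
rng = random.Random(0)
n_total, n_test = 93, 18  # 18 test reservoirs, roughly 20% of 93
x = [rng.uniform(0.05, 0.35) for _ in range(n_total)]
y = [0.2 + 1.0 * xi + rng.gauss(0, 0.05) for xi in x]  # synthetic recovery factor

# Shuffle once, then hold out the test set.
order = list(range(n_total))
rng.shuffle(order)
test_idx, train_idx = order[:n_test], order[n_test:]
x_train = [x[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
x_test = [x[i] for i in test_idx]
y_test = [y[i] for i in test_idx]

# Fivefold cross-validation on the training set only.
folds = k_fold_indices(len(x_train), 5)
cv_mse = []
for fold in folds:
    held_out = set(fold)
    fit_x = [x_train[i] for i in range(len(x_train)) if i not in held_out]
    fit_y = [y_train[i] for i in range(len(y_train)) if i not in held_out]
    a, b = fit_line(fit_x, fit_y)
    preds = [a + b * x_train[i] for i in fold]
    cv_mse.append(mse([y_train[i] for i in fold], preds))
mean_cv_mse = sum(cv_mse) / len(cv_mse)
cv_rmse = math.sqrt(mean_cv_mse)

# Refit on the full training set and score the held-out test set.
a, b = fit_line(x_train, y_train)
test_pred = [a + b * xi for xi in x_test]
test_mse = mse(y_test, test_pred)
test_rmse = math.sqrt(test_mse)
test_r2 = r_squared(y_test, test_pred)

print(f"CV   RMSE={cv_rmse:.3f}  MSE={mean_cv_mse:.4f}")
print(f"Test RMSE={test_rmse:.3f}  MSE={test_mse:.4f}  R2={test_r2:.2f}")
```

Comparing the cross-validation metrics with the test-set metrics mirrors the check reported above: similar values suggest the model generalizes beyond the training data. The reported root-mean-square error and mean square error are also mutually consistent, since 0.12 squared is about 0.014, which rounds to 0.01.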

Original language: English
Pages (from-to): SE151-SE159
Number of pages: 10
Issue number: 3
Early online date: 16 Jul 2019
Publication status: Published - 1 Aug 2019

Bibliographical note

Funding Information:
This work has been supported by SAFARI III and the Petroleum Technology Development Fund; we are grateful to them for providing funds for this project. We also thank the anonymous reviewers, whose comments have greatly improved the manuscript.

Publisher Copyright:
© 2019 Society of Exploration Geophysicists and American Association of Petroleum Geologists.


  • algorithm
  • artificial intelligence
  • stratigraphy

