Using national electronic health records for pandemic preparedness: validation of a parsimonious model for predicting excess deaths among those with COVID-19: a data-driven retrospective cohort study

Mehrdad A Mizani, Ashkan Dashtban, Laura Pasea, Alvina G Lai, Johan Thygesen, Chris Tomlinson, Alex Handy, Jil B Mamza, Tamsin Morris, Sara Khalid, Francesco Zaccardi, Mary Joan Macleod, Fatemeh Torabi, Dexter Canoy, Ashley Akbari, Colin Berry, Thomas Bolton, John Nolan, Kamlesh Khunti, Spiros Denaxas, Harry Hemingway, Cathie Sudlow, Amitava Banerjee* (Corresponding Author), CVD-COVID-UK Consortium

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


OBJECTIVES: To use national, pre- and post-pandemic electronic health records (EHR) to develop and validate a scenario-based model incorporating baseline mortality risk, infection rate (IR) and relative risk (RR) of death for prediction of excess deaths.
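A minimal sketch of how a scenario-based calculation of this kind might be written. The abstract does not give the formula, so the multiplicative form used here (population × baseline risk × IR × (RR − 1)) is an assumption, the function and parameter names are illustrative, and the inputs in the usage line are hypothetical rather than study values.

```python
def predicted_excess_deaths(population, baseline_risk, infection_rate, relative_risk):
    """Scenario-based excess deaths under an ASSUMED multiplicative model.

    population      -- number of people in the cohort
    baseline_risk   -- one-year pre-pandemic mortality risk (proportion, 0..1)
    infection_rate  -- one-year infection rate, IR (proportion, 0..1)
    relative_risk   -- RR of death for infected vs. baseline (ratio, >= 0)
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a proportion between 0 and 1")
    if not 0.0 <= infection_rate <= 1.0:
        raise ValueError("infection_rate must be a proportion between 0 and 1")
    if relative_risk < 0.0:
        raise ValueError("relative_risk must be non-negative")
    # Deaths expected among the infected at baseline, scaled by the
    # excess part of the relative risk (RR - 1).
    return population * baseline_risk * infection_rate * (relative_risk - 1.0)


# Hypothetical scenario, not taken from the study.
example = predicted_excess_deaths(
    population=1_000_000, baseline_risk=0.01, infection_rate=0.10, relative_risk=2.0
)
```

Varying `infection_rate` and `relative_risk` over a grid would give the kind of scenario table such a model is intended to support.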

DESIGN: An EHR-based, retrospective cohort study.

SETTING: Linked EHR in Clinical Practice Research Datalink (CPRD); and linked EHR and COVID-19 data in England provided in NHS Digital Trusted Research Environment (TRE).

PARTICIPANTS: In the development (CPRD) and validation (TRE) cohorts, we included 3.8 million and 35.1 million individuals aged ≥30 years, respectively.

MAIN OUTCOME MEASURES: One-year all-cause excess deaths related to COVID-19 from March 2020 to March 2021.

RESULTS: From 1 March 2020 to 1 March 2021, there were 127,020 observed excess deaths. Observed RR was 4.34% (95% CI, 4.31-4.38) and IR was 6.27% (95% CI, 6.26-6.28). In the validation cohort, predicted one-year excess deaths were 100,338 compared with the observed 127,020 deaths with a ratio of predicted to observed excess deaths of 0.79.
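The reported ratio can be checked directly from the two counts given above. The short sketch below only reproduces that arithmetic; the variable names are illustrative.

```python
# Counts reported in the RESULTS paragraph for the validation cohort.
predicted = 100_338
observed = 127_020

# Ratio of predicted to observed excess deaths, and the absolute shortfall.
ratio = predicted / observed
shortfall = observed - predicted

print(f"predicted/observed = {ratio:.2f}")
print(f"under-prediction   = {shortfall} deaths")
```

A ratio below 1 means the model under-predicted excess deaths in the validation cohort.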

CONCLUSIONS: We show that a simple, parsimonious model incorporating baseline mortality risk, one-year IR and RR of the pandemic can be used for scenario-based prediction of excess deaths in the early stages of a pandemic. Our analyses show that EHR could inform pandemic planning and surveillance, despite limited use in emergency preparedness to date. Although infection dynamics are important in the prediction of mortality, future models should take greater account of underlying conditions.

Original language: English
Pages (from-to): 10-20
Number of pages: 11
Journal: Journal of the Royal Society of Medicine
Issue number: 1
Early online date: 14 Nov 2022
Publication status: Published - Jan 2023

Bibliographical note

This work is carried out with the support of the BHF Data Science Centre led by HDR UK (BHF Grant no. SP/19/3/34678) and makes use of de-identified data held in NHS Digital’s TRE for England, made available via the BHF Data Science Centre’s CVD-COVID-UK/COVID-IMPACT consortium. This work uses data provided by patients and collected by the NHS as part of their care and support. We would also like to acknowledge all data providers who make health relevant data available for research.

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The British Heart Foundation Data Science Centre (grant no. SP/19/3/34678, awarded to Health Data Research (HDR) UK) funded co-development (with NHS Digital) of the TRE, provision of linked datasets, data access, user software licences, computational usage, and data management and wrangling support, with additional contributions from the HDR UK data and connectivity component of the UK Government Chief Scientific Adviser’s National Core Studies programme to coordinate national COVID-19 priority research. Consortium partner organisations funded the time of contributing data analysts, biostatisticians, epidemiologists and clinicians. AB, MAM, MHD and LP were supported by research funding from AstraZeneca. AB has received funding from the National Institute for Health Research (NIHR), British Medical Association and UK Research and Innovation. AB, SD and HH are part of the BigData@Heart Consortium, funded by the Innovative Medicines Initiative-2 Joint Undertaking under grant agreement No 116074. KK is supported by the National Institute for Health Research (NIHR) Applied Research Collaboration East Midlands (ARC-EM) and NIHR Lifestyle BRC.

Data Availability Statement

Supplemental material for this article is available online: sj-pdf-1-jrs-10.1177_01410768221131897


  • clinical
  • epidemiology
  • health informatics
  • infectious diseases
  • public health

