Identifying Long Covid Using Electronic Health Records: A National Observational Cohort Study in Scotland

Karen Jeffrey, Lana Woolford, Rishma Maini, Siddharth Basetti, Ashleigh Batchelor, David Weatherill, Chris White, Vicky Hammersley, Tristan Millington, Calum Macdonald, Jennifer K Quint, Robin Kerr, Steven Kerr, Syed Ahmar Shah, Adeniyi Fagbamigbe, Colin Simpson, Srinivasa Vittal Katikireddi, Chris Robertson, Lewis D Ritchie, Aziz Sheikh, Luke Daines

Research output: Working paper › Preprint

Abstract

Background: Long COVID is a debilitating multisystem condition. To estimate prevalence and identify risk factors, we analysed routinely collected data from almost the entire adult population of Scotland.

Methods: A cohort of adults (≥18 years) resident in Scotland between March 1, 2020, and October 20, 2022, was created by linking data from primary care, secondary care, laboratory testing and prescribing. Four outcome measures were used to identify long COVID: clinical codes, free text in primary care records, free text on sick notes, and a novel operational definition. The operational definition was developed using Poisson regression to identify clinical encounters indicative of long COVID from a sample of negative and positive COVID-19 cases matched on time-varying propensity to test positive for SARS-CoV-2.

Findings: Of 5,104,198 participants, 90,712 (1·8%) were identified as having long COVID by one or more outcome measures. Clinical codes were rarely recorded (n=1,092, 0·02%). More people were identified using free text (n=8,368, 0·2%), sick notes (n=14,471, 0·3%) and the operational definition (n=73,767, 1·4%). Compared with the general population, a higher proportion of people with long COVID were female, middle-aged, overweight/obese, had at least two comorbidities, were immunosuppressed, shielding, or hospitalised within 28 days of testing positive, and had tested positive before Omicron became the dominant variant.
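The percentages reported in the Findings can be reproduced from the stated counts. The following is a minimal sketch, not the authors' analysis code: the variable and function names (`TOTAL`, `COUNTS`, `pct`, `overlap_excess`) are illustrative assumptions, and only the counts are taken from the abstract.

```python
# Counts as stated in the Findings paragraph; names are illustrative only.
TOTAL = 5_104_198          # cohort size
ANY_MEASURE = 90_712       # identified by one or more outcome measures
COUNTS = {
    "clinical codes": 1_092,
    "free text": 8_368,
    "sick notes": 14_471,
    "operational definition": 73_767,
}


def pct(n: int, total: int = TOTAL) -> float:
    """Return n as a percentage of total."""
    return 100 * n / total


# Percentage of the cohort identified by each measure.
summary = {name: pct(n) for name, n in COUNTS.items()}

# The four counts sum to more than the "one or more" total, so some
# people must have been identified by more than one measure.
overlap_excess = sum(COUNTS.values()) - ANY_MEASURE

if __name__ == "__main__":
    print(f"any measure: {pct(ANY_MEASURE):.1f}%")
    for name, value in summary.items():
        print(f"{name}: {value:.2f}%")
    print(f"identifications beyond the union: {overlap_excess}")
```

The positive difference between the summed counts and the "one or more" total shows that the four measures overlap. The abstract does not report how that overlap is distributed across pairs of measures.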

Interpretation: The prevalence of long COVID presenting in general practice was estimated to be between 0·02% and 1·8%, depending on the measure used. Of the four outcome measures used, clinical codes identified the fewest cases. With limited use of long COVID clinical codes, we consider free text analysis to be the most promising approach should future surveillance of long COVID at a national level be required.

Funding: Chief Scientist Office (Scotland) and Medical Research Council.

Declaration of Interest: AS reports grants from HDR UK, NIHR, MRC, and ICSF during the conduct of the study, and is a member of the Scottish Government's CMO COVID-19 Advisory Group and Standing Committee on Pandemics. CR reports support from PHS and MRC. CS reports grants from MBIE (New Zealand), Ministry of Health (New Zealand), and HRC (New Zealand). JKQ reports grants from MRC, HDR UK, GlaxoSmithKline, BI, Asthma+Lung UK, and AstraZeneca, and consulting fees from GlaxoSmithKline, Evidera, AstraZeneca, and Insmed. All other authors declare no competing interests.

Ethical Approval: The EAVE II study obtained approvals from the West of Scotland Research Ethics Committee (reference: 22/WS/0071), and the Public Benefit and Privacy Panel for Health and Social Care (reference: 1920-0279).
Original language: English
Publisher: SSRN
Number of pages: 24
Publication status: Published - 7 Mar 2023

Keywords

  • Long COVID
  • population surveillance
  • primary health care
  • clinical coding
  • matched-pair analysis
