Deriving and validating an asthma diagnosis prediction model for children and young people in primary care

Luke Daines * (Corresponding Author), LJ Bonnett , Holly Tibble, Andrew Boyd, R Thomas, David Price, Stephen Turner, Steff C Lewis, Aziz Sheikh, Hilary Pinnock

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



Introduction: Accurately diagnosing asthma can be challenging. We aimed to derive and validate a prediction model to help primary care clinicians assess the probability of an asthma diagnosis in children and young people.
Methods: The derivation dataset was created from the Avon Longitudinal Study of Parents and Children (ALSPAC) linked to electronic health records. Participants with at least three inhaled corticosteroid prescriptions within 12 months and a coded asthma diagnosis were designated as having asthma. Demographics, symptoms, past medical/family history, exposures, investigations, and prescriptions were considered as candidate predictors. Candidate predictors were included if data were available for ≥60% of participants. Multiple imputation was used to handle the remaining missing data. The prediction model was derived using logistic regression. Internal validation was completed using bootstrap re-sampling. External validation was conducted using health records from the Optimum Patient Care Research Database (OPCRD).
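The case definition and the predictor availability rule described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the rolling 365-day reading of "within 12 months", and the use of `None` to mark a missing value are all assumptions made for illustration.

```python
from datetime import date


def has_three_ics_in_12_months(ics_dates, window_days=365):
    """True if any three prescriptions fall within `window_days` days.

    Assumption: "12 months" is read as a rolling 365-day window.
    """
    dates = sorted(ics_dates)
    # Compare each prescription date with the one two positions later.
    return any((dates[i + 2] - dates[i]).days <= window_days
               for i in range(len(dates) - 2))


def is_asthma_case(ics_dates, has_asthma_code):
    """Both criteria must hold: a coded diagnosis and >=3 ICS prescriptions."""
    return has_asthma_code and has_three_ics_in_12_months(ics_dates)


def eligible_predictors(rows, min_available=0.60):
    """Keep predictors recorded (not None) for at least 60% of participants.

    `rows` is a hypothetical list of dicts, one per participant.
    """
    names = sorted({name for row in rows for name in row})
    n = len(rows)
    return [name for name in names
            if sum(row.get(name) is not None for row in rows) / n >= min_available]
```

Predictors that pass the availability rule may still have gaps. In the study those were handled by multiple imputation, which this sketch does not attempt.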
Results: Predictors included in the final model were wheeze, cough, breathlessness, hay-fever, eczema, food allergy, social class, maternal asthma, childhood exposure to cigarette smoke, prescription of a short-acting beta agonist and the past recording of lung function/reversibility testing. In the derivation dataset, which comprised 11,972 participants aged <25 years (49% female, 8% asthma), model performance as indicated by the C-statistic and calibration slope was 0.86 (95% confidence interval (CI) 0.85–0.87) and 1.00 (95% CI 0.95–1.05), respectively. In the external validation dataset, which included 2,670 participants aged <25 years (50% female, 10% asthma), the C-statistic was 0.85 (95% CI 0.83–0.88) and the calibration slope was 1.22 (95% CI 1.09–1.35).
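For readers unfamiliar with the two performance measures reported in the Results, the following Python sketch shows how each is conventionally computed from observed outcomes `y` (0/1) and predicted probabilities `p`. It is an illustration only, not the study's analysis code: the function names are hypothetical and the calibration slope is fitted with a hand-written Newton-Raphson loop rather than a statistics package.

```python
import math


def c_statistic(y, p):
    """Share of case/non-case pairs in which the case has the higher
    predicted probability; tied pairs count as one half."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = sum(1.0 if c > n else 0.5 if c == n else 0.0
                     for c in cases for n in controls)
    return concordant / (len(cases) * len(controls))


def calibration_slope(y, p, iterations=25):
    """Slope b in logit(P(y=1)) = a + b * logit(p), by Newton-Raphson."""
    lp = [math.log(pi / (1 - pi)) for pi in p]  # linear predictor
    a, b = 0.0, 1.0
    for _ in range(iterations):
        mu = [1 / (1 + math.exp(-(a + b * x))) for x in lp]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - m for yi, m in zip(y, mu))
        g1 = sum((yi - m) * x for yi, m, x in zip(y, mu, lp))
        # Information matrix X'WX for the two parameters.
        w = [m * (1 - m) for m in mu]
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, lp))
        h11 = sum(wi * x * x for wi, x in zip(w, lp))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b
```

A slope of 1 indicates that the spread of predicted probabilities matches the observed risks. A slope above 1, as in the external validation dataset, is usually read as predictions that are not extreme enough, and a slope below 1 as predictions that are too extreme.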
Conclusions: We derived and validated a prediction model for clinicians to calculate the probability of an asthma diagnosis for a child or young person up to 25 years of age presenting to primary care. Following further evaluation of clinical effectiveness, the prediction model could be implemented as decision support software.
Original language: English
Article number: 195
Number of pages: 20
Journal: Wellcome Open Research
Issue number: 195
Early online date: 7 Sept 2023
Publication status: Published - 7 Sept 2023

Bibliographical note

Grant information: This work was supported by Wellcome [086118], the UK Medical Research Council and the University of Bristol, who provided core support for ALSPAC. LD was supported by a clinical academic fellowship from the Chief Scientist Office, Edinburgh (CAF/17/01). The ALSPAC data linkage programme and authors AB and RT were supported by the Medical Research Council (MR/L012081) and the Wellcome Trust [086118]. A comprehensive list of grant funding is available on the ALSPAC website. Neither funder(s) nor sponsor (University of Edinburgh) contributed to the manuscript.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability Statement

Underlying data

ALSPAC. ALSPAC data access is through a system of managed open access. The steps below highlight how to apply for access to the data referred to in this article and all other ALSPAC data. The datasets presented in this article are linked to ALSPAC project number B2830, please quote this project number during your application. The ALSPAC variable codes highlighted in the dataset descriptions can be used to specify required variables.

1. Please read the ALSPAC access policy, which describes the process of accessing the data and samples in detail, and outlines the costs associated with doing so.

2. You may also find it useful to browse our fully searchable research proposals database, which lists all research projects that have been approved since April 2011.

3. Please submit your research proposal for consideration by the ALSPAC Executive Committee. You will receive a response within 10 working days to advise you whether your proposal has been approved. If you have any questions about accessing ALSPAC data, please contact ALSPAC by email. The study website also contains details of all the data that is available through a fully searchable data dictionary.

Extended data

Open Science Framework: Extended data for ‘Clinical prediction model for the diagnosis of asthma in children and young people in primary care’.

This project contains the following extended data:

AsthmaSpecific_ReadcodeList.txt (Asthma-specific read codes.)

LungFunctionAndReversibility_ReadCodeList.txt (Lung function/reversibility testing read codes.)


  • Asthma
  • Diagnosis
  • Primary Care
  • Children and Young People
  • Prediction Model
  • Electronic Health Records

