Handling Class Imbalance in Machine Learning-based Prediction Models: A Case Study in Asthma Management

Arif Budiarto* (Corresponding Author), Andrew Wilson, David Price, Syed Ahmar Shah, Aziz Sheikh

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

A data-driven prediction tool has the potential to provide early warning of an asthma attack and improve asthma management and outcomes. Most previous machine learning (ML)-based studies for asthma attack prediction have reported a severe class imbalance, with major implications for model performance. We aimed to undertake a systematic comparison of several class imbalance handling techniques in the context of risk prediction models for asthma prognosis. We used data from 9,835 asthma patients extracted from the Medical Information Mart for Intensive Care (MIMIC) IV database and deployed five class imbalance handling methods based on synthetic minority oversampling technique (SMOTE) and cost function customisation. We then compared their performances in improving two-class classifier models developed using logistic regression (LR) and extreme gradient boosting (XGBoost) for three different prediction tasks with varying severity of class imbalance (proportion of majority class ranging from 90.86% to 98.98%). The cost function customisation technique substantially outperformed the SMOTE-based methods in all tasks. XGBoost combined with cost function customisation achieved the highest prediction performance for the outcome with the most extreme class imbalance ratio (AUC = 0.72). Our findings suggest that the cost function customisation-based approach to tackle class imbalance provides substantially better performance than oversampling in the context of asthma management.

Clinical Relevance: This study underscores the challenge of class imbalance in the context of prediction tools to improve asthma management and outcomes and provides a methodological solution that addresses the challenge. Accurate asthma prediction tools can provide early warning and potentially prevent deterioration, thereby improving the quality of life of patients with asthma.
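The abstract does not say how the cost function was customised. As a minimal sketch of the general idea, the Python below assumes one common choice: scale the loss on minority-class (positive) samples by the ratio of majority to minority cases. The function names `positive_class_weight` and `weighted_log_loss` are hypothetical and are not taken from the paper.

```python
import math


def positive_class_weight(majority_proportion: float) -> float:
    """Majority-to-minority ratio, a common (assumed) choice of minority-class weight."""
    return majority_proportion / (1.0 - majority_proportion)


def weighted_log_loss(y_true, y_prob, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with errors on positive samples scaled by pos_weight."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)


# Majority-class proportions quoted in the abstract.
for prop in (0.9086, 0.9898):
    weight = positive_class_weight(prop)
    print(f"majority {prop:.2%}: positive-class weight of about {weight:.1f}")

# Toy data (illustrative only): 9 negatives, 1 positive, and a model that
# predicts a low risk of 0.1 for everyone.
y_true = [0] * 9 + [1]
y_prob = [0.1] * 10
plain = weighted_log_loss(y_true, y_prob)
weighted = weighted_log_loss(y_true, y_prob, pos_weight=9.0)
print(f"unweighted loss {plain:.3f}, weighted loss {weighted:.3f}")
```

With the weight applied, a model that assigns low risk to every patient is penalised far more for the missed positive case, which pushes training away from the trivial "always predict the majority class" solution. Library implementations usually expose this as a parameter, for example `class_weight` in scikit-learn's `LogisticRegression` and `scale_pos_weight` in XGBoost. Whether the study used these settings is not stated in the abstract.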
Original language: English
Title of host publication: 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
Publisher: IEEE Explore
Number of pages: 5
Publication status: Published - 11 Dec 2023
Event: 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Society - International Convention Centre (ICC), Sydney, Australia
Duration: 24 Jul 2023 to 27 Nov 2023
Conference number: 45
https://embc.embs.org/2023/

Conference

Conference: 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Abbreviated title: IEEE EMBS
Country/Territory: Australia
City: Sydney
Period: 24/07/23 to 27/11/23
