Predicting amyotrophic lateral sclerosis mortality with machine learning in diverse patient databases

Ling Guo, Ian Qian Xu, Sonakshi Nag, Jing Xu, Josiah Yui Huei Chai, Zachary Simmons, Savitha Ramasamy* (Corresponding Author), Crystal Yeo* (Corresponding Author)

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Introduction
Predicting mortality in amyotrophic lateral sclerosis (ALS) guides personalized care and clinical trial optimization. Existing statistical and machine learning models often rely on data from the baseline or diagnosis visit, assume fixed predictor-survival relationships, lack validation in non-Western populations, and depend on features, such as genetic tests and imaging, that are not routinely available. This study developed ALS mortality prediction models that address these limitations.

Methods
We trained Royston-Parmar and eXtreme Gradient Boosting models on the PRO-ACT database for 6- and 12-month mortality predictions. Patients were assumed to be alive up to their last recorded visit. Each visit was labeled positive (for death) if death occurred within the 6- or 12-month horizon, negative if survival beyond that horizon was confirmed, and excluded if follow-up was too short to determine either. Models were validated on independent datasets from the North American Celecoxib trial and a Singapore ALS clinic population. We also evaluated feature importance and the impact on performance of reducing the number of predictors.
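
The visit-labeling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `label_visit`, the use of integer days since study entry, the horizons of 183 and 365 days for 6 and 12 months, and the inclusive treatment of a death falling exactly on the horizon are all hypothetical choices that the abstract does not specify.

```python
from typing import Optional

# Assumed day counts for the two horizons; the abstract only says "6 or 12 months".
HORIZON_6M_DAYS = 183
HORIZON_12M_DAYS = 365


def label_visit(
    visit_day: int,
    death_day: Optional[int],
    last_visit_day: int,
    horizon_days: int,
) -> Optional[int]:
    """Label one visit for a fixed prediction horizon.

    Returns 1 (death within the horizon), 0 (survival confirmed beyond
    the horizon), or None (visit excluded: follow-up too short).
    All day arguments are integer days since study entry (an assumption).
    """
    # Positive: a recorded death falls inside the horizon after this visit.
    if death_day is not None and 0 <= death_day - visit_day <= horizon_days:
        return 1

    # The patient is treated as alive up to the last recorded visit,
    # or up to the death date when death occurred after the horizon.
    known_alive_until = death_day if death_day is not None else last_visit_day

    # Negative: survival is confirmed beyond the horizon.
    if known_alive_until - visit_day > horizon_days:
        return 0

    # Otherwise follow-up is insufficient and the visit is excluded.
    return None


if __name__ == "__main__":
    # Death on day 100, visit on day 0: positive at the 6-month horizon.
    print(label_visit(0, 100, 90, HORIZON_6M_DAYS))
    # No recorded death, last visit on day 100: excluded at 6 months.
    print(label_visit(0, None, 100, HORIZON_6M_DAYS))
```

Because every visit is labeled independently, a single patient can contribute several training rows, which is consistent with the abstract's description of predicting from any clinical visit.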

Results
Models predicted mortality from any clinical visit with area under the curve (AUC) of 0.768–0.819, rising to 0.865 for 12-month prediction using 3-month windows. Albumin was the top predictor, reflecting nutritional and inflammatory status. Other key predictors included ALS Functional Rating Scale-Revised slope, limb onset, absolute basophil count, forced vital capacity, bicarbonate, body mass index, and respiratory rate. Models maintained robust performance on the independent datasets and after reducing inputs to seven key predictors.
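
For readers unfamiliar with the metric, the AUC reported above can be read as the probability that a randomly chosen positive visit receives a higher predicted risk than a randomly chosen negative visit, with ties counted as one half. The following sketch computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive visit (death within the horizon), 0 for negative.
    scores: predicted risk for each visit; higher means higher risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example (not study data): 3 of the 4 positive/negative pairs
    # are ranked correctly.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75
```

The pairwise form is quadratic in the number of visits, so it is suitable for illustration only; a rank-based implementation from a standard library would be the usual choice for datasets of realistic size.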

Discussion
These visit-agnostic models, validated across diverse populations, identify key prognostic features and demonstrate the potential of predictive modeling to enhance ALS care and trial design.
Original language: English
Pages (from-to): 653-661
Number of pages: 9
Journal: Muscle & Nerve
Volume: 72
Issue number: 4
Early online date: 28 Jul 2025
Publication status: Published - Oct 2025

Bibliographical note

Open Access via the Wiley agreement

We would like to thank Dr. James Berry, Massachusetts General Hospital, and the NEALS consortium for providing the Celecoxib trial database and all contributors to the PRO-ACT database. We would also like to thank all the patients who contributed to the PRO-ACT, Celecoxib, and SG ALS databases.

Data Availability Statement

The PRO-ACT data is publicly available at https://ncri1.partners.org/proact. The Celecoxib trial dataset is provided through DTUA2021A009305 with Massachusetts General Hospital/Harvard Medical School and the NEALS consortium. The SG ALS dataset and analysis code are available from the corresponding authors upon reasonable request.

Keywords

  • amyotrophic lateral sclerosis
  • diverse external validation
  • machine learning
  • mortality prediction
  • survival analysis
