Abstract
Introduction
Predicting mortality in Amyotrophic Lateral Sclerosis (ALS) guides personalized care and clinical trial optimization. Existing statistical and machine learning models often rely on baseline or diagnosis visit data, assume fixed predictor-survival relationships, lack validation in non-Western populations, and depend on features like genetic tests and imaging not routinely available. This study developed ALS mortality prediction models that address these limitations.
Methods
We trained Royston-Parmar and eXtreme Gradient Boosting models on the PRO-ACT database for 6- and 12-month mortality predictions. Each visit was labeled positive (for death) if death occurred within 6 or 12 months, negative if survival was confirmed beyond that, and excluded if follow-up was insufficient, assuming patients were alive up to their last recorded visit. Models were validated on independent datasets from the North American Celecoxib trial and a Singapore ALS clinic population. Feature importance and the impact of reducing predictors on performance were evaluated.
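The per-visit labeling rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `label_visit`, its parameters, the use of months as the time unit, and the choice to count a death exactly at the horizon boundary as positive are all assumptions made for the example.

```python
from typing import Optional


def label_visit(
    visit_month: float,
    last_visit_month: float,
    death_month: Optional[float],
    horizon: float,
) -> Optional[int]:
    """Label one visit for a fixed prediction horizon (hypothetical helper).

    Returns 1 (death within the horizon), 0 (survival confirmed beyond
    the horizon), or None (excluded for insufficient follow-up).
    """
    cutoff = visit_month + horizon

    if death_month is not None:
        # Death date known: positive if it falls within the horizon
        # (boundary treated as inclusive here, an assumption),
        # otherwise survival beyond the horizon is confirmed.
        return 1 if death_month <= cutoff else 0

    # No recorded death: the patient is assumed alive up to the last
    # recorded visit, so survival is confirmed only if that visit
    # falls at or after the cutoff.
    if last_visit_month >= cutoff:
        return 0

    # Follow-up ends before the cutoff: the outcome is unknown.
    return None


# Toy values for illustration only (not study data).
labels = [
    label_visit(0, 10, 4, 6),      # died at month 4, 6-month horizon
    label_visit(0, 10, 14, 12),    # died at month 14, 12-month horizon
    label_visit(0, 7, None, 6),    # alive at month 7, 6-month horizon
    label_visit(0, 3, None, 6),    # last seen at month 3, 6-month horizon
]
```

Applying the same function with `horizon=6` and `horizon=12` to every visit would give the two label sets, with `None` rows dropped before training.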
Results
Models predicted mortality from any clinical visit with area under the curve (AUC) of 0.768–0.819, rising to 0.865 for 12-month prediction using 3-month windows. Albumin was the top predictor, reflecting nutritional and inflammatory status. Other key predictors included ALS Functional Rating Scale-Revised slope, limb onset, absolute basophil count, forced vital capacity, bicarbonate, body mass index, and respiratory rate. Models maintained robust performance on the independent datasets and after reducing inputs to seven key predictors.
Discussion
These visit-agnostic models, validated across diverse populations, identify key prognostic features and demonstrate the potential of predictive modeling to enhance ALS care and trial design.
| Original language | English |
|---|---|
| Pages (from-to) | 653-661 |
| Number of pages | 9 |
| Journal | Muscle & nerve |
| Volume | 72 |
| Issue number | 4 |
| Early online date | 28 Jul 2025 |
| DOIs | |
| Publication status | Published - Oct 2025 |
Bibliographical note
Open Access via the Wiley agreement
We would like to thank Dr. James Berry, Massachusetts General Hospital, and the NEALS consortium for providing the Celecoxib trial database and all contributors to the PRO-ACT database. We would also like to thank all the patients who contributed to the PRO-ACT, Celecoxib, and SG ALS databases.
Data Availability Statement
The PRO-ACT data is publicly available at https://ncri1.partners.org/proact. The Celecoxib trial dataset is provided through DTUA2021A009305 with Massachusetts General Hospital/Harvard Medical School and the NEALS consortium. The SG ALS dataset and analysis code are available from the corresponding authors upon reasonable request.
Keywords
- amyotrophic lateral sclerosis
- diverse external validation
- machine learning
- mortality prediction
- survival analysis