Predicting missing quality of life data that were later recovered: an empirical comparison of approaches

Shona Fielding, Peter Fayers, Craig Ramsay

Research output: Contribution to journal › Article › peer-review



Background and Purpose: The aim was to compare simple imputation, multiple imputation, and modeling approaches for dealing with ‘missing’ quality of life data. Data were obtained from five clinical trials, which employed a reminder system for follow-up questionnaires. Previous studies have compared imputation strategies by artificially removing data according to prespecified mechanisms. Our approach differs from previous studies in that data that were actually collected are used.

Methods: Data obtained by reminder were initially treated as missing. These missing values were imputed using a variety of simple and multiple imputation strategies. The trials were analyzed using the imputed datasets, and the resulting treatment effects were compared to those from analyses of the full dataset, which included responses following reminders. A repeated measures model was also fitted to the available data, and pattern mixture models were employed. The accuracy of each strategy was assessed by calculating the bias in the estimated treatment difference relative to the treatment difference actually observed in the full dataset.
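The evaluation logic described in the Methods can be illustrated with a minimal sketch. It is not the authors' analysis: the six records are invented, the trial is reduced to a single follow-up score per participant, only baseline carried forward is shown, and bias is assumed to mean the imputed estimate minus the full-data estimate. All function and variable names are hypothetical.

```python
from statistics import mean

# Hypothetical records: (arm, baseline score, follow-up score, obtained_by_reminder).
# The values are invented for illustration and do not come from the trials.
records = [
    ("treatment", 50.0, 62.0, False),
    ("treatment", 45.0, 58.0, True),
    ("treatment", 55.0, 60.0, False),
    ("control", 52.0, 54.0, False),
    ("control", 48.0, 47.0, True),
    ("control", 50.0, 55.0, False),
]


def treatment_difference(rows):
    """Mean follow-up score in the treatment arm minus that in the control arm."""
    treat = [follow for arm, _, follow, _ in rows if arm == "treatment"]
    ctrl = [follow for arm, _, follow, _ in rows if arm == "control"]
    return mean(treat) - mean(ctrl)


def baseline_carried_forward(rows):
    """Treat reminder responses as missing and replace them with the baseline score."""
    return [
        (arm, base, base if reminder else follow, reminder)
        for arm, base, follow, reminder in rows
    ]


# Estimate from the full dataset, including responses obtained by reminder.
full_estimate = treatment_difference(records)

# Estimate after discarding reminder responses and imputing them.
bcf_estimate = treatment_difference(baseline_carried_forward(records))

# Assumed definition: bias = imputed estimate - full-data estimate.
bias = bcf_estimate - full_estimate

print(f"full: {full_estimate:.2f}, imputed: {bcf_estimate:.2f}, bias: {bias:.2f}")
```

Any other imputation strategy can be compared in the same way by swapping the imputation function while keeping the full-data estimate as the reference.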

Results: Baseline carried forward and last value carried forward were the best simple imputation methods in this setting. Multiple imputation using a regression model or a predictive mean matching model tended to provide the treatment difference estimates with the least bias when compared to the actual observed data. Pattern mixture models did not perform well. Overall, the multiple imputation procedures were generally the least biased approaches.
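For readers unfamiliar with regression-based multiple imputation, the following is a simplified sketch under stated assumptions rather than the procedure used in the study. It uses synthetic data, imputes follow-up scores from a per-arm linear regression on baseline plus random residual noise, and pools the results with Rubin's rules. It does not draw the regression parameters from their posterior distribution, as a fully proper multiple imputation would, and the missingness here is assigned to every third participant purely for convenience. All names are hypothetical.

```python
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, with residual SD."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, sigma


def diff_and_variance(rows):
    """Treatment difference in mean follow-up and its estimated variance."""
    treat = [y for arm, _, y in rows if arm == "treatment"]
    ctrl = [y for arm, _, y in rows if arm == "control"]
    estimate = mean(treat) - mean(ctrl)
    var = variance(treat) / len(treat) + variance(ctrl) / len(ctrl)
    return estimate, var


def impute_once(rows, rng):
    """Fill missing follow-ups from a per-arm regression on baseline plus noise."""
    completed = []
    for arm in ("treatment", "control"):
        arm_rows = [r for r in rows if r[0] == arm]
        observed = [(b, y) for _, b, y in arm_rows if y is not None]
        intercept, slope, sigma = fit_line(
            [b for b, _ in observed], [y for _, y in observed]
        )
        for _, b, y in arm_rows:
            if y is None:
                y = intercept + slope * b + rng.gauss(0.0, sigma)
            completed.append((arm, b, y))
    return completed


def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate and total variance over m imputations."""
    m = len(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    return mean(estimates), within + (1 + 1 / m) * between


# Synthetic data (invented): every third follow-up is set to missing.
rng = random.Random(0)
rows = []
for arm, effect in (("treatment", 8.0), ("control", 0.0)):
    for i in range(30):
        baseline = rng.gauss(50.0, 10.0)
        followup = 10.0 + 0.8 * baseline + effect + rng.gauss(0.0, 5.0)
        rows.append((arm, baseline, None if i % 3 == 0 else followup))

results = [diff_and_variance(impute_once(rows, rng)) for _ in range(20)]
pooled_estimate, pooled_variance = pool_rubin(
    [e for e, _ in results], [v for _, v in results]
)
print(f"pooled difference: {pooled_estimate:.2f}, variance: {pooled_variance:.2f}")
```

The pooled variance combines the average within-imputation variance with the between-imputation variance, so the uncertainty introduced by imputing is carried into the final estimate, something a single imputed dataset cannot provide.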

Limitations: A number of imputation and modeling procedures have been investigated but this list is not exhaustive. All the example datasets come from the same data source and perhaps studies from additional disease areas would have been useful. However, we feel the results are generalizable to other quality of life outcomes and clinical areas.

Conclusions: Multiple imputation is recommended for missing quality of life data because it assumes that data are missing at random, which in the quality of life setting is more plausible than the assumption of missing completely at random on which most simple imputation methods are based. Pattern mixture models can be complex and did not perform well in this setting. Clinical Trials 2010; 7: 333-342.
Original language: English
Pages (from-to): 333-342
Number of pages: 10
Journal: Clinical Trials
Issue number: 4
Early online date: 24 Jun 2010
Publication status: Published - Aug 2010

Bibliographical note

We would like to thank the Centre for Health Care Randomized Trials, based within the Health Services Research Unit, and their staff for providing the data used for this study. In particular, we thank Gladys McPherson, Alison McDonald, Graeme MacLennan, Jonathan Cook and Samantha Wileman, who assisted with data queries and provided background to the trials. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health Directorate. Shona Fielding was funded by the Chief Scientist Office on a Research Training Fellowship (CZF/1/31) while carrying out this study. The views expressed are, however, not necessarily those of the funding body.

