BACKGROUND: Clinical prediction models are useful in estimating a patient's risk of having a certain disease or experiencing an event in the future based on their current characteristics. Defining an appropriate risk threshold to recommend intervention is a key challenge in bringing a risk prediction model to clinical application; such risk thresholds are often defined in an ad hoc way. This is problematic because tacitly assumed costs of false positive and false negative classifications may not be clinically sensible. For example, when choosing the risk threshold that maximizes the proportion of patients correctly classified, false positives and false negatives are assumed equally costly. Furthermore, small to moderate sample sizes may lead to unstable optimal thresholds, which requires a particularly cautious interpretation of results.
MAIN TEXT: We discuss how three common myths about risk thresholds often lead to inappropriate risk stratification of patients. First, we point out the contexts of counseling and shared decision-making in which a continuous risk estimate is more useful than risk stratification. Second, we argue that threshold selection should reflect the consequences of the decisions made following risk stratification. Third, we emphasize that there is usually no universally optimal threshold but rather that a plausible risk threshold depends on the clinical context. Consequently, we recommend to present results for multiple risk thresholds when developing or validating a prediction model.
CONCLUSION: Bearing in mind these three considerations can avoid inappropriate allocation (and non-allocation) of interventions. Using discriminating and well-calibrated models will generate better clinical outcomes if context-dependent thresholds are used.
This work was developed as part of the international initiative of strengthening analytical thinking for observational studies (STRATOS). The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies (http://stratos-initiative.org/). Members of the STRATOS Topic Group ‘Evaluating diagnostic tests and prediction models’ are Gary Collins, Carl Moons, Ewout Steyerberg, Patrick Bossuyt, Petra Macaskill, David McLernon, Ben van Calster, and Andrew Vickers.
The study is supported by the Research Foundation-Flanders (FWO) project G0B4716N and Internal Funds KU Leuven (project C24/15/037). Laure Wynants is a post-doctoral fellow of the Research Foundation – Flanders (FWO). The funding bodies had no role in the design of the study, collection, analysis, interpretation of data, nor in writing the manuscript.
LW and BVC conceived the original idea of the manuscript, to which ES, MVS and DML then contributed. DT acquired the data. LW analyzed the data, interpreted the results and wrote the first draft. All authors revised the work, approved the submitted version, and are accountable for the integrity and accuracy of the work.
- Clinical risk prediction model
- Threshold, Decision support techniques
- Risk, Data science
- Diagnosis, Prognosis