Assessing calibration of multinomial risk prediction models
Statistics in Medicine, Volume 33, Issue 15, p. 2585-2596
Calibration, that is, whether observed outcomes agree with predicted risks, is important when evaluating risk prediction models. For dichotomous outcomes, several tools exist to assess different aspects of model calibration, such as calibration-in-the-large, logistic recalibration, and (non-)parametric calibration plots. We aim to extend these tools to prediction models for polytomous outcomes. We focus on models developed using multinomial logistic regression (MLR): outcome Y with k categories is predicted using k-1 equations comparing each category i (i=2,...,k) with reference category 1 using a set of predictors, resulting in k-1 linear predictors. We propose a multinomial logistic recalibration framework that involves an MLR fit where Y is predicted using the k-1 linear predictors from the prediction model. A non-parametric alternative may use vector splines for the effects of the linear predictors. The parametric and non-parametric frameworks can be used to generate multinomial calibration plots. Further, the parametric framework can be used for the estimation and statistical testing of calibration intercepts and slopes. Two illustrative case studies are presented, one on the diagnosis of malignancy of ovarian tumors and one on residual mass diagnosis in testicular cancer patients treated with cisplatin-based chemotherapy. The risk prediction models were developed on data from 2037 and 544 patients and externally validated on 1107 and 550 patients, respectively. We conclude that calibration tools can be extended to polytomous outcomes. The polytomous calibration plots are particularly informative through the visual summary of the calibration performance.
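As a rough illustration of the parametric framework described in the abstract, the sketch below shows how k-1 linear predictors map to k predicted risks, and how a recalibration model with intercepts and slopes would modify them. It is a minimal sketch, not the authors' implementation: the function names, the example linear predictors, and the coefficient values are all hypothetical. In practice the intercepts and slopes would be estimated by fitting an MLR on validation data, which is not shown here.

```python
import math


def risks_from_linear_predictors(lps):
    """Convert k-1 linear predictors (categories 2..k versus reference
    category 1) into k predicted risks that sum to 1."""
    # The reference category has linear predictor 0 by construction.
    full = [0.0] + list(lps)
    m = max(full)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in full]
    total = sum(exps)
    return [e / total for e in exps]


def recalibrate(lps, intercepts, slopes):
    """Apply a (hypothetical) multinomial logistic recalibration model.

    The recalibrated linear predictor for non-reference category i is
    intercepts[i] + sum_j slopes[i][j] * lps[j].
    """
    new_lps = [
        a + sum(b * lp for b, lp in zip(row, lps))
        for a, row in zip(intercepts, slopes)
    ]
    return risks_from_linear_predictors(new_lps)


# Hypothetical linear predictors for one patient, k = 3 outcome categories.
lps = [0.4, -1.2]
original = risks_from_linear_predictors(lps)

# Zero intercepts and an identity slope matrix leave the risks unchanged.
identity = recalibrate(lps, [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])

# Slopes below 1 on the diagonal pull the risks toward each other, the
# pattern expected when the original predictions are too extreme.
shrunk = recalibrate(lps, [0.0, 0.0], [[0.5, 0.0], [0.0, 0.5]])

print([round(p, 3) for p in original])
print([round(p, 3) for p in shrunk])
```

In this parameterization, zero intercepts with an identity slope matrix reproduce the original predictions exactly, so estimated coefficients that depart from those values point to miscalibration in the validation data.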
Organisation: Erasmus MC: University Medical Center Rotterdam
van Hoorde, K, Vergouwe, Y, Timmerman, D, Van Huffel, S, Steyerberg, E.W, & Van Calster, B. (2014). Assessing calibration of multinomial risk prediction models. Statistics in Medicine, 33(15), 2585–2596. doi:10.1002/sim.6114