Does ignoring clustering in multicenter data influence the performance of prediction models? A simulation study

Wynants, L.; Vergouwe, Yvonne; Van Huffel, Sabine; Timmerman, Dirk; Van Calster, Ben

doi:10.1177/0962280216668555

2018-06-01

Statistical Methods in Medical Research, Volume 27, Issue 6, p. 1723-1736

Clinical risk prediction models are increasingly being developed and validated on multicenter datasets. In this article, we present a comprehensive framework for evaluating the predictive performance of prediction models at the center level and the population level, considering population-averaged predictions, center-specific predictions, and predictions assuming an average random center effect. We demonstrate in a simulation study that calibration slopes deviate from one not only because of over- or underfitting of patterns in the development dataset, but also because of the choice of model (standard versus mixed effects logistic regression), the type of predictions (marginal versus conditional versus assuming an average random effect), and the level of model validation (center versus population). In particular, when data are heavily clustered (intraclass correlation coefficient, ICC, of 20%), center-specific predictions offer the best predictive performance at both the population level and the center level. We recommend that models reflect the data structure, while the level of model validation should reflect the research question.
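
The three prediction types and the population-level calibration slope named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code. It assumes a random-intercept logistic model with known (not estimated) coefficients, so it isolates the effect of prediction type rather than overfitting. The function names, the sample sizes, the coefficients, and the latent-scale relation ICC = sigma_u^2 / (sigma_u^2 + pi^2/3) used to turn an ICC of 20% into a random-intercept standard deviation are all assumptions made for illustration.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_conditional(lp_fixed, u_center):
    """Center-specific prediction: adds the center's own random intercept."""
    return expit(lp_fixed + u_center)

def predict_average_center(lp_fixed):
    """Prediction assuming an average random center effect (u = 0)."""
    return expit(lp_fixed)

def predict_marginal(lp_fixed, sigma_u, n_nodes=30):
    """Population-averaged prediction: average expit(lp + u) over
    u ~ N(0, sigma_u^2) with Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    u = np.sqrt(2.0) * sigma_u * nodes
    lp = np.atleast_1d(lp_fixed)[:, None]
    return (expit(lp + u[None, :]) * weights[None, :]).sum(axis=1) / np.sqrt(np.pi)

def calibration_slope(y, p, n_iter=25):
    """Slope of a logistic regression of y on logit(p), via Newton-Raphson."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    x = np.log(p / (1 - p))
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = expit(X @ beta)
        grad = X.T @ (y - mu)
        hess = X.T @ (X * (mu * (1 - mu))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta[1]

def simulate_slopes(n_centers=100, n_per_center=500, icc=0.20, seed=0):
    """Hypothetical setup: one standard-normal predictor, true coefficients
    beta0 = -1 and beta1 = 1, random intercepts drawn per center."""
    rng = np.random.default_rng(seed)
    # Assumed latent-scale ICC definition for a random-intercept logit model.
    sigma_u = np.sqrt(icc * (np.pi ** 2 / 3) / (1 - icc))
    center = np.repeat(np.arange(n_centers), n_per_center)
    u = rng.normal(0.0, sigma_u, n_centers)
    x = rng.normal(size=center.size)
    lp_fixed = -1.0 + 1.0 * x
    y = rng.binomial(1, expit(lp_fixed + u[center]))
    # Population-level validation: all centers pooled.
    return {
        "conditional": calibration_slope(y, predict_conditional(lp_fixed, u[center])),
        "average_center": calibration_slope(y, predict_average_center(lp_fixed)),
        "marginal": calibration_slope(y, predict_marginal(lp_fixed, sigma_u)),
    }

slopes = simulate_slopes()
for kind, slope in slopes.items():
    print(f"{kind:>15s}: calibration slope = {slope:.2f}")
```

Under these assumptions, predictions that set the random effect to zero tend to be too extreme when validated on the pooled population, so their calibration slope is expected to fall below one even without any overfitting, while the center-specific predictions should stay close to one. The exact values depend on the seed and the assumed parameters.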

Additional Metadata
Keywords	bias, calibration, clinical prediction model, discrimination, logistic regression, Mixed model, predictive performance
Persistent URL	doi.org/10.1177/0962280216668555, hdl.handle.net/1765/106551
Journal	Statistical Methods in Medical Research
Organisation	Department of Public Health
Citation	Wynants, L., Vergouwe, Y., Van Huffel, S., Timmerman, D., & Van Calster, B. (2018). Does ignoring clustering in multicenter data influence the performance of prediction models? A simulation study. Statistical Methods in Medical Research, 27(6), 1723–1736. doi:10.1177/0962280216668555

