Objectives: Clinical prediction models that support treatment decisions are usually evaluated for their ability to predict the risk of an outcome rather than treatment benefit, that is, the difference between outcome risk with vs. without therapy. We aimed to define performance metrics for a model's ability to predict treatment benefit.
Study Design and Setting: We analyzed data from the Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) trial and from three recombinant tissue plasminogen activator trials. We assessed alternative prediction models with a conventional risk concordance-statistic (c-statistic) and a novel c-statistic for benefit. We defined observed treatment benefit by the outcomes in pairs of patients matched on predicted benefit but discordant for treatment assignment. The ‘c-for-benefit’ represents the probability that, of two randomly chosen matched patient pairs with unequal observed benefit, the pair with greater observed benefit also has the higher predicted benefit.
Results: Compared to a model without treatment interactions, the SYNTAX score II had improved ability to discriminate treatment benefit (c-for-benefit 0.590 vs. 0.552), despite having similar risk discrimination (c-statistic 0.725 vs. 0.719). However, for the simplified stroke–thrombolytic predictive instrument (TPI) vs. the original stroke-TPI, the c-for-benefit (0.584 vs. 0.578) was similar.
Conclusion: The proposed methodology has the potential to measure a model's ability to predict treatment benefit, an ability that conventional performance metrics do not capture.
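
The c-for-benefit definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a binary adverse outcome (1 = event), 1:1 matching of treated and control patients by rank of predicted benefit, a pair's predicted benefit taken as the mean of its two members' predictions, and half credit for tied predictions. The function name `c_for_benefit` and its arguments are hypothetical.

```python
from itertools import combinations


def c_for_benefit(pred_benefit, treated, outcome):
    """Illustrative c-for-benefit (hypothetical helper, not the authors' code).

    pred_benefit: predicted absolute benefit of treatment per patient
    treated:      1 if the patient was assigned treatment, 0 if control
    outcome:      1 if the adverse outcome occurred, 0 otherwise
    """
    rows = list(zip(pred_benefit, treated, outcome))
    # Split by treatment arm and sort each arm by predicted benefit.
    arm_t = sorted((p, y) for p, t, y in rows if t == 1)
    arm_c = sorted((p, y) for p, t, y in rows if t == 0)

    # Assumption: match patients 1:1 by rank of predicted benefit.
    # zip() stops at the shorter arm, so surplus patients are dropped.
    pairs = []
    for (p_t, y_t), (p_c, y_c) in zip(arm_t, arm_c):
        observed = y_c - y_t          # 1 = benefit, 0 = none, -1 = harm
        predicted = (p_t + p_c) / 2   # assumption: mean of the two predictions
        pairs.append((predicted, observed))

    concordant = 0.0
    comparable = 0
    for (p1, o1), (p2, o2) in combinations(pairs, 2):
        if o1 == o2:
            continue                  # only pairs with unequal observed benefit count
        comparable += 1
        if p1 == p2:
            concordant += 0.5         # assumption: tied predictions get half credit
        elif (p1 > p2) == (o1 > o2):
            concordant += 1.0

    if comparable == 0:
        raise ValueError("no matched pairs with unequal observed benefit")
    return concordant / comparable
```

With real trial data, the truncation to the smaller arm and the tie handling would need more care than this sketch gives them, and a confidence interval (for example, by bootstrap) would normally accompany the point estimate.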

Additional Metadata
Keywords: Acute ischemic stroke, Concordance, Coronary artery disease, Discrimination, Individualized treatment decisions, Prediction models, Treatment benefit
Persistent URL: dx.doi.org/10.1016/j.jclinepi.2017.10.021, hdl.handle.net/1765/104573
Journal: Journal of Clinical Epidemiology
Citation
van Klaveren, D., Steyerberg, E.W., Serruys, P.W.J.C., & Kent, D.M. (2018). The proposed ‘concordance-statistic for benefit’ provided a useful metric when modeling heterogeneous treatment effects. Journal of Clinical Epidemiology, 94, 59–68. doi:10.1016/j.jclinepi.2017.10.021