Several modeling methods have been proposed to estimate the potential predictive ability of polygenic risk variants that predispose to common diseases. However, it is unknown whether differences between these methods affect their conclusions on predictive ability. We reviewed the input parameters, assumptions and outputs of the five most common methods and compared their estimates of the area under the receiver operating characteristic (ROC) curve (AUC) using hypothetical data representing effect sizes and frequencies of genetic variants, population disease risk and number of variants. To assess the accuracy of the estimated AUCs, we aimed to reproduce the AUCs of published empirical studies. All methods assumed that the combined effect of genetic variants on disease risk followed a multiplicative risk model of independent genetic effects, but they assumed either per-allele, per-genotype or dominant/recessive effects for the genetic variants. Modeling strategy and input parameters differed. Methods used simulation analysis or analytical formulas, with effect sizes quantified by odds ratios (ORs) or relative risks. Estimated AUC values were similar for lower ORs (<1.2). When AUCs were larger (>0.7) due to variants with strong effects, differences in estimated AUCs between methods increased. The simulation methods accurately reproduced the AUC values of empirical studies, but the analytical methods did not. We conclude that, despite differences in input parameters, the modeling methods estimate similar AUCs for realistic values of the ORs. When one or more variants have stronger effects and AUC values are higher, the simulation methods tend to be more accurate.
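To make the simulation approach concrete, the following is a minimal sketch of how an AUC could be estimated from ORs, risk-allele frequencies and population disease risk under a per-allele multiplicative model of independent variants. It is not a reimplementation of any of the five methods reviewed. The function names (`simulate_auc`, `auc_from_scores`), the Hardy-Weinberg sampling of genotypes, the logistic calibration of the baseline risk by bisection, and the default sample size are all assumptions made for illustration.

```python
import numpy as np


def auc_from_scores(scores, labels):
    """AUC via the Mann-Whitney rank statistic, using average ranks for ties."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Group identical scores and give each group its average 1-based rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    avg_rank = last_rank - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def simulate_auc(odds_ratios, allele_freqs, disease_risk, n=100_000, seed=0):
    """Simulate a population and return the AUC of the true genetic risk.

    Assumptions (illustrative, not taken from the article): genotypes follow
    Hardy-Weinberg proportions, variants are independent, and each risk
    allele multiplies the odds of disease by its OR.
    """
    rng = np.random.default_rng(seed)
    log_or = np.log(np.asarray(odds_ratios, dtype=float))
    freqs = np.asarray(allele_freqs, dtype=float)

    # Risk-allele counts (0, 1 or 2) for each person and each variant.
    genotypes = rng.binomial(2, freqs, size=(n, freqs.size))
    # Multiplicative effects on the odds scale are additive on the log-odds scale.
    score = genotypes @ log_or

    # Bisection on the intercept so the mean risk matches the population risk.
    lo, hi = -30.0, 30.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        mean_risk = np.mean(1.0 / (1.0 + np.exp(-(mid + score))))
        if mean_risk < disease_risk:
            lo = mid
        else:
            hi = mid
    risk = 1.0 / (1.0 + np.exp(-(mid + score)))

    # Draw disease status from each person's risk, then score discrimination.
    diseased = rng.random(n) < risk
    return auc_from_scores(risk, diseased)


if __name__ == "__main__":
    # Hypothetical input: ten variants with OR 1.1 and risk-allele frequency 0.3.
    print(simulate_auc([1.1] * 10, [0.3] * 10, disease_risk=0.1))
```

Because disease status is drawn at random, the result varies slightly with the seed and the simulated sample size. An analytical method would replace the sampling step with a closed-form approximation of the risk distributions in cases and non-cases.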

European Journal of Human Genetics
Erasmus MC: University Medical Center Rotterdam

Kundu, S., Karssen, L., & Janssens, C. (2012). Analytical and simulation methods for estimating the potential predictive ability of genetic profiling: A comparison of methods and results. European Journal of Human Genetics (Vol. 20, pp. 1270–1274). doi:10.1038/ejhg.2012.89