Background: Discrete choice experiments (DCEs) are increasingly used for health state valuations. However, the values derived from initial DCE studies vary widely. We hypothesize that this variation indicates the presence of unknown sources of bias that must be recognized and minimized. We therefore studied whether values derived from a DCE are sensitive to how well the DCE design spans the severity range.
Methods: We constructed an experiment involving three variants of DCE tasks for health state valuation: standard DCE, DCE-death, and DCE-duration. For each type of DCE, an experimental design was generated under two different conditions. This enabled a comparison of health state values derived from current best-practice Bayesian efficient DCE designs with values derived from ‘severity-stratified’ designs, which control for coverage of the severity range in health state selection. About 3000 respondents participated in the study and were randomly assigned to one of the six study arms.
Results: Imposing the severity-stratified restriction had a large effect on the health states sampled for the DCE-duration approach. The unstratified efficient design returned a skewed distribution of selected health states, and this introduced bias. The choice probability of bad health states was underestimated, and time trade-offs to avoid bad states were overestimated, resulting in values that were too low. Imposing the same restriction had a limited effect on the DCE-death and standard DCE approaches.
Conclusion: Variation in DCE-derived values can be partially explained by differences in how well selected health states spanned the severity range. Imposing a ‘severity stratification’ on DCE-duration designs is a validity requirement.