Due to variation in test difficulty, the use of pre-fixed cut-off scores in criterion-referenced standard setting methods may lead to variation in grades and pass rates. This paper empirically investigates the strength of this relationship. To this end, we examine a dataset of over 500 observations from an institution of higher education in The Netherlands over the period 2008-2013. We measure variation in test difficulty using students' perceptions of the validity of the examination and by recording changes in the primary instructor. The latter measure is motivated by the considerable variation in teachers' ability to assess test difficulty reported in the literature. Other explanatory variables are course evaluations, instructor evaluations, and self-reported study time. Variation in student quality is controlled for by measuring course results in deviation from the cohort average. We take a panel approach to estimate the effect of the explanatory variables on the variability in grades and pass rates. Our findings indicate that exam validity and instructor change are significantly related to variation in test results. The latter finding supports the hypothesis that the difficulty instructors face in gauging test difficulty may introduce subjectivity into criterion-referenced standard setting methods.

Standard setting, Test difficulty, Test result variability
dx.doi.org/10.1016/j.stueduc.2015.05.002, hdl.handle.net/1765/84978
ERIM Top-Core Articles
Studies in Educational Evaluation
Erasmus School of Economics

Arnold, I. J. M. (2015). Changing the guard: Staff turnover as a source of variation in test results. Studies in Educational Evaluation, 47, 12–18. doi:10.1016/j.stueduc.2015.05.002