We consider how to correct for misclassification in possibly corrupted finite count data from epidemiological studies. In general, the misclassification probabilities are estimated from a validation study and then used to correct for the distortion. However, the validation study is most often quite small, so that the misclassification probabilities either cannot be calculated or are estimated with high variability when the estimation is based on the multinomial distribution. To increase efficiency, we propose the double binomial (DB) approach, which is based on the fact that, to determine a count, the examiner needs to evaluate all items that make up that count. We suggest various extensions of the DB approach that might better mimic the scoring behaviour of the examiner relative to a gold standard. We evaluate the performance of our approaches in estimating the misclassification probabilities, in comparison with the multinomial approach, both analytically and in a simulation study. Finally, the practical use of our methods is illustrated with an oral health survey examining caries experience in 7-year-old Flemish children involving 16 dental examiners.
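
The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one common reading of the item-level idea: each of `n_items` items is scored independently, with a single sensitivity `sens` and a single specificity `spec`, so that the observed count is the sum of two binomial variables (correctly detected true positives plus false positives). The function names, the parameters and the independence assumption are all hypothetical and are not taken from the abstract.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability P(X = k) for X ~ Bin(n, p); zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def multinomial_table(pairs, n_items):
    """Row-normalised cross-tabulation of (true count, observed count) pairs.

    A row with no validation data cannot be estimated and is returned as None,
    which is the small-sample problem described in the abstract.
    """
    size = n_items + 1
    counts = [[0] * size for _ in range(size)]
    for true, obs in pairs:
        counts[true][obs] += 1
    table = []
    for row in counts:
        total = sum(row)
        table.append([c / total for c in row] if total else None)
    return table


def double_binomial_table(n_items, sens, spec):
    """Misclassification table implied by item-level sensitivity/specificity.

    Assumption (hypothetical): given a true count t, the observed count is
    Bin(t, sens) + Bin(n_items - t, 1 - spec), with independent items.
    """
    size = n_items + 1
    table = []
    for true in range(size):
        row = []
        for obs in range(size):
            # Sum over the number of true positives that were detected.
            prob = sum(
                binom_pmf(tp, true, sens)
                * binom_pmf(obs - tp, n_items - true, 1 - spec)
                for tp in range(min(true, obs) + 1)
            )
            row.append(prob)
        table.append(row)
    return table


if __name__ == "__main__":
    # Tiny made-up validation sample: (true count, observed count).
    validation = [(0, 0), (0, 1), (2, 2)]
    print(multinomial_table(validation, n_items=2))
    print(double_binomial_table(n_items=2, sens=0.9, spec=0.95))
```

Under these assumptions the whole table depends on only two parameters, whereas the multinomial table needs every row to be populated by the validation data. This is one way to see why an item-level model could be more efficient in a small validation study.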

Count data, Logistic regression, Misclassification, Prevalence, Response error
dx.doi.org/10.1177/1471082X0800900201, hdl.handle.net/1765/25313
Statistical Modelling
Erasmus MC: University Medical Center Rotterdam

Lesaffre, E.M.E.H., Küchenhoff, H., Mwalili, S.M., & Declerck, D. (2009). On the estimation of the misclassification table for finite count data with an application in caries research. Statistical Modelling, 9(2), 99–118. doi:10.1177/1471082X0800900201