Quantifying outcome misclassification in multi-database studies: The case study of pertussis in the ADVANCE project
Background: The Accelerated Development of VAccine beNefit-risk Collaboration in Europe (ADVANCE) is a public-private collaboration aiming to develop and test a system for rapid benefit-risk (B/R) monitoring of vaccines using European healthcare databases. Event misclassification can result in biased estimates. Using different algorithms for identifying cases of Bordetella pertussis (BorPer) infection as a test case, we aimed to describe a strategy to quantify event misclassification when manual chart review is not feasible.
Methods: Four participating databases retrieved data from the primary care (PC) setting: BIFAP (Spain), THIN and RCGP RSC (UK) and PEDIANET (Italy). A fifth, SIDIAP (Spain), retrieved data from both PC and hospital settings. BorPer algorithms were defined by healthcare setting, data domain (diagnoses, drugs, or laboratory tests) and concept sets (specific or unspecified pertussis). Algorithm- and database-specific BorPer incidence rates (IRs) were estimated in children aged 0–14 years enrolled in 2012 and 2014 and followed up until the end of each calendar year, and were compared with IRs of confirmed pertussis from the ECDC surveillance system (TESSy). Novel formulas, based on a small set of assumptions, were used to approximate validity indices and were applied to estimate the positive predictive value (PPV) and sensitivity in SIDIAP.
Results: The number of cases and the estimated BorPer IRs per 100,000 person-years in PC, using data representing 3,173,268 person-years, were 0 (IR = 0.0), 21 (IR = 4.3), 21 (IR = 5.1), 79 (IR = 5.7), and 2 (IR = 2.3) in BIFAP, SIDIAP, THIN, RCGP RSC and PEDIANET, respectively. The IRs for combined specific/unspecified pertussis were higher than those from TESSy, suggesting that some false positives had been included. In SIDIAP the estimated IR was 45.0 when discharge diagnoses were included. The sensitivity and PPV of combined PC specific and unspecified diagnoses for BorPer cases in SIDIAP were approximately 85% and 72%, respectively.
Conclusion: Retrieving BorPer cases using only specific concepts has low sensitivity in PC databases, while including cases retrieved by unspecified concepts introduces false positives, estimated at approximately 28% in one database. The share of cases that cannot be retrieved from a PC database because they are only seen in hospital was estimated at approximately 15% in one database. This study demonstrated that quantifying the impact of different event-finding algorithms across databases and benchmarking against disease surveillance data can provide approximate estimates of algorithm validity.
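The quantities reported in the abstract (incidence rate per 100,000 person-years, PPV, and sensitivity) can be illustrated with a minimal Python sketch. It uses only the textbook definitions, not the novel approximation formulas of the study, which the abstract does not state. The function names and the counts in the usage lines are hypothetical and are not taken from the study data.

```python
def incidence_rate_per_100k(cases: int, person_years: float) -> float:
    """Incidence rate expressed per 100,000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return cases / person_years * 100_000


def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: share of retrieved cases that are true cases."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity: share of true cases that the algorithm retrieves."""
    return true_pos / (true_pos + false_neg)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the definitions.
    tp, fp, fn = 80, 20, 20
    print(f"IR per 100,000 PY: {incidence_rate_per_100k(tp + fp, 2_000_000):.1f}")
    print(f"PPV: {ppv(tp, fp):.2f}")
    print(f"Sensitivity: {sensitivity(tp, fn):.2f}")
    # The false-positive share among retrieved cases is the complement of PPV,
    # and the share of true cases missed is the complement of sensitivity.
    print(f"False-positive share: {1 - ppv(tp, fp):.2f}")
    print(f"Missed share: {1 - sensitivity(tp, fn):.2f}")
```

Read this way, a PPV of about 72% corresponds to the roughly 28% of false positives cited in the conclusion, and a sensitivity of about 85% corresponds to the roughly 15% of cases seen only in hospital.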
Keywords: Event misclassification, Event-finding algorithms, Incidence of pertussis, Positive predictive value
Persistent URL: dx.doi.org/10.1016/j.vaccine.2019.07.045, hdl.handle.net/1765/118566
Gini, R, Dodd, C.N, Bollaerts, K. (Kaatje), Bartolini, C. (Claudia), Roberto, G. (Giuseppe), Huerta Alvarez, C, … Sturkenboom, M. (Miriam). (2019). Quantifying outcome misclassification in multi-database studies: The case study of pertussis in the ADVANCE project. Vaccine. doi:10.1016/j.vaccine.2019.07.045