Introduction: Drug safety researchers seek to know the degree of certainty with which a particular drug is associated with an adverse drug reaction. Pharmacovigilance draws on several sources of information to identify, evaluate, and disseminate medical product safety evidence, including spontaneous reports, published peer-reviewed literature, and product labels. Automated processing and classification of these evidence sources can greatly reduce the manual curation currently required to develop reference sets of positive and negative controls (i.e., drugs that cause adverse drug events and those that do not) for use in drug safety research.

Methods: In this paper we explore a method for automatically aggregating disparate sources of information into a single repository, developing a predictive model to classify drug-adverse event relationships, and applying those predictions to the real-world problem of identifying negative controls for statistical method calibration.

Results: The models combining all available evidence showed high predictive accuracy, with an area under the receiver operating characteristic curve of ⩾0.92 when tested on three manually generated lists of drugs and conditions that are known either to have or not to have an association with an adverse drug event.

Conclusions: Results from a pilot implementation of the method suggest that it is feasible to develop a scalable alternative to the time- and resource-intensive manual curation exercise previously applied to develop reference sets of positive and negative controls for drug safety research.
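For readers unfamiliar with the metric reported above, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen positive control receives a higher model score than a randomly chosen negative control. The sketch below illustrates that calculation only; it is not the authors' implementation, and the function name `auc`, the labels, and the scores are all hypothetical toy values chosen for illustration.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a positive control, 0 for a negative control.
    scores: model scores, higher meaning "more likely a true association".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): three positive and three
# negative drug-outcome pairs with made-up classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
toy_auc = auc(labels, scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise form is quadratic in the number of controls, which is acceptable for small reference sets; a rank-based library routine would be the usual choice at scale.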

Journal of Biomedical Informatics
Erasmus MC: University Medical Center Rotterdam

Voss, E., Boyce, C., Ryan, P., van der Lei, J., Rijnbeek, P., & Schuemie, M. (2017). Accuracy of an automated knowledge base for identifying drug adverse reactions. Journal of Biomedical Informatics, 66, 72–81. doi:10.1016/j.jbi.2016.12.005