IT Infrastructure to Support the Secondary Use of Routinely Acquired Clinical Imaging Data for Research
We propose an infrastructure for the automated anonymization, extraction and processing of image data stored in clinical data repositories, making routinely acquired imaging data available for research purposes. The automated system, which was tested on routinely acquired MR brain imaging data, consists of four modules: subject selection via PACS query, anonymization of privacy-sensitive information and removal of facial features, quality assurance on DICOM header and image information, and quantitative imaging biomarker extraction. In total, 1,616 examinations were selected based on the following MRI scanning protocols: dementia protocol (246), multiple sclerosis protocol (446) and open question protocol (924). We evaluated the effectiveness of the infrastructure in accessing routinely acquired clinical imaging data and successfully extracting biomarkers from it. To examine the validity of the extracted biomarkers, we compared brain volumes between patient groups with a positive and a negative diagnosis, as recorded in the patient reports. Overall, success rates of image data retrieval and automatic processing were 82.5%, 82.3% and 66.2% for the three protocol groups respectively, indicating that a large percentage of routinely acquired clinical imaging data can be used for brain volumetry research, despite image heterogeneity. In line with the literature, brain volumes were significantly smaller (p < 0.001) in patients with a positive diagnosis of dementia (915 ml) than in patients with a negative diagnosis (939 ml). This study demonstrates that quantitative imaging biomarkers such as intracranial and brain volume can be extracted from routinely acquired clinical imaging data. This enables secondary use of clinical images for research into quantitative biomarkers at an unprecedented scale.
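The four-module flow described above (selection, anonymization, quality assurance, biomarker extraction) and the per-protocol success rate can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Exam` record, the function names, the short list of sensitive header fields, and the use of a precomputed voxel count as a stand-in for brain segmentation are all assumptions made for illustration, and facial feature removal is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Illustrative subset of privacy-sensitive header fields (assumption).
SENSITIVE_FIELDS = {"PatientName", "PatientID", "PatientBirthDate"}


@dataclass
class Exam:
    """Hypothetical record for one examination returned by a PACS query."""
    exam_id: str
    protocol: str
    header: Dict[str, str] = field(default_factory=dict)
    brain_voxels: int = 0            # stand-in for a segmentation result
    voxel_volume_mm3: float = 1.0
    brain_volume_ml: Optional[float] = None


def select_subjects(exams: List[Exam], protocols: Set[str]) -> List[Exam]:
    """Module 1: keep only examinations acquired with a requested protocol."""
    return [e for e in exams if e.protocol in protocols]


def anonymize(exam: Exam) -> Exam:
    """Module 2: drop privacy-sensitive header fields."""
    exam.header = {k: v for k, v in exam.header.items()
                   if k not in SENSITIVE_FIELDS}
    return exam


def quality_check(exam: Exam) -> Exam:
    """Module 3: reject examinations that fail simple header or image checks."""
    if exam.header.get("Modality") != "MR":
        raise ValueError(f"{exam.exam_id}: not an MR examination")
    if exam.brain_voxels <= 0:
        raise ValueError(f"{exam.exam_id}: no usable image data")
    return exam


def extract_biomarkers(exam: Exam) -> Exam:
    """Module 4: convert the segmented voxel count to a volume in ml."""
    exam.brain_volume_ml = exam.brain_voxels * exam.voxel_volume_mm3 / 1000.0
    return exam


def run_pipeline(exams: List[Exam],
                 protocols: Set[str]) -> Tuple[List[Exam], Dict[str, float]]:
    """Run all modules and report the success rate (%) per protocol."""
    totals: Dict[str, int] = {}
    successes: Dict[str, int] = {}
    processed: List[Exam] = []
    for exam in select_subjects(exams, protocols):
        totals[exam.protocol] = totals.get(exam.protocol, 0) + 1
        try:
            processed.append(extract_biomarkers(quality_check(anonymize(exam))))
            successes[exam.protocol] = successes.get(exam.protocol, 0) + 1
        except ValueError:
            continue  # counted as a failed examination
    rates = {p: 100.0 * successes.get(p, 0) / n for p, n in totals.items()}
    return processed, rates
```

In this sketch an examination counts as a success only if it passes every module, which mirrors how a combined retrieval and processing success rate per protocol group could be tallied.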