The information explosion in science has become a different problem: the challenge is no longer the sheer amount of information per se, but the multiplicity and heterogeneity of massive sets of data sources. Relations mined from these heterogeneous sources, namely texts, database records, and ontologies, have been mapped to Resource Description Framework (RDF) triples in an integrated database. The subject and object resources are expressed as references to concepts in a biomedical ontology consisting of the Unified Medical Language System (UMLS), UniProt, and EntrezGene; the predicate resource is expressed as a reference to a predicate thesaurus. All RDF triples, including their provenance, have been stored in a graph database. For evaluation we used a formal PRISMA literature study identifying 61 cerebrospinal fluid biomarkers and 200 blood biomarkers for migraine. These biomarker sets could be retrieved with weighted mean average precision values of 0.32 and 0.59, respectively, and can be used as a first reference for further refinements.
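
To make the evaluation metric concrete, the following is a minimal sketch of how average precision could be computed for one ranked retrieval result against a reference set, together with a generic weighted mean. The abstract does not specify how the weighting was defined, so the function names (`average_precision`, `weighted_mean`), the example items, and the weights are illustrative assumptions and not the authors' implementation.

```python
from typing import Hashable, Iterable, Sequence


def average_precision(ranked: Sequence[Hashable], relevant: Iterable[Hashable]) -> float:
    """Average precision of one ranked list against a set of relevant items.

    Precision is sampled at every rank where a relevant item first appears,
    and the sum is divided by the total number of relevant items, so
    relevant items that are never retrieved lower the score.
    """
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    seen = set()
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant_set and item not in seen:
            seen.add(item)
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_set)


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of per-query scores (the weighting scheme is an assumption)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical example: three reference biomarkers, two of which are retrieved.
ranked_candidates = ["a", "x", "b", "y"]
reference_set = {"a", "b", "c"}
ap = average_precision(ranked_candidates, reference_set)
```

In this hypothetical example, relevant items are found at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 3. Dividing by the size of the full reference set, instead of by the number of hits, penalizes reference biomarkers that the ranking never returns.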

Additional Metadata
Keywords Graph databases, Knowledge based discovery
Persistent URL hdl.handle.net/1765/98905
Conference 9th International Conference Semantic Web Applications and Tools for Life Sciences, SWAT4LS 2016
Citation
Van Mulligen, E.M., Vlietstra, W.J., Vos, R., & Kors, J.A. (2016). Discovering information from an integrated graph database. In CEUR Workshop Proceedings. Retrieved from http://hdl.handle.net/1765/98905