Locus-specific database domain and data content analysis: Evolution and content maturation toward clinical use
Genetic variation databases have become indispensable in many areas of health care. In addition, a growing number of experts are depositing published and unpublished disease-causing variants of particular genes into locus-specific databases (LSDBs). Some of these databases contain such extensive information that they have become known as knowledge bases. Here, we analyzed 1,188 LSDBs and their content for the presence or absence of 44 content criteria related to database features (general presentation, locus-specific information, database structure) and data content (data collection, summary table of variants, database querying). Our analyses revealed that several elements have helped to advance the field and reduce data heterogeneity, such as the development of specialized database management systems and the creation of data querying tools. We also identified a number of deficiencies, namely, the lack of detailed disease and phenotypic descriptions for each genetic variant and the absence of links to relevant patient organizations. Addressing these deficiencies would allow LSDBs to better serve the clinical genetics community. We propose a structure, based on LSDBs and closely related repositories (namely, clinical genetics databases), which would contribute to a federated genetic variation browser and also allow the maintenance of variation data.
- Genetic variation
- Diagnostic laboratories
- Domain analysis
- Genetic diseases
- Locus-specific databases (LSDBs)