This thesis describes the development of text-mining algorithms for molecular biology, in particular for DNA microarray data analysis. Concept profiles, which characterize the context in which a gene is mentioned in the literature, were introduced to retrieve functional associations between genes. The method was shown to annotate DNA microarray data efficiently and to complement existing methods. Concept profiles were also used for other types of concepts and were successfully applied to the functional annotation of genes through automatic assignment of Gene Ontology terms. A generic framework based on concept profiles, dubbed Anni, was developed to provide researchers with an ontology-based interface to the literature, and we demonstrated its utility for literature-based knowledge discovery. The work covers the use and development of text-mining tools to identify relations between genes and to automatically annotate sets of genes resulting from microarray experiments. Comparing DNA microarray studies can reveal interesting parallels. However, such analyses are hampered by the large influence of design, technical and statistical factors on the differentially expressed genes that are found. Comparisons based on perturbed biological processes could be more robust. Concept profiles were used to reveal overlapping biological processes between microarray studies in a comparative meta-analysis of 102 muscle-related microarray studies. We demonstrated that many more biologically meaningful links could be retrieved between studies, even between studies without differentially expressed genes in common.

Additional Metadata
Keywords DNA microarray data analysis, molecular biology, text-mining algorithms
Promotor J. van der Lei (Johan)
Publisher Erasmus University Rotterdam
Sponsor Lei, Prof. Dr. J. van der (promotor), Wiki Professional Initiative, BAZIS Foundation, SUWO
ISBN 978-90-8559-335-5
Persistent URL
Jelier, R. (2008, January 10). Text Mining applied to Molecular Biology. Erasmus University Rotterdam. Retrieved from