Domain taxonomy learning from text: The subsumption method versus hierarchical clustering

de Knijff, Jeroen; Frasincar, Flavius; Hogenboom, Frederik

doi:10.1016/j.datak.2012.10.002

J. de Knijff (Jeroen), F. Frasincar (Flavius) and F.P. Hogenboom (Frederik)

2013

Data and Knowledge Engineering, Volume 83, p. 54-69

This paper proposes a framework to automatically construct taxonomies from a corpus of text documents. This framework first extracts terms from documents using a part-of-speech parser. These terms are then filtered using domain pertinence, domain consensus, lexical cohesion, and structural relevance. The remaining terms represent concepts in the taxonomy. These concepts are arranged in a hierarchy with either the extended subsumption method that accounts for concept ancestors in determining the parent of a concept or a hierarchical clustering algorithm that uses various text-based window and document scopes for concept co-occurrences. Our evaluation in the field of management and economics indicates that a trade-off between taxonomy quality and depth must be made when choosing one of these methods. The subsumption method is preferable for shallow taxonomies, whereas the hierarchical clustering algorithm is recommended for deep taxonomies.
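To make the subsumption idea concrete, the following is a minimal sketch of the basic (non-extended) subsumption rule computed on document-level co-occurrence: a concept x is taken to subsume a concept y when P(x|y) >= t and P(y|x) < 1. It is not the paper's implementation. The function name `subsumption_pairs`, the input layout (one list of concept labels per document), and the threshold default of 0.8 are assumptions made for illustration. The sketch returns every subsumption pair it finds; choosing a single parent per concept, which the extended method does by taking concept ancestors into account, is not attempted here.

```python
from collections import defaultdict


def subsumption_pairs(doc_terms, threshold=0.8):
    """Return sorted (parent, child) pairs under the basic subsumption rule.

    doc_terms: iterable of term lists, one list per document (assumed layout).
    threshold: minimum P(parent | child); 0.8 is an assumed default.
    """
    df = defaultdict(int)  # number of documents containing each term
    co = defaultdict(int)  # number of documents containing both terms of a pair

    for terms in doc_terms:
        unique = set(terms)  # count each term at most once per document
        for x in unique:
            df[x] += 1
            for y in unique:
                if x != y:
                    co[(x, y)] += 1

    pairs = []
    for x in df:
        for y in df:
            if x == y:
                continue
            both = co.get((x, y), 0)
            p_x_given_y = both / df[y]  # share of y's documents that contain x
            p_y_given_x = both / df[x]  # share of x's documents that contain y
            # x subsumes y: x appears in (almost) all of y's documents,
            # while y does not appear in all of x's documents.
            if p_x_given_y >= threshold and p_y_given_x < 1.0:
                pairs.append((x, y))
    return sorted(pairs)
```

With four toy documents containing "economics", two of which also contain "market" and one of those also "price", the sketch reports "economics" as a subsumer of both "market" and "price", and "market" as a subsumer of "price". Two terms that always occur together produce no pair, because neither is more general than the other.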

Additional Metadata
Keywords	Association rules, Classification, Clustering, Ontologies, Text mining
Persistent URL	doi.org/10.1016/j.datak.2012.10.002, hdl.handle.net/1765/68448
Series	ERIM Top-Core Articles
Journal	Data and Knowledge Engineering
Organisation	Erasmus Research Institute of Management
Citation	de Knijff, J., Frasincar, F., & Hogenboom, F. (2013). Domain taxonomy learning from text: The subsumption method versus hierarchical clustering. Data and Knowledge Engineering, 83, 54–69. https://doi.org/10.1016/j.datak.2012.10.002
