2014
A semantic approach for extracting domain taxonomies from text
Publication
Decision Support Systems, Volume 62, p. 78-93
In this paper we present a framework for automatically building a domain taxonomy from text corpora, called Automatic Taxonomy Construction from Text (ATCT). The framework comprises four steps. First, terms are extracted from a corpus of documents. Second, a filtering approach selects the extracted terms that are most relevant for a specific domain. Third, the selected terms are disambiguated by means of a word sense disambiguation technique and concepts are generated. In the final step, the broader-narrower relations between concepts are determined using a subsumption technique that makes use of concept co-occurrences in text. For evaluation, we assess the performance of the ATCT framework using semantic precision, semantic recall, and the taxonomic F-measure, which take the concept semantics into account. The proposed framework is evaluated in the field of economics and management as well as in the medical domain.
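The final step, deriving broader-narrower relations from concept co-occurrences, can be illustrated with a minimal sketch. It assumes the classic document-level subsumption test in the style of Sanderson and Croft: concept x is treated as broader than concept y when P(x|y) is at least a threshold and P(y|x) is below 1. The abstract does not state the exact variant or threshold that ATCT uses, so the function name `subsumption_pairs`, the threshold of 0.8, and the toy documents are all hypothetical.

```python
from itertools import permutations


def subsumption_pairs(docs, threshold=0.8):
    """Return sorted (broader, narrower) pairs from per-document concept sets.

    Assumed rule (not taken from the paper): x subsumes y when
    P(x | y) >= threshold and P(y | x) < 1.
    """
    # Count the number of documents in which each concept occurs.
    doc_freq = {}
    for concepts in docs:
        for concept in concepts:
            doc_freq[concept] = doc_freq.get(concept, 0) + 1

    pairs = []
    for x, y in permutations(doc_freq, 2):
        # Number of documents that mention both concepts.
        both = sum(1 for concepts in docs if x in concepts and y in concepts)
        p_x_given_y = both / doc_freq[y]
        p_y_given_x = both / doc_freq[x]
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            pairs.append((x, y))
    return sorted(pairs)


# Toy corpus: each document is reduced to the set of concepts it mentions.
docs = [
    {"finance", "bank"},
    {"finance", "bank", "loan"},
    {"finance", "stock"},
    {"finance"},
]

for broader, narrower in subsumption_pairs(docs):
    print(f"{broader} > {narrower}")
```

In this toy corpus "finance" appears wherever "bank" does but not the other way round, so it is placed above "bank". A full taxonomy builder would also need to prune transitive links, which this sketch leaves out.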
Additional Metadata | |
---|---|
doi.org/10.1016/j.dss.2014.03.006, hdl.handle.net/1765/72904 | |
ERIM Top-Core Articles | |
Decision Support Systems | |
Organisation | Erasmus Research Institute of Management |
Meijer, K., Frasincar, F., & Hogenboom, F. (2014). A semantic approach for extracting domain taxonomies from text. Decision Support Systems, 62, 78–93. doi:10.1016/j.dss.2014.03.006 |