To cope with the ever-increasing amount of data available on the Web, information extraction patterns are frequently employed to gather relevant information. Currently, most patterns use lexical and syntactic elements but fail to exploit domain semantics. We propose a lexico-semantic pattern-based rule language, the Hermes Information Extraction Language (HIEL), which exploits a domain ontology for pattern creation. Experiments on financial news show that HIEL rules outperform both lexico-syntactic rules and state-of-the-art lexico-semantic JAPE rules in terms of rule creation time and F1 score.

hdl.handle.net/1765/90155
24th Benelux Conference on Artificial Intelligence, BNAIC 2012
Erasmus University Rotterdam

Hogenboom, F., IJntema, W., & Frasincar, F. (2012). Text-based information extraction using lexico-semantic patterns. Presented at the 24th Benelux Conference on Artificial Intelligence, BNAIC 2012. Retrieved from http://hdl.handle.net/1765/90155