Email is one of the most powerful tools for communication. Many businesses use email as their main communication channel, so email content is likely to contain a substantial amount of data. A workflow management system can help a business grow faster, and the data gathered from email content could be a robust source for such a system. This research proposes an email extraction system that extracts data from any incoming email into suitable database fields. The database created by the program is designed to support the implementation of a workflow management system. The research is presented in three phases: (1) define suitable criteria for extracting data; (2) implement a program that extracts data and stores them in a database; and (3) implement a program that validates the data in the database. Four criteria are applied in the email extraction system. The first criterion selects contact information at the end of the email content; the second selects specified keywords, such as tel, email, and mobile; the third selects unique names that start with a capital letter, such as the names of people, places, and companies; and the fourth selects special text, such as Co. Ltd, .com, and www. The empirical results suggest that when all four criteria are applied together, the accuracy of the program and the percentage of blank fields are at an acceptable level compared with the results from other criteria. When the four criteria are applied to 7,340 emails in English, the accuracy is approximately 68.66%, while the percentage of blank fields in the database is approximately 68.05%. The database created by the experiment can be applied in a workflow management system.
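
The four criteria can be illustrated with a minimal Python sketch. This is not the authors' program: the function names, the six-line signature window, the regular expressions, the field names, and the sample message are all assumptions made for illustration, and a real system would need far more robust rules.

```python
import re

# Criterion 2: lines that start with one of the specified keywords (assumed pattern).
KEYWORD_RE = re.compile(r"\b(tel|email|mobile)\b\s*[:.]?\s*(\S.*)", re.IGNORECASE)
# Criterion 3: two or more consecutive capitalised words (assumed heuristic for names).
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")
# Criterion 4: special texts named in the abstract.
SPECIAL_MARKERS = ("Co. Ltd", ".com", "www.")


def signature_block(body, max_lines=6):
    """Criterion 1: keep only the last non-empty lines of the email body.

    The window size of six lines is a hypothetical choice.
    """
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    return lines[-max_lines:]


def extract_fields(body):
    """Map the closing lines of an email onto hypothetical database fields."""
    fields = {"tel": "", "email": "", "mobile": "", "names": [], "special": []}
    for line in signature_block(body):
        keyword = KEYWORD_RE.match(line)
        if keyword:
            # Criterion 2: store the text that follows the keyword.
            fields[keyword.group(1).lower()] = keyword.group(2).strip()
        elif any(marker in line for marker in SPECIAL_MARKERS):
            # Criterion 4: keep the whole line that carries a special text.
            fields["special"].append(line)
        else:
            # Criterion 3: collect capitalised multi-word names.
            fields["names"].extend(NAME_RE.findall(line))
    return fields


# Hypothetical sample message, not taken from the paper's data set.
SAMPLE = """Dear team,

Please send the quotation by Friday.

Best regards,
Somchai Jaidee
Siam Trading Co. Ltd
Tel: +66 2 555 0100
Email: somchai@example.com
www.example.com
"""

if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

In this sample the mobile field stays empty because the message contains no mobile number, which is the kind of blank field the reported percentage refers to.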

hdl.handle.net/1765/100162
Econometric Institute Research Papers
Erasmus School of Economics

Chaipornkaew, P., Prexawanprasut, T., & McAleer, M. (2017). You’ve Got Email: a Workflow Management Extraction System (No. EI2017-11). Econometric Institute Research Papers. Retrieved from http://hdl.handle.net/1765/100162