Prior studies highlight the merits of integrating Linked Data to aid investors’ analyses of company financial disclosures. Non-financial disclosures, including reporting on a company’s environmental footprint (corporate sustainability), remain an unexplored area of research. One reason cited by investors is the need for earth science knowledge to interpret such disclosures. To address this challenge, we propose an automated system that employs Latent Dirichlet Allocation (LDA) to discover earth science topics in corporate sustainability text. The LDA model is seeded with a vocabulary of terms retrieved via a SPARQL endpoint; these terms enter the model as lexical priors. An ensemble tree combines the resulting topic probabilities and classifies the quality of sustainability disclosures using domain expert ratings published by Google Finance. From an applications standpoint, our results may be of interest to investors seeking to integrate corporate sustainability considerations into their investment decisions.
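
To make the idea of seeding terms as lexical priors concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a collapsed Gibbs sampler in which each topic's word prior is raised for that topic's seed terms. The function name `seeded_lda`, the parameters `alpha`, `beta` and `boost`, the toy documents and the hard-coded seed lists are all hypothetical. In the described system the seed terms would come from a SPARQL endpoint, and the resulting topic probabilities would be passed to an ensemble tree classifier, which is omitted here.

```python
import random

def seeded_lda(docs, seed_terms, n_iter=200, alpha=0.1, beta=0.01,
               boost=5.0, rng_seed=0):
    """Collapsed Gibbs sampling for LDA with seed-boosted topic-word priors.

    docs: list of token lists; seed_terms: one list of seed words per topic.
    Returns the per-document topic probabilities (one row per document).
    """
    rng = random.Random(rng_seed)
    vocab = sorted({w for d in docs for w in d} |
                   {w for s in seed_terms for w in s})
    idx = {w: i for i, w in enumerate(vocab)}
    K, V = len(seed_terms), len(vocab)

    # Asymmetric prior: eta[k][v] is larger when word v is a seed of topic k.
    eta = [[beta] * V for _ in range(K)]
    for k, terms in enumerate(seed_terms):
        for w in terms:
            eta[k][idx[w]] += boost
    eta_sum = [sum(row) for row in eta]

    # Count tables and random initial topic assignments.
    n_dk = [[0] * K for _ in docs]      # document-topic counts
    n_kw = [[0] * V for _ in range(K)]  # topic-word counts
    n_k = [0] * K                       # tokens per topic
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.randrange(K) for _ in doc])
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_kw[k][idx[w]] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = idx[w], z[d][i]
                # Remove the token, then resample its topic.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][v] + eta[t][v])
                           / (n_k[t] + eta_sum[t]) for t in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    return [[(n_dk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
            for d, doc in enumerate(docs)]

# Toy stand-ins for sustainability text and for endpoint-derived seed terms.
docs = [
    "carbon emissions climate carbon emissions greenhouse".split(),
    "water drought groundwater water aquifer".split(),
    "emissions carbon footprint climate".split(),
]
seed_terms = [["carbon", "emissions"], ["water", "groundwater"]]
theta = seeded_lda(docs, seed_terms)
```

Each row of `theta` is a probability distribution over the seeded topics for one document, which is the kind of feature vector a downstream classifier could consume.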

Automated ontology learning, LDA, Sustainability, Topic modeling
Erasmus School of History, Culture and Communication (ESHCC)

Moniz, A.J., & de Jong, F.M.G. (2015). Analysis of companies’ non-financial disclosures: Ontology learning by topic modeling. doi:10.1007/978-3-319-25639-9_19