We propose the Hierarchical Product Classification (HPC) framework for classifying products using a hierarchical product taxonomy. The framework uses a classification system composed of multiple classifier nodes, each residing on a different level of the taxonomy. Its main novelty is the definition of classifier recipes, which are used to construct high-quality classifier nodes that make effective use of the product descriptions. These recipes are tailored specifically to the e-commerce domain and enable flexible classifiers that adapt to the depth-specific characteristics of product taxonomies. Furthermore, to gain insight into which components are required for high-quality product classification, we evaluate several feature selection methods and classification techniques in the context of our framework. On 3000 product descriptions obtained from Amazon.com, HPC achieves an overall accuracy of 76.80% for product classification. Using 110 categories from CircuitCity.com and Amazon.com, we obtain a precision of 93.61% for mapping the categories to the taxonomy of shopping.com.
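
The general idea of per-level classifier nodes can be illustrated with a minimal sketch. This is not the HPC implementation: the abstract does not specify the recipes, so the sketch assumes a "recipe" is simply a depth-dependent feature budget (`top_k`), uses naive word-frequency feature selection, and classifies by word overlap. All names (`NodeClassifier`, `train`, `classify`, `RECIPE_BY_DEPTH`, `ROOT`) and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

ROOT = "ROOT"  # hypothetical name for the taxonomy root


def tokenize(text):
    return text.lower().split()


class NodeClassifier:
    """Toy word-overlap classifier for one taxonomy node (an assumption, not the paper's method)."""

    def __init__(self, top_k):
        self.top_k = top_k  # recipe parameter: number of words kept per child category
        self.vocab = {}     # child category -> set of selected words

    def fit(self, examples):
        # examples: list of (description, child_category) pairs
        counts = defaultdict(Counter)
        for text, child in examples:
            counts[child].update(tokenize(text))
        # Naive feature selection: keep the top_k most frequent words per child.
        self.vocab = {
            child: {word for word, _ in counter.most_common(self.top_k)}
            for child, counter in counts.items()
        }

    def predict(self, text):
        words = set(tokenize(text))
        # Choose the child with the largest word overlap; sorting makes ties deterministic.
        return max(sorted(self.vocab), key=lambda c: len(words & self.vocab[c]))


def train(data, recipe_by_depth):
    """Build one classifier per internal taxonomy node, configured by its depth."""
    examples = defaultdict(list)
    depth_of = {}
    for text, path in data:
        parent = ROOT
        for depth, child in enumerate(path):
            examples[parent].append((text, child))
            depth_of[parent] = depth
            parent = child
    nodes = {}
    for parent, node_examples in examples.items():
        clf = NodeClassifier(top_k=recipe_by_depth[depth_of[parent]])
        clf.fit(node_examples)
        nodes[parent] = clf
    return nodes


def classify(nodes, text):
    """Descend from the root, letting each node pick one child, until a leaf is reached."""
    path, current = [], ROOT
    while current in nodes:
        current = nodes[current].predict(text)
        path.append(current)
    return path


# Toy training data: (product description, path through the taxonomy).
DATA = [
    ("digital camera with zoom lens", ("Electronics", "Cameras")),
    ("compact camera 20 megapixel sensor", ("Electronics", "Cameras")),
    ("laptop with 16gb memory ssd", ("Electronics", "Computers")),
    ("gaming laptop fast processor", ("Electronics", "Computers")),
    ("paperback mystery novel", ("Books", "Fiction")),
    ("hardcover fantasy novel series", ("Books", "Fiction")),
    ("cookbook with vegetarian recipes", ("Books", "Cooking")),
    ("baking recipes cookbook", ("Books", "Cooking")),
]

# Hypothetical recipes: a broader vocabulary at the top level, a narrower one below.
RECIPE_BY_DEPTH = {0: 10, 1: 5}

if __name__ == "__main__":
    nodes = train(DATA, RECIPE_BY_DEPTH)
    print(classify(nodes, "camera with wide lens"))
    print(classify(nodes, "vegetarian cookbook"))
```

Configuring each node by its depth is one simple way to let classifiers differ between the coarse top levels and the fine-grained lower levels of a taxonomy; a real system would replace the overlap score with a trained classifier and a principled feature selection method.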

Additional Metadata
Keywords: E-commerce, Feature selection, Hierarchical clustering, Product descriptions
Persistent URL: hdl.handle.net/1765/105060
Journal: Journal of Web Engineering
Rights: No subscription
Vandic, D., Frasincar, F., & Kaymak, U. (2018). A framework for product description classification in e-commerce. Journal of Web Engineering, 17(1-2), 1–27. Retrieved from http://hdl.handle.net/1765/105060
