On joint dimension reduction and clustering of categorical data
There exist several methods for clustering high-dimensional data. One popular approach is to use a two-step procedure. In the first step, a dimension reduction technique is used to reduce the dimensionality of the data. In the second step, cluster analysis is applied to the data in the reduced space. This method may be referred to as the tandem approach. An important drawback of this method is that the dimension reduction may distort or hide the cluster structure. As an alternative, various authors have proposed joint dimension reduction and clustering approaches. In this paper we review some of these existing joint dimension reduction and clustering methods for categorical data in a unified framework that facilitates comparison.