Model selection for mixture-based clustering for ordinal data
One of the key questions in the use of mixture models concerns the choice of the number of components most suitable for a given data set. In this paper we investigate answers to this problem in the context of likelihood-based clustering of the rows of a matrix of ordinal data modelled by the ordered stereotype model. Two methodologies for selecting the best model are demonstrated and compared. The first approach fits a separate model to the data for each possible number of clusters, and then uses an information criterion to select the best model. The second approach uses a Bayesian construction in which the parameters and the number of clusters are estimated simultaneously from their joint posterior distribution. Simulation studies are presented which include a variety of scenarios in order to test the reliability of both approaches. Finally, the results of the application of model selection to two real data sets are shown. ; The authors are sincerely grateful to Prof. Shirley Pledger for many valuable suggestions and stimulating discussions about this work. We are grateful to two referees and an associate editor for helpful comments and references. Daniel Fernández acknowledges Victoria University of Wellington for providing financial support through the Victoria Doctoral Scholarship program. This research was supported by the Marsden Fund Council from Government funding, administered by the Royal Society of New Zealand, Grant E2317-2922 ; Peer Reviewed ; Postprint (author's final draft)
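
The first approach described in the abstract (fit one model per candidate number of clusters, then compare them with an information criterion) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the maximised log-likelihood and the number of free parameters of each fitted model are already available, it uses the standard AIC and BIC formulas as example criteria, and the function names, the choice of sample size `n_obs`, and all numeric values are hypothetical.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: -2 log L + 2 K."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 log L + K log(n)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_num_clusters(fits, n_obs, criterion="bic"):
    """Return the number of clusters with the lowest criterion value.

    fits maps each candidate number of clusters to a tuple
    (maximised log-likelihood, number of free parameters).
    """
    scores = {}
    for n_clusters, (log_lik, n_params) in fits.items():
        if criterion == "aic":
            scores[n_clusters] = aic(log_lik, n_params)
        elif criterion == "bic":
            scores[n_clusters] = bic(log_lik, n_params, n_obs)
        else:
            raise ValueError(f"unknown criterion: {criterion}")
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods and parameter counts, for illustration only.
fits = {1: (-1250.0, 8), 2: (-1180.0, 12), 3: (-1172.0, 16), 4: (-1170.5, 20)}
best_bic, bic_scores = select_num_clusters(fits, n_obs=500, criterion="bic")
best_aic, aic_scores = select_num_clusters(fits, n_obs=500, criterion="aic")
print("BIC choice:", best_bic, "AIC choice:", best_aic)
```

With these made-up inputs the two criteria pick different numbers of clusters, because BIC penalises additional parameters more heavily than AIC once the sample size is moderately large. That sensitivity to the criterion is the kind of behaviour a comparison of selection methods has to examine.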