Bayesian clustering with priors on partitions
In: Statistica Neerlandica: journal of the Netherlands Society for Statistics and Operations Research, Volume 65, Issue 4, pp. 371-386
ISSN: 1467-9574
Traditional clustering algorithms are deterministic in the sense that a given dataset always leads to the same output partition. This article modifies traditional clustering algorithms so that the data are associated with a probability model and clustering is carried out on the stochastic model parameters rather than on the data themselves. This is done in a principled way using a Bayesian approach, which allows posterior probabilities to be assigned to output partitions. In addition, the approach incorporates prior knowledge of the output partitions using Bayesian melding. The methodology is applied to two substantive problems: (i) a question of stylometry involving a simulated dataset and (ii) the assessment of potential champions of the 2010 FIFA World Cup.
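The general idea of clustering model parameters instead of data can be illustrated with a toy sketch. It is not the article's procedure: the normal model with known standard deviation, the flat prior on each mean, the gap-based clustering rule, and the names `cluster_1d` and `partition_posterior` are all assumptions made for illustration, and the Bayesian melding prior on partitions is left out. Each posterior draw of the parameters is clustered deterministically, and the relative frequency of each resulting partition serves as a Monte Carlo estimate of its posterior probability.

```python
import random
from collections import Counter


def cluster_1d(values, k):
    """Deterministic rule (assumed for illustration): sort the values and
    cut at the k-1 largest gaps. Returns the partition as a frozenset of
    frozensets of item indices, so equal partitions compare equal."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    # Positions in the sorted order where a new cluster starts.
    cuts = set(sorted(
        range(1, len(order)),
        key=lambda j: values[order[j]] - values[order[j - 1]],
        reverse=True,
    )[:k - 1])
    blocks, current = {}, 0
    for pos, idx in enumerate(order):
        if pos in cuts:
            current += 1
        blocks.setdefault(current, set()).add(idx)
    return frozenset(frozenset(b) for b in blocks.values())


def partition_posterior(sample_means, sample_sizes, sigma, k,
                        draws=5000, seed=0):
    """Estimate posterior probabilities of partitions by Monte Carlo.

    Assumed model: observations in group i are Normal(theta_i, sigma^2)
    with known sigma and a flat prior on theta_i, so the posterior of
    theta_i is Normal(sample mean, sigma^2 / n_i)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(draws):
        # One draw of all parameters from their (independent) posteriors.
        theta = [rng.gauss(m, sigma / n ** 0.5)
                 for m, n in zip(sample_means, sample_sizes)]
        # Cluster the parameters, not the raw data.
        counts[cluster_1d(theta, k)] += 1
    return {part: c / draws for part, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical summary statistics for four groups.
    post = partition_posterior([0.0, 0.2, 3.0, 3.1], [10, 10, 10, 10],
                               sigma=1.0, k=2)
    for part, prob in sorted(post.items(), key=lambda kv: -kv[1]):
        print(sorted(sorted(b) for b in part), round(prob, 3))
```

Because the clustering step is applied to random parameter draws, the output is a distribution over partitions and not a single partition. A prior on partitions could in principle be used to reweight these frequencies, but how the article does this through Bayesian melding is not specified in the abstract and is not attempted here.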