A critical note on the evaluation of clustering algorithms

T Zhang, L Zhong, B Yuan - arXiv preprint arXiv:1908.03782, 2019 - arxiv.org
Experimental evaluation is a major research methodology for investigating clustering algorithms and many other machine learning algorithms. For this purpose, a number of benchmark datasets have been widely used in the literature, and their quality plays a key role in the value of the research work. However, in most of the existing studies, little attention has been paid to the properties of the datasets, and they are often regarded as black-box problems. For example, it is common to use datasets intended for classification in clustering research and to assume class labels as the ground truth for judging the quality of clustering. In our work, with the help of advanced visualization and dimension reduction techniques, we show that this practice may seriously compromise the research quality and produce misleading results. We suggest that the applicability of existing benchmark datasets should be carefully revisited, and that significant effort needs to be devoted to improving the current practice of experimental evaluation of clustering algorithms to ensure an essential match between algorithms and problems.
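To make the practice under discussion concrete, below is a minimal illustrative sketch (not taken from the paper) of how a classification dataset is commonly reused for clustering evaluation: k-means is run on a standard classification benchmark, its output is scored against the class labels with the adjusted Rand index, and a dimension-reduction view is used to inspect whether those labels actually correspond to separable cluster structure. The specific choices of the wine dataset, scikit-learn, and t-SNE are assumptions made here for illustration only.

```python
# Illustrative sketch (assumptions: scikit-learn, the wine dataset, t-SNE):
# the common practice of clustering a classification dataset and judging the
# result against its class labels, followed by a 2-D projection to check
# whether those labels match the cluster structure actually present.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# A classification dataset reused as a clustering benchmark.
X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Cluster, then score against class labels as if they were ground truth.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("ARI vs. class labels:", adjusted_rand_score(y, labels))

# Dimension reduction to visually compare class labels with the
# clustering result in the same 2-D embedding.
emb = TSNE(n_components=2, random_state=0).fit_transform(X)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(emb[:, 0], emb[:, 1], c=y, s=10)
axes[0].set_title("t-SNE colored by class labels")
axes[1].scatter(emb[:, 0], emb[:, 1], c=labels, s=10)
axes[1].set_title("t-SNE colored by k-means clusters")
plt.show()
```

A visible mismatch between the two panels, or a low ARI, would illustrate the paper's point that class labels of a classification dataset do not necessarily reflect its intrinsic cluster structure.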