research-article

Weighted clustering ensemble: Towards learning the weights of the base clusterings

Published: 01 January 2017

Abstract

Clustering ensemble refers to the problem of obtaining a final clustering of a dataset by combining multiple partitions computed by different clustering algorithms. Clustering ensembles have emerged as a prominent method for improving the robustness of unsupervised classification solutions. The problem has received increasing attention in recent years, but little attention has been paid to weighting the combined clusterings without access to the original data. In this paper, we address the weighted clustering ensemble problem by defining an unsupervised method that computes the weight of each combined clustering without access to the original data. The weight of each base clustering is computed from its quality and the quality of its neighbouring clusterings. By exploiting the generated weights, the proposed method makes it possible to estimate the appropriate number of clusters of the final clustering before the combining step.
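The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that the quality of a base clustering can be approximated by its mean normalized mutual information (NMI) with the other ensemble members, that the "neighbouring" clusterings are the `k` most similar members, and that the two qualities are averaged with equal importance. The function names, the parameter `k`, and the toy ensemble are all hypothetical. Only label vectors are used, so the original data is never accessed.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy of a label list."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Normalized mutual information between two label lists."""
    n = len(a)
    ha, hb = entropy(a), entropy(b)
    if ha == 0.0 or hb == 0.0:
        # A single-cluster partition only matches another single-cluster one.
        return 1.0 if ha == hb else 0.0
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((c / n) * log(c * n / (ca[x] * cb[y]))
             for (x, y), c in joint.items())
    return mi / sqrt(ha * hb)


def clustering_weights(ensemble, k=2):
    """Assumed scheme: weight = mean of own quality and neighbours' quality."""
    m = len(ensemble)
    sim = [[nmi(ensemble[i], ensemble[j]) for j in range(m)] for i in range(m)]
    # Assumed quality measure: mean agreement with the other members.
    quality = [sum(sim[i][j] for j in range(m) if j != i) / (m - 1)
               for i in range(m)]
    raw = []
    for i in range(m):
        # Assumed neighbourhood: the k members most similar to clustering i.
        others = sorted((j for j in range(m) if j != i),
                        key=lambda j: sim[i][j], reverse=True)[:k]
        neighbour_quality = sum(quality[j] for j in others) / len(others)
        raw.append(0.5 * (quality[i] + neighbour_quality))
    total = sum(raw)
    return [w / total for w in raw]  # normalized so the weights sum to 1


def weighted_coassociation(ensemble, weights):
    """Evidence accumulation: weighted share of clusterings joining i and j."""
    n = len(ensemble[0])
    co = [[0.0] * n for _ in range(n)]
    for labels, w in zip(ensemble, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += w
    return co


# Toy ensemble of four base clusterings over six objects (made-up labels).
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, relabelled
    [0, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],  # disagrees with the rest of the ensemble
]
weights = clustering_weights(ensemble, k=2)
co = weighted_coassociation(ensemble, weights)
print([round(w, 3) for w in weights])
```

In this sketch the clustering that disagrees with the rest of the ensemble receives the smallest weight, and the weighted co-association matrix could then be passed to any consensus function, for example hierarchical clustering on `1 - co`.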


Published In

Multiagent and Grid Systems, Volume 13, Issue 4
2017
98 pages

Publisher

IOS Press, Netherlands


Author Tags

1. Clustering
2. clustering ensemble
3. weight
4. evidence accumulation
