Robust Graph Regularized Nonnegative Matrix Factorization for Clustering

Published: 06 March 2017

Abstract

Matrix factorization is often used for data representation in many data mining and machine-learning problems. In particular, for a dataset without any negative entries, nonnegative matrix factorization (NMF) is often used to find a low-rank approximation by the product of two nonnegative matrices. With reduced dimensions, these matrices can be effectively used for many applications such as clustering. The existing methods of NMF are often afflicted with their sensitivity to outliers and noise in the data. To mitigate this drawback, in this paper, we consider integrating NMF into a robust principal component model, and design a robust formulation that effectively captures noise and outliers in the approximation while incorporating essential nonlinear structures. A set of comprehensive empirical evaluations in clustering applications demonstrates that the proposed method has strong robustness to gross errors and superior performance to current state-of-the-art methods.
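
To make the starting point of the abstract concrete, the following is a minimal sketch of plain NMF: a nonnegative data matrix X is approximated by the product of two nonnegative low-rank factors, here using the standard multiplicative update rules for the Frobenius-norm objective. It is not the robust, graph-regularized formulation proposed in the article, which is not spelled out on this page. The function name `nmf_multiplicative`, the parameter names, and the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Plain NMF sketch: approximate X (features x samples) by U @ V.T.

    Minimizes ||X - U V^T||_F^2 with multiplicative updates. This is the
    baseline model only, NOT the robust method described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Strictly positive random initialization keeps the factors nonnegative.
    U = rng.random((n_features, rank)) + eps
    V = rng.random((n_samples, rank)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a ratio of nonnegative terms,
        # so U and V can never become negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

# Hypothetical usage on synthetic nonnegative data of rank 3.
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 30))
U, V = nmf_multiplicative(X, rank=3)
relative_error = np.linalg.norm(X - U @ V.T) / np.linalg.norm(X)
# One simple way to read clusters off the reduced representation:
# assign each sample to its largest coefficient in V.
labels = V.argmax(axis=1)
```

Because the objective is a plain squared error, a single grossly corrupted entry of X can dominate the fit, which is the sensitivity to outliers and noise that the abstract sets out to mitigate.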

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 3
August 2017
372 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3058790
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 March 2017
Accepted: 01 September 2016
Revised: 01 May 2016
Received: 01 October 2015
Published in TKDD Volume 11, Issue 3

Author Tags

  1. Nonnegative factorization
  2. clustering
  3. manifold
  4. robust principal component analysis

Qualifiers

  • Research-article
  • Research
  • Refereed
