DOI: 10.1145/1553374.1553531
research-article

Prototype vector machine for large scale semi-supervised learning

Published: 14 June 2009

Abstract

Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we propose the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation of both the graph-based regularizer and the model representation. The choice of prototypes is grounded on two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.
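
The abstract does not spell out how the prototypes are computed or used, so the following is only an illustrative sketch of the general idea of low-rank kernel approximation from a small set of prototypes. It is not the PVM algorithm. It assumes a Gaussian kernel, k-means centroids as prototypes, and a Nyström-style reconstruction K ≈ E W⁻¹ Eᵀ, where E holds kernel values between data points and prototypes and W holds kernel values among the prototypes. All function names, the `gamma` and `ridge` parameters, and the toy data are hypothetical choices made for this example.

```python
import math
import random


def sqdist(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kernel_matrix(xs, ys, gamma):
    """Gaussian kernel values exp(-gamma * ||x - y||^2) for all pairs (x, y)."""
    return [[math.exp(-gamma * sqdist(x, y)) for y in ys] for x in xs]


def kmeans(points, m, iters=20, seed=0):
    """Plain Lloyd's k-means; the m centroids act as prototypes (an assumption)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, m)]
    for _ in range(iters):
        clusters = [[] for _ in range(m)]
        for p in points:
            nearest = min(range(m), key=lambda j: sqdist(p, centers[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(a) + list(b) for a, b in zip(A, B)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def low_rank_kernel(points, prototypes, gamma=1.0, ridge=1e-10):
    """Return the n x n approximation E (W + ridge*I)^{-1} E^T."""
    E = kernel_matrix(points, prototypes, gamma)      # n x m
    W = kernel_matrix(prototypes, prototypes, gamma)  # m x m
    for i in range(len(W)):
        W[i][i] += ridge  # small diagonal term for numerical stability
    X = solve(W, [list(col) for col in zip(*E)])      # W^{-1} E^T, m x n
    return [[sum(e * x for e, x in zip(row, col)) for col in zip(*X)]
            for row in E]


if __name__ == "__main__":
    # Toy data: two Gaussian blobs in the plane, 30 points each.
    rng = random.Random(1)
    data = [(rng.gauss(c, 0.3), rng.gauss(c, 0.3))
            for c in (0.0, 3.0) for _ in range(30)]
    protos = kmeans(data, m=4)
    K = kernel_matrix(data, data, 1.0)
    K_hat = low_rank_kernel(data, protos)
    err = math.sqrt(sum((a - b) ** 2
                        for r, s in zip(K, K_hat) for a, b in zip(r, s)))
    norm = math.sqrt(sum(a * a for r in K for a in r))
    print(f"relative Frobenius error with 4 prototypes: {err / norm:.3f}")
```

With m prototypes for n points, only the n × m matrix E and the m × m matrix W need to be computed and stored in practice; the sketch forms the full n × n product solely so that the approximation error can be inspected on a toy example.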

Published In

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN:9781605585161
DOI:10.1145/1553374

Sponsors

  • NSF
  • Microsoft Research
  • MITACS

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICML '09
Sponsor:
  • Microsoft Research

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
