skip to main content
10.5555/2997046.2997159guideproceedingsArticle/Chapter ViewAbstractPublication PagesnipsConference Proceedingsconference-collections
Article

Multiple kernel learning and the SMO algorithm

Published: 06 December 2010 Publication History

Abstract

Our objective is to train p-norm Multiple Kernel Learning (MKL) and, more generally, linear MKL regularised by the Bregman divergence, using the Sequential Minimal Optimization (SMO) algorithm. The SMO algorithm is simple, easy to implement and adapt, and efficiently scales to large problems. As a result, it has gained widespread acceptance and SVMs are routinely trained using SMO in diverse real world applications. Training using SMO has been a long standing goal in MKL for the very same reasons. Unfortunately, the standard MKL dual is not differentiable, and therefore can not be optimised using SMO style co-ordinate ascent. In this paper, we demonstrate that linear MKL regularised with the p-norm squared, or with certain Bregman divergences, can indeed be trained using SMO. The resulting algorithm retains both simplicity and efficiency and is significantly faster than state-of-the-art specialised p-norm MKL solvers. We show that we can train on a hundred thousand kernels in approximately seven minutes and on fifty thousand points in less than half an hour on a single core.

References

[1]
http://www.csie.ntu.edu.tw/ cjlin/libsvmtools/datasets/binary.html.
[2]
F. R. Bach. Exploring large feature spaces with hierarchical multiple kernel learning. In NIPS, pages 105-112, 2008.
[3]
F. R. Bach, G. R. G. Lanckriet, and M. I. Jordan. Multiple kernel learning, conic duality, and the SMO algorithm. In ICML, pages 6-13, 2004.
[4]
A. Ben-Tal, T. Margalit, and A. Nemirovski. The ordered subsets mirror descent optimization method with applications to tomography. SIAM Journal of Opimization, 12(1):79-108, 2001.
[5]
C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[6]
C. Cortes, M. Mohri, and A. Rostamizadeh. L2 regularization for learning kernels. In UAI, 2009.
[7]
C. Cortes, M. Mohri, and A. Rostamizadeh. Learning non-linear combinations of kernels. In NIPS, 2009.
[8]
R. E. Fan, P. H. Chen, and C. J. Lin. Working set selection using second order information for training SVM. JMLR, 6:1889-1918, 2005.
[9]
C. Gentile. Robustness of the p-norm algorithms. ML, 53(3):265-299, 2003.
[10]
M. Gonen and E. Alpaydin. Localized multiple kernel learning. In ICML, 2008.
[11]
J. Kivinen, M. K. Warmuth, andB. Hassibi. The p-norm generaliziation of the LMS algorithm for adaptive filtering. IEEE Trans. Signal Processing, 54(5):1782-1793, 2006.
[12]
M. Kloft, U. Brefeld, S. Sonnenburg, P. Laskov, K.-R. Muller, and A. Zien. Efficient and accurate lp-norm Multiple Kernel Learning. In NIPS, 2009.
[13]
G. R. G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M. I. Jordan. Learning the kernel matrix with semidefinite programming. JMLR, 5:27-72, 2004.
[14]
C. J. Lin, S. Lucidi, L. Palagi, A. Risi, and M. Sciandrone. Decomposition algorithm model for singly linearly-constrained problems subject to lower and upper bounds. JOTA, 141(1):107-126, 2009.
[15]
J. Platt. Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods - Support Vector Learning, pages 185-208, 1999.
[16]
A. Rakotomamonjy, F. Bach, Y. Grandvalet, and S. Canu. SimpleMKL. JMLR, 9:2491-2521, 2008.
[17]
S. Sonnenburg, G. Raetsch, C. Schaefer, and B. Schoelkopf. Large scale multiple kernel learning. JMLR, 7:1531-1565, 2006.
[18]
M. Varma and B. R. Babu. More generality in efficient multiple kernel learning. In ICML, 2009.
[19]
A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In ICCV, 2009.
[20]
S. V. N. Vishwanathan, Z. Sun, N. Theera-Ampornpunt, and M. Varma, 2010. The SMO-MKL code http://research.microsoft.com/~manik/code/SMO-MKL/download.html.
[21]
J. Yang, Y. Li, Y. Tian, L. Duan, and W. Gao. Group-sensitive multiple kernel learning for object categorization. In ICCV, 2009.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Guide Proceedings
NIPS'10: Proceedings of the 24th International Conference on Neural Information Processing Systems - Volume 2
December 2010
2630 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Publication History

Published: 06 December 2010

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media