DOI: 10.5555/1795114.1795128
Research article · Free access

L2 regularization for learning kernels

Published: 18 June 2009

Abstract

The choice of the kernel is critical to the success of many learning algorithms, but it is typically left to the user. Instead, the training data can be used to learn the kernel by selecting it out of a given family, such as that of non-negative linear combinations of p base kernels, constrained by a trace or L1 regularization. This paper studies the problem of learning kernels with the same family of kernels but with an L2 regularization instead, and for regression problems. We analyze the problem of learning kernels with ridge regression. We derive the form of the solution of the optimization problem and give an efficient iterative algorithm for computing that solution. We present a novel theoretical analysis of the problem based on stability and give learning bounds for orthogonal kernels that contain only an additive term O(√p/m) when compared to the standard kernel ridge regression stability bound. We also report the results of experiments indicating that L1 regularization can lead to modest improvements for a small number of kernels, but to performance degradations in larger-scale cases. In contrast, L2 regularization never degrades performance and in fact achieves significant improvements with a large number of kernels.
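The abstract's setup — a non-negative linear combination of p base kernels, learned jointly with a kernel ridge regression solution via an iterative algorithm — can be illustrated with a short alternating-update sketch. This is a minimal NumPy illustration under assumed details, not the paper's exact algorithm: the function name `l2_mkl_krr`, the parameter names `lam` (ridge parameter) and `Lambda` (radius of the L2 ball on the kernel weights), and the specific weight-update rule are all assumptions made for the example.

```python
import numpy as np

def l2_mkl_krr(Ks, y, lam=1.0, Lambda=1.0, n_iters=20):
    """Illustrative sketch: alternate between (a) the KRR dual solution for a
    fixed non-negative combination of base kernels and (b) a closed-form
    rescaling of the combination weights onto an L2 ball of radius Lambda.

    Ks : list of p symmetric PSD kernel matrices, each of shape (m, m)
    y  : target vector of shape (m,)
    """
    m = len(y)
    p = len(Ks)
    mu = np.ones(p) / p  # initial non-negative kernel weights
    for _ in range(n_iters):
        # (a) KRR dual solution for the current combined kernel.
        K = sum(w * Kk for w, Kk in zip(mu, Ks))
        alpha = np.linalg.solve(K + lam * np.eye(m), y)
        # (b) Per-kernel quadratic terms alpha' K_k alpha (non-negative for
        # PSD kernels) drive the weight update; projecting onto the L2 ball
        # keeps ||mu||_2 <= Lambda with mu >= 0.
        v = np.array([alpha @ Kk @ alpha for Kk in Ks])
        mu = np.maximum(Lambda * v / (np.linalg.norm(v) + 1e-12), 0.0)
    return mu, alpha
```

Each iteration costs one m×m linear solve plus p quadratic forms, so the loop stays cheap for moderate m even when p is large — consistent with the abstract's interest in the many-kernels regime.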



Published In

UAI '09: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
June 2009, 667 pages
ISBN: 9780974903958

Sponsors

  • Google Inc.
  • IBM Research
  • Intel
  • Microsoft Research

Publisher

AUAI Press

Arlington, Virginia, United States

