Efficient computations via scalable sparse kernel partial least squares and boosted latent features

Published: 21 August 2005 · DOI: 10.1145/1081870.1081952

Abstract

Kernel partial least squares (KPLS) is a generic kernel regression method that has proven competitive with other kernel regression methods such as support vector machines for regression (SVM) and kernel ridge regression. Kernel boosted latent features (KBLF) is a variant of KPLS that accommodates any differentiable convex loss function, giving a more flexible framework for predictive modeling tasks such as classification with the logistic loss and robust regression with the L1-norm loss. However, KPLS and KBLF solutions are dense and thus unsuitable for large-scale computation. Sparsification of KPLS solutions has been studied in both the dual and the primal form; dual sparsification, however, requires solving a nonlinear optimization problem at every iteration, and this computational burden limits its applicability to general regression tasks. In this paper, we propose simple heuristics for approximating sparse KPLS solutions, and we apply the same framework to sparsify KBLF solutions. The algorithm provides an interesting "path" from an algorithm based on a maximum residual criterion with orthogonality conditions to dense KPLS/KBLF. The orthogonality conditions distinguish it from many existing forward selection-type algorithms. The computational advantage is illustrated on benchmark datasets, including a comparison to SVM.
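The abstract does not spell out the algorithm, but the idea it sketches (build each dual weight from only the largest-magnitude residual entries, then orthogonalize the resulting latent feature) can be illustrated with a minimal numpy sketch. Everything below — the function name sparse_kpls_fit, the top-k selection rule, and the deflation scheme — is an illustrative assumption, not the paper's exact method:

```python
import numpy as np

def sparse_kpls_fit(K, y, n_components=10, k=1):
    """Greedy sparse KPLS sketch: at each iteration the dual weight
    keeps only the k largest-magnitude residual entries (a
    maximum-residual criterion). k = 1 is the sparsest extreme;
    k = n recovers the dense KPLS direction."""
    n = K.shape[0]
    r = y.astype(float).copy()          # current residual
    T = np.zeros((n, n_components))     # orthonormal latent features
    U = np.zeros((n, n_components))     # sparse dual weights
    for i in range(n_components):
        u = np.zeros(n)
        support = np.argsort(np.abs(r))[-k:]   # max-residual selection
        u[support] = r[support]
        t = K @ u                         # candidate latent feature
        t -= T[:, :i] @ (T[:, :i].T @ t)  # orthogonalize against previous features
        t /= np.linalg.norm(t)
        r -= t * (t @ r)                  # deflate residual by the new feature
        T[:, i] = t
        U[:, i] = u
    return T, U

# Toy usage: RBF kernel on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists)
T, U = sparse_kpls_fit(K, y, n_components=5, k=3)
y_hat = T @ (T.T @ y)                     # in-sample fit from latent features
print("train MSE:", np.mean((y - y_hat) ** 2))
```

In this sketch, k traces the "path" the abstract mentions: k = 1 gives a greedy maximum-residual algorithm, while k = n reproduces dense KPLS; the explicit orthogonalization of each latent feature against its predecessors is what separates this scheme from plain forward selection.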


    Published In

    KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
    August 2005
    844 pages
ISBN: 159593135X
DOI: 10.1145/1081870

Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. boosted latent features
    2. partial least squares
    3. scalable and sparse kernel method
