Efficient computations via scalable sparse kernel partial least squares and boosted latent features

Published: 21 August 2005 · DOI: 10.1145/1081870.1081952

Abstract

Kernel partial least squares (KPLS) is a generic kernel regression method that has proven competitive with other kernel regression methods such as support vector machines for regression (SVM) and kernel ridge regression. Kernel boosted latent features (KBLF) is a variant of KPLS that accommodates any differentiable convex loss function, giving a more flexible framework for predictive modeling tasks such as classification with the logistic loss and robust regression with the L1-norm loss. However, KPLS and KBLF solutions are dense and thus unsuitable for large-scale computation. Sparsification of KPLS solutions has been studied in both the dual and the primal form; dual sparsification, however, requires solving a nonlinear optimization problem at every iteration, and this computational burden limits its applicability to general regression tasks. In this paper, we propose simple heuristics for approximating sparse KPLS solutions, and we apply the same framework to sparsify KBLF solutions. The algorithm provides an interesting "path" from an algorithm based on a maximum residual criterion with orthogonality conditions to dense KPLS/KBLF. The orthogonality conditions distinguish it from many existing forward selection-type algorithms. The computational advantage is illustrated on benchmark datasets, including a comparison to SVM.
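The abstract does not spell out the algorithm, but the idea it sketches (build each dual weight from only the largest-magnitude residual entries, then orthogonalize the resulting latent feature) can be illustrated with a minimal numpy sketch. Everything below — the function name sparse_kpls_fit, the top-k selection rule, and the deflation scheme — is an illustrative assumption, not the paper's exact method:

```python
import numpy as np

def sparse_kpls_fit(K, y, n_components=10, k=1):
    """Greedy sparse KPLS sketch: at each iteration the dual weight
    keeps only the k largest-magnitude residual entries (a
    maximum-residual criterion). k = 1 is the sparsest extreme;
    k = n recovers the dense KPLS direction."""
    n = K.shape[0]
    r = y.astype(float).copy()          # current residual
    T = np.zeros((n, n_components))     # orthonormal latent features
    U = np.zeros((n, n_components))     # sparse dual weights
    for i in range(n_components):
        u = np.zeros(n)
        support = np.argsort(np.abs(r))[-k:]   # max-residual selection
        u[support] = r[support]
        t = K @ u                         # candidate latent feature
        t -= T[:, :i] @ (T[:, :i].T @ t)  # orthogonalize against previous features
        t /= np.linalg.norm(t)
        r -= t * (t @ r)                  # deflate residual by the new feature
        T[:, i] = t
        U[:, i] = u
    return T, U

# Toy usage: RBF kernel on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists)
T, U = sparse_kpls_fit(K, y, n_components=5, k=3)
y_hat = T @ (T.T @ y)                     # in-sample fit from latent features
print("train MSE:", np.mean((y - y_hat) ** 2))
```

In this sketch, k traces the "path" the abstract mentions: k = 1 gives a greedy maximum-residual algorithm, while k = n reproduces dense KPLS; the explicit orthogonalization of each latent feature against its predecessors is what separates this scheme from plain forward selection.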


    Published In

    KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
    August 2005
    844 pages
ISBN: 159593135X
DOI: 10.1145/1081870

Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. boosted latent features
    2. partial least squares
    3. scalable and sparse kernel method
