skip to main content
10.1145/1390156.1390212acmotherconferencesArticle/Chapter ViewAbstractPublication PagesicmlConference Proceedingsconference-collections
research-article

Efficient bandit algorithms for online multiclass prediction

Published: 05 July 2008 Publication History

Abstract

This paper introduces the Banditron, a variant of the Perceptron [Rosenblatt, 1958], for the multiclass bandit setting. The multiclass bandit setting models a wide range of practical supervised learning applications where the learner only receives partial feedback (referred to as "bandit" feedback, in the spirit of multi-armed bandit models) with respect to the true label (e.g. in many web applications users often only provide positive "click" feedback which does not necessarily fully disclose a true label). The Banditron has the ability to learn in a multiclass classification setting with the "bandit" feedback which only reveals whether or not the prediction made by the algorithm was correct or not (but does not necessarily reveal the true label). We provide (relative) mistake bounds which show how the Banditron enjoys favorable performance, and our experiments demonstrate the practicality of the algorithm. Furthermore, this paper pays close attention to the important special case when the data is linearly separable --- a problem which has been exhaustively studied in the full information setting yet is novel in the bandit setting.

References

[1]
R. I. Arriaga and S. Vempala. An algorithmic theory of learning: Robust concepts and random projection. Mach. Learn., 63(2), 2006.
[2]
P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. Gambling in a rigged casino: the adversarial multi-armed bandit problem. In Proceedings of the 36th Annual FOCS, 1998.
[3]
K. Crammer and Y. Singer. Ultraconservative online algorithms for multiclass problems. Journal of Machine Learning Research, 3:951--991, 2003.
[4]
K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive aggressive algorithms. Journal of Machine Learning Research, 7:551--585, Mar 2006.
[5]
R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. Wiley, 1973.
[6]
A. Elisseeff and J. Weston. A kernel method for multi-labeled classification. In Advances in Neural Information Processing Systems 14, 2001.
[7]
M. Fink, S. Shalev-Shwartz, Y. Singer, and S. Ullman. Online multiclass learning by interclass hypothesis sharing. In Proceedings of the 23rd International Conference on Machine Learning, 2006.
[8]
A. Flaxman, A. Kalai, and H. B. McMahan. Online convex optimization in the bandit setting: gradient descent without a gradient. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 385--394, 2005.
[9]
Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37 (3):277--296, 1999.
[10]
J. Kivinen and M. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1--64, January 1997.
[11]
R. D. Kleinberg. Nearly tight bounds for the continuumarmed bandit problem. NIPS, 2004.
[12]
J. Langford and T. Zhang. The epoch-greedy algorithm for contextual multi-armed bandits. NIPS, 2007.
[13]
N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285--318, 1988.
[14]
F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386--407, 1958. (Reprinted in Neurocomputing (MIT Press, 1988).).
[15]
S. Shalev-Shwartz and Y. Singer. A primal-dual perspective of online learning algorithms. Machine Learning Journal, 2007.
[16]
V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
[17]
J. Weston and C. Watkins. Support vector machines for multi-class pattern recognition. In Proceedings of the Seventh European Symposium on Artificial Neural Networks, April 1999.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICML '08: Proceedings of the 25th international conference on Machine learning
July 2008
1310 pages
ISBN:9781605582054
DOI:10.1145/1390156
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • Pascal
  • University of Helsinki
  • Xerox
  • Federation of Finnish Learned Societies
  • Google Inc.
  • NSF
  • Machine Learning Journal/Springer
  • Microsoft Research: Microsoft Research
  • Intel: Intel
  • Yahoo!
  • Helsinki Institute for Information Technology
  • IBM: IBM

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 July 2008

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article

Conference

ICML '08
Sponsor:
  • Microsoft Research
  • Intel
  • IBM

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)74
  • Downloads (Last 6 weeks)4
Reflects downloads up to 09 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media