
Follow the leader if you can, hedge if you must

Published: 01 January 2014

Abstract

Follow-the-Leader (FTL) is an intuitive sequential prediction strategy that guarantees constant regret in the stochastic setting, but has poor performance for worst-case data. Other hedging strategies have better worst-case guarantees but may perform much worse than FTL if the data are not maximally adversarial. We introduce the FlipFlop algorithm, which is the first method that provably combines the best of both worlds. As a stepping stone for our analysis, we develop AdaHedge, which is a new way of dynamically tuning the learning rate in Hedge without using the doubling trick. AdaHedge refines a method by Cesa-Bianchi, Mansour, and Stoltz (2007), yielding improved worst-case guarantees. By interleaving AdaHedge and FTL, FlipFlop achieves regret within a constant factor of the FTL regret, without sacrificing AdaHedge's worst-case guarantees. AdaHedge and FlipFlop do not need to know the range of the losses in advance; moreover, unlike earlier methods, both have the intuitive property that the issued weights are invariant under rescaling and translation of the losses. The losses are also allowed to be negative, in which case they may be interpreted as gains.
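The trade-off the abstract describes can be made concrete with a small sketch of the two extremes it interpolates between. The snippet below is an illustration only, not the paper's AdaHedge or FlipFlop algorithms: FTL plays the expert with the smallest cumulative loss so far, while Hedge with a finite learning rate eta spreads its weights exponentially; FTL is recovered in the limit eta → ∞.

```python
import numpy as np

def ftl_weights(cum_losses):
    """Follow-the-Leader: put all mass on the expert(s) with the
    smallest cumulative loss so far; ties are split uniformly."""
    best = (cum_losses == cum_losses.min()).astype(float)
    return best / best.sum()

def hedge_weights(cum_losses, eta):
    """Hedge with a fixed learning rate eta: exponential weights.
    Subtracting the minimum is a standard numerical-stability shift
    and does not change the normalized weights."""
    w = np.exp(-eta * (cum_losses - cum_losses.min()))
    return w / w.sum()

# Two experts; expert 0 is slightly ahead after a few rounds.
cum = np.array([2.0, 3.0])
print(ftl_weights(cum))          # [1. 0.]: FTL commits fully
print(hedge_weights(cum, 1.0))   # ≈ [0.731 0.269]: Hedge still hedges
```

On benign (e.g., stochastic) data, committing like FTL keeps the regret constant; on adversarial data the leader can be made to change every round, which is why a hedged weighting with a well-tuned eta is needed as a fallback. AdaHedge's contribution is to tune eta online; FlipFlop's is to interleave the two regimes.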

References

[1]
Jean-Yves Audibert. PAC-Bayesian statistical learning theory. PhD thesis, Université Paris VI, 2004.
[2]
Peter Auer, Nicolò Cesa-Bianchi, and Claudio Gentile. Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences, 64:48-75, 2002.
[3]
Olivier Catoni. PAC-Bayesian Supervised Classification. Lecture Notes-Monograph Series. IMS, 2007.
[4]
Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, learning, and games. Cambridge University Press, 2006.
[5]
Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth. How to use expert advice. Journal of the ACM, 44(3):427-485, 1997.
[6]
Nicolò Cesa-Bianchi, Yishay Mansour, and Gilles Stoltz. Improved second-order bounds for prediction with expert advice. Machine Learning, 66(2/3):321-352, 2007.
[7]
Kamalika Chaudhuri, Yoav Freund, and Daniel Hsu. A parameter-free hedging algorithm. In Advances in Neural Information Processing Systems 22 (NIPS 2009), pages 297-305, 2009.
[8]
Alexey V. Chernov and Vladimir Vovk. Prediction with advice of unknown number of experts. In Peter Grünwald and Peter Spirtes, editors, UAI, pages 117-125. AUAI Press, 2010.
[9]
Marie Devaine, Pierre Gaillard, Yannig Goude, and Gilles Stoltz. Forecasting electricity consumption by aggregating specialized experts; a review of the sequential aggregation of specialized experts, with an application to Slovakian and French country-wide one-day-ahead (half-)hourly predictions. Machine Learning, 90(2):231-260, February 2013.
[10]
Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:119-139, 1997.
[11]
Yoav Freund and Robert E. Schapire. Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29:79-103, 1999.
[12]
Sébastien Gerchinovitz. Prédiction de suites individuelles et cadre statistique classique: étude de quelques liens autour de la régression parcimonieuse et des techniques d'agrégation. PhD thesis, Université Paris-Sud, 2011.
[13]
Peter Grünwald. Safe learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity. In Proceedings of the 24th International Conference on Learning Theory (COLT 2011), pages 551-573, 2011.
[14]
Peter Grünwald. The safe Bayesian: learning the learning rate via the mixability gap. In Proceedings of the 23rd International Conference on Algorithmic Learning Theory (ALT 2012), 2012.
[15]
Peter Grünwald and John Langford. Suboptimal behavior of Bayes and MDL in classification under misspecification. Machine Learning, 66(2-3):119-149, 2007. DOI 10.1007/s10994-007-0716-7.
[16]
László Györfi and György Ottucsák. Sequential prediction of unbounded stationary time series. IEEE Transactions on Information Theory, 53(5):1866-1872, 2007.
[17]
Elad Hazan and Satyen Kale. Extracting certainty from uncertainty: Regret bounded by variation in costs. In Proceedings of the 21st Annual Conference on Learning Theory (COLT), pages 57-67, 2008.
[18]
Marcus Hutter and Jan Poland. Adaptive online prediction by following the perturbed leader. Journal of Machine Learning Research, 6:639-660, 2005.
[19]
Adam Kalai and Santosh Vempala. Efficient algorithms for online decision problems. In Proceedings of the 16th Annual Conference on Learning Theory (COLT), pages 506-521, 2003.
[20]
Yuri Kalnishkan and Michael V. Vyugin. The weak aggregating algorithm and weak mixability. In Proceedings of the 18th Annual Conference on Learning Theory (COLT), pages 188-203, 2005.
[21]
Wouter M. Koolen, Manfred K. Warmuth, and Jyrki Kivinen. Hedging structured concepts. In A.T. Kalai and M. Mohri, editors, Proceedings of the 23rd Annual Conference on Learning Theory (COLT 2010), pages 93-105, 2010.
[22]
Nick Littlestone and Manfred K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212-261, 1994.
[23]
Koji Tsuda, Gunnar Rätsch, and Manfred K. Warmuth. Matrix exponentiated gradient updates for on-line learning and Bregman projection. Journal of Machine Learning Research, 6:995-1018, 2005.
[24]
Tim van Erven, Peter Grünwald, Wouter M. Koolen, and Steven de Rooij. Adaptive hedge. In Advances in Neural Information Processing Systems 24 (NIPS 2011), pages 1656-1664, 2011.
[25]
Vladimir Vovk. A game of prediction with expert advice. Journal of Computer and System Sciences, 56(2):153-173, 1998.
[26]
Vladimir Vovk. Competitive on-line statistics. International Statistical Review, 69(2):213-248, 2001.
[27]
Vladimir Vovk, Akimichi Takemura, and Glenn Shafer. Defensive forecasting. In Proceedings of AISTATS 2005, 2005. Archive version available at http://www.vovk.net/df.
[28]
Tong Zhang. Information theoretical upper and lower bounds for statistical estimation. IEEE Transactions on Information Theory, 52(4):1307-1321, 2006.


Published In

The Journal of Machine Learning Research, Volume 15, Issue 1
January 2014
4085 pages
ISSN: 1532-4435
EISSN: 1533-7928

Publisher

JMLR.org

Publication History

Published: 01 January 2014
Revised: 01 January 2014

Author Tags

  1. hedge
  2. learning rate
  3. mixability
  4. online learning
  5. prediction with expert advice
