Online Non-Additive Path Learning under Full and Partial Information

Corinna Cortes; Vitaly Kuznetsov; Mehryar Mohri; Holakou Rahmanian; Manfred Warmuth

Proceedings of the 30th International Conference on Algorithmic Learning Theory, PMLR 98:274-299, 2019.

Abstract

We study the problem of online path learning with non-additive gains, which is a central problem appearing in several applications, including ensemble structured prediction. We present new online algorithms for path learning with non-additive count-based gains for the three settings of full information, semi-bandit and full bandit with very favorable regret guarantees. A key component of our algorithms is the definition and computation of an intermediate context-dependent automaton that enables us to use existing algorithms designed for additive gains. We further apply our methods to the important application of ensemble structured prediction. Finally, beyond count-based gains, we give an efficient implementation of the EXP3 algorithm for the full bandit setting with an arbitrary (non-additive) gain.

Cite this Paper

BibTeX


@InProceedings{pmlr-v98-cortes19a,
  title = 	 {Online Non-Additive Path Learning under Full and Partial Information},
  author =       {Cortes, Corinna and Kuznetsov, Vitaly and Mohri, Mehryar and Rahmanian, Holakou and Warmuth, Manfred},
  booktitle = 	 {Proceedings of the 30th International Conference on Algorithmic Learning Theory},
  pages = 	 {274--299},
  year = 	 {2019},
  editor = 	 {Garivier, Aurélien and Kale, Satyen},
  volume = 	 {98},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {22--24 Mar},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v98/cortes19a/cortes19a.pdf},
  url = 	 {https://proceedings.mlr.press/v98/cortes19a.html},
  abstract = 	 {  We study the problem of online path learning with non-additive
 gains, which is a central problem appearing in several applications,
 including ensemble structured prediction. We present new online
 algorithms for path learning with non-additive count-based gains for
 the three settings of full information, semi-bandit and full
 bandit with very favorable regret guarantees. A key component of
 our algorithms is the definition and computation of an intermediate
 context-dependent automaton that enables us to use existing
 algorithms designed for additive gains.  We further apply our
 methods to the important application of ensemble structured
 prediction.  Finally, beyond count-based gains, we give an efficient
 implementation of the EXP3 algorithm for the full bandit setting
 with an arbitrary (non-additive) gain.}
}

Endnote

%0 Conference Paper
%T Online Non-Additive Path Learning under Full and Partial Information
%A Corinna Cortes
%A Vitaly Kuznetsov
%A Mehryar Mohri
%A Holakou Rahmanian
%A Manfred Warmuth
%B Proceedings of the 30th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2019
%E Aurélien Garivier
%E Satyen Kale
%F pmlr-v98-cortes19a
%I PMLR
%P 274--299
%U https://proceedings.mlr.press/v98/cortes19a.html
%V 98
%X   We study the problem of online path learning with non-additive
 gains, which is a central problem appearing in several applications,
 including ensemble structured prediction. We present new online
 algorithms for path learning with non-additive count-based gains for
 the three settings of full information, semi-bandit and full
 bandit with very favorable regret guarantees. A key component of
 our algorithms is the definition and computation of an intermediate
 context-dependent automaton that enables us to use existing
 algorithms designed for additive gains.  We further apply our
 methods to the important application of ensemble structured
 prediction.  Finally, beyond count-based gains, we give an efficient
 implementation of the EXP3 algorithm for the full bandit setting
 with an arbitrary (non-additive) gain.

APA


Cortes, C., Kuznetsov, V., Mohri, M., Rahmanian, H., & Warmuth, M. (2019). Online Non-Additive Path Learning under Full and Partial Information. Proceedings of the 30th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 98:274-299. Available from https://proceedings.mlr.press/v98/cortes19a.html.
