Asynchronous Coagent Networks

James Kostas, Chris Nota, Philip Thomas
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:5426-5435, 2020.

Abstract

Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hierarchical reinforcement learning algorithms like the option-critic, and eliminate the need for complex derivations of customized learning rules for these algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-kostas20a,
  title     = {Asynchronous Coagent Networks},
  author    = {Kostas, James and Nota, Chris and Thomas, Philip},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {5426--5435},
  year      = {2020},
  editor    = {Daumé III, Hal and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/kostas20a/kostas20a.pdf},
  url       = {https://proceedings.mlr.press/v119/kostas20a.html},
  abstract  = {Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hierarchical reinforcement learning algorithms like the option-critic, and eliminate the need for complex derivations of customized learning rules for these algorithms.}
}
Endnote
%0 Conference Paper
%T Asynchronous Coagent Networks
%A James Kostas
%A Chris Nota
%A Philip Thomas
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-kostas20a
%I PMLR
%P 5426--5435
%U https://proceedings.mlr.press/v119/kostas20a.html
%V 119
%X Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hierarchical reinforcement learning algorithms like the option-critic, and eliminate the need for complex derivations of customized learning rules for these algorithms.
APA
Kostas, J., Nota, C. & Thomas, P. (2020). Asynchronous Coagent Networks. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:5426-5435. Available from https://proceedings.mlr.press/v119/kostas20a.html.
