Generalization to New Actions in Reinforcement Learning

Ayush Jain, Andrew Szot, Joseph Lim
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4661-4672, 2020.

Abstract

A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances, such as making decisions from new action choices. However, standard reinforcement learning assumes a fixed set of actions and requires expensive retraining when given a new action set. To make learning agents more adaptable, we introduce the problem of zero-shot generalization to new actions. We propose a two-stage framework where the agent first infers action representations from action information acquired separately from the task. A policy flexible to varying action sets is then trained with generalization objectives. We benchmark generalization on sequential tasks, such as selecting from an unseen tool-set to solve physical reasoning puzzles and stacking towers with novel 3D shapes. Videos and code are available at https://sites.google.com/view/action-generalization.
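
The idea of a policy that is "flexible to varying action sets" can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes action representations are already available as fixed vectors (the first stage, inferring them, is omitted), and the bilinear scoring function, the `weights` matrix, and all function names are hypothetical choices made for illustration. The point is only that a policy which scores each available action from its representation can be applied to an action set of any size, including actions it was not trained on.

```python
import math
import random


def score(state, action_embedding, weights):
    """Hypothetical bilinear utility: state^T * weights * action_embedding."""
    return sum(
        s * sum(w * a for w, a in zip(row, action_embedding))
        for s, row in zip(state, weights)
    )


def action_probabilities(state, available_embeddings, weights):
    """Softmax over whichever actions are currently available."""
    scores = [score(state, e, weights) for e in available_embeddings]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(state, available_embeddings, weights, rng=random):
    """Sample the index of one action from the available set."""
    probs = action_probabilities(state, available_embeddings, weights)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative values only (assumed, not taken from the paper).
weights = [[1.0, 0.0], [0.0, 1.0]]
state = [1.0, 0.5]
train_actions = [[1.0, 0.0], [0.0, 1.0]]             # set seen in training
new_actions = [[0.5, 0.5], [1.0, 1.0], [-1.0, 0.0]]  # unseen set, different size

train_probs = action_probabilities(state, train_actions, weights)
new_probs = action_probabilities(state, new_actions, weights)
chosen = sample_action(state, new_actions, weights, random.Random(0))
```

Because the same `weights` are reused for both calls, nothing is retrained when the action set changes; only the list of embeddings passed in differs.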

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-jain20b,
  title = {Generalization to New Actions in Reinforcement Learning},
  author = {Jain, Ayush and Szot, Andrew and Lim, Joseph},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {4661--4672},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/jain20b/jain20b.pdf},
  url = {https://proceedings.mlr.press/v119/jain20b.html},
  abstract = {A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances, such as making decisions from new action choices. However, standard reinforcement learning assumes a fixed set of actions and requires expensive retraining when given a new action set. To make learning agents more adaptable, we introduce the problem of zero-shot generalization to new actions. We propose a two-stage framework where the agent first infers action representations from action information acquired separately from the task. A policy flexible to varying action sets is then trained with generalization objectives. We benchmark generalization on sequential tasks, such as selecting from an unseen tool-set to solve physical reasoning puzzles and stacking towers with novel 3D shapes. Videos and code are available at https://sites.google.com/view/action-generalization.}
}
Endnote
%0 Conference Paper
%T Generalization to New Actions in Reinforcement Learning
%A Ayush Jain
%A Andrew Szot
%A Joseph Lim
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-jain20b
%I PMLR
%P 4661--4672
%U https://proceedings.mlr.press/v119/jain20b.html
%V 119
%X A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances, such as making decisions from new action choices. However, standard reinforcement learning assumes a fixed set of actions and requires expensive retraining when given a new action set. To make learning agents more adaptable, we introduce the problem of zero-shot generalization to new actions. We propose a two-stage framework where the agent first infers action representations from action information acquired separately from the task. A policy flexible to varying action sets is then trained with generalization objectives. We benchmark generalization on sequential tasks, such as selecting from an unseen tool-set to solve physical reasoning puzzles and stacking towers with novel 3D shapes. Videos and code are available at https://sites.google.com/view/action-generalization.
APA
Jain, A., Szot, A. & Lim, J. (2020). Generalization to New Actions in Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:4661-4672. Available from https://proceedings.mlr.press/v119/jain20b.html.
