LTL2Action: Generalizing LTL Instructions for Multi-Task RL

Pashootan Vaezipoor, Andrew C Li, Rodrigo A Toro Icarte, Sheila A. McIlraith
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10497-10508, 2021.

Abstract

We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language, linear temporal logic (LTL), and can specify a diversity of complex, temporally extended behaviours, including conditionals and alternative realizations. Our proposed learning approach exploits the compositional syntax and the semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions, not observed during training. To reduce the overhead of learning LTL semantics, we introduce an environment-agnostic LTL pretraining scheme which improves sample-efficiency in downstream environments. Experiments on discrete and continuous domains target combinatorial task sets of up to $\sim10^{39}$ unique tasks and demonstrate the strength of our approach in learning to solve (unseen) tasks, given LTL instructions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-vaezipoor21a,
  title = {LTL2Action: Generalizing LTL Instructions for Multi-Task RL},
  author = {Vaezipoor, Pashootan and Li, Andrew C and Icarte, Rodrigo A Toro and McIlraith, Sheila A.},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {10497--10508},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/vaezipoor21a/vaezipoor21a.pdf},
  url = {https://proceedings.mlr.press/v139/vaezipoor21a.html},
  abstract = {We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language {–} linear temporal logic (LTL) {–} and can specify a diversity of complex, temporally extended behaviours, including conditionals and alternative realizations. Our proposed learning approach exploits the compositional syntax and the semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions, not observed during training. To reduce the overhead of learning LTL semantics, we introduce an environment-agnostic LTL pretraining scheme which improves sample-efficiency in downstream environments. Experiments on discrete and continuous domains target combinatorial task sets of up to $\sim10^{39}$ unique tasks and demonstrate the strength of our approach in learning to solve (unseen) tasks, given LTL instructions.}
}
Endnote
%0 Conference Paper
%T LTL2Action: Generalizing LTL Instructions for Multi-Task RL
%A Pashootan Vaezipoor
%A Andrew C Li
%A Rodrigo A Toro Icarte
%A Sheila A. McIlraith
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-vaezipoor21a
%I PMLR
%P 10497--10508
%U https://proceedings.mlr.press/v139/vaezipoor21a.html
%V 139
%X We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language, linear temporal logic (LTL), and can specify a diversity of complex, temporally extended behaviours, including conditionals and alternative realizations. Our proposed learning approach exploits the compositional syntax and the semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions, not observed during training. To reduce the overhead of learning LTL semantics, we introduce an environment-agnostic LTL pretraining scheme which improves sample-efficiency in downstream environments. Experiments on discrete and continuous domains target combinatorial task sets of up to $\sim10^{39}$ unique tasks and demonstrate the strength of our approach in learning to solve (unseen) tasks, given LTL instructions.
APA
Vaezipoor, P., Li, A.C., Icarte, R.A.T. & McIlraith, S.A. (2021). LTL2Action: Generalizing LTL Instructions for Multi-Task RL. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10497-10508. Available from https://proceedings.mlr.press/v139/vaezipoor21a.html.

Related Material