Options as responses: Grounding behavioural hierarchies in multi-agent RL

Vezhnevets, Alexander Sasha; Wu, Yuhuai; Leblond, Remi; Leibo, Joel Z.


arXiv:1906.01470 (cs)

[Submitted on 4 Jun 2019 (v1), last revised 10 Jul 2020 (this version, v3)]

Title: Options as responses: Grounding behavioural hierarchies in multi-agent RL

Authors: Alexander Sasha Vezhnevets, Yuhuai Wu, Remi Leblond, Joel Z. Leibo


Abstract: This paper investigates generalisation in multi-agent games, where the generality of an agent can be evaluated by playing against opponents it has not seen during training. We propose two new games with concealed information and a complex, non-transitive reward structure (think rock/paper/scissors). We find that most current deep reinforcement learning methods fail to explore the strategy space efficiently, and so learn policies that generalise poorly to unseen opponents. We then propose a novel hierarchical agent architecture in which the hierarchy is grounded in the game-theoretic structure of the game: the top level chooses strategic responses to opponents, while the low level implements them as a policy over primitive actions. This grounding facilitates credit assignment across the levels of the hierarchy. Our experiments show that the proposed hierarchical agent is capable of generalisation to unseen opponents, while conventional baselines fail to generalise at all.
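As a minimal illustration of the "non-transitive reward structure" the abstract mentions, the sketch below encodes plain rock/paper/scissors and computes a best response to an opponent's mixed strategy. It is not one of the paper's two games and not the proposed agent architecture; the names `ACTIONS`, `PAYOFF`, `expected_payoffs`, `best_response`, and `beats` are hypothetical and chosen only for this example.

```python
# Illustrative only: plain rock/paper/scissors, not the games from the paper.
ACTIONS = ["rock", "paper", "scissors"]

# PAYOFF[i][j] is the row player's reward for playing ACTIONS[i]
# against ACTIONS[j]: +1 for a win, 0 for a draw, -1 for a loss.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def expected_payoffs(opponent_mix):
    """Expected reward of each pure action against a mixed strategy."""
    return [sum(p * q for p, q in zip(row, opponent_mix)) for row in PAYOFF]


def best_response(opponent_mix):
    """Pure action with the highest expected reward (first one on ties)."""
    values = expected_payoffs(opponent_mix)
    return ACTIONS[values.index(max(values))]


def beats(a, b):
    """True if action a wins against action b."""
    return PAYOFF[ACTIONS.index(a)][ACTIONS.index(b)] > 0
```

The payoffs form a cycle (paper beats rock, scissors beats paper, rock beats scissors), so no single action is best against every opponent, and the best response changes with the opponent's strategy. In this toy setting, choosing which response to play loosely corresponds to the role the abstract assigns to the top level of the hierarchy.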

Comments:	First two authors contributed equally
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
Cite as:	arXiv:1906.01470 [cs.LG]
	(or arXiv:1906.01470v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.01470
Journal reference:	International Conference on Machine Learning 2020

Submission history

From: Alexander Vezhnevets
[v1] Tue, 4 Jun 2019 14:18:47 UTC (206 KB)
[v2] Thu, 6 Jun 2019 15:10:59 UTC (206 KB)
[v3] Fri, 10 Jul 2020 13:31:16 UTC (654 KB)
