research-article
DOI: 10.5555/2615731.2617452

Exploiting separability in multiagent planning with continuous-state MDPs

Published: 05 May 2014

Abstract

Recent years have seen significant advances in techniques for optimally solving multiagent problems represented as decentralized partially observable Markov decision processes (Dec-POMDPs). A new method achieves scalability gains by converting Dec-POMDPs into continuous-state MDPs. This method relies on the assumption of a centralized planning phase that generates a set of decentralized policies for the agents to execute. However, scalability remains limited when the number of agents or problem variables becomes large. In this paper, we show that, under certain separability conditions of the optimal value function, the scalability of this approach can increase considerably. This separability arises when interactions are local, a property that other approaches (such as those based on the ND-POMDP subclass) have already shown can be exploited to improve performance. Unlike most previous methods, the novel continuous-state MDP algorithm retains optimality and convergence guarantees. Results show that the extension using separability can scale to a large number of agents and domain variables while maintaining optimality.
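To give a concrete sense of why separability matters, the sketch below (not the paper's algorithm; `local_value` and its neighbor structure are hypothetical) contrasts a flat value table over all joint states with a value function that decomposes into a sum of local terms. With locality of interaction, storage and per-query evaluation cost drop from exponential to linear in the number of agents:

```python
# Illustrative sketch, assuming a value function over n binary agent
# variables that separates into a sum of local terms, one per
# neighboring pair of agents (locality of interaction).
from itertools import product

n = 10  # number of agents; each local state is 0 or 1

def local_value(i, s_i, s_next):
    # Hypothetical local contribution: depends only on agent i's state
    # and its right neighbor's state, not on the full joint state.
    return (1.5 if s_i == s_next else -0.5) + 0.1 * i

def separable_value(state):
    # O(n) evaluation: a sum of n local terms over neighboring pairs.
    return sum(local_value(i, state[i], state[(i + 1) % n]) for i in range(n))

# A non-separable representation needs a flat table with 2**n entries.
flat_table = {s: separable_value(s) for s in product((0, 1), repeat=n)}

state = (1, 0) * (n // 2)
assert abs(flat_table[state] - separable_value(state)) < 1e-9
print(len(flat_table), "joint entries vs", 2 * n, "local-term lookups per query")
```

The same value is recovered either way; the separable form simply avoids ever materializing the exponential joint table, which is the kind of structure the paper's separability conditions let the continuous-state MDP solver exploit.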


Published In

AAMAS '14: Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems
May 2014, 1774 pages
ISBN: 9781450327381

Sponsor: IFAAMAS
Publisher: International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC


    Author Tags

    1. cooperative multiagent systems
    2. decentralized pomdps
    3. nd-pomdps
    4. planning under uncertainty


Conference

AAMAS '14

Acceptance Rates

AAMAS '14: 169 of 709 submissions accepted (24%)
Overall: 1,155 of 5,036 submissions accepted (23%)

Cited By
• (2023) Abstracting imperfect information away from two-player zero-sum games. Proceedings of the 40th International Conference on Machine Learning, pp. 32169-32193. DOI: 10.5555/3618408.3619741. Online publication date: 23-Jul-2023.
• (2015) Exploiting separability in multiagent planning with continuous-state MDPs. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 4254-4260. DOI: 10.5555/2832747.2832850. Online publication date: 25-Jul-2015.
• (2015) Factored upper bounds for multiagent planning problems under uncertainty with non-factored value functions. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 1645-1651. DOI: 10.5555/2832415.2832478. Online publication date: 25-Jul-2015.
• (2015) Structural results for cooperative decentralized control models. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 46-52. DOI: 10.5555/2832249.2832256. Online publication date: 25-Jul-2015.
• (2015) Effective Approximations for Multi-Robot Coordination in Spatially Distributed Tasks. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, pp. 881-890. DOI: 10.5555/2772879.2773265. Online publication date: 4-May-2015.
