research-article
DOI: 10.5555/2615731.2617452

Exploiting separability in multiagent planning with continuous-state MDPs

Published: 05 May 2014

Abstract

Recent years have seen significant advances in techniques for optimally solving multiagent problems represented as decentralized partially observable Markov decision processes (Dec-POMDPs). A new method achieves scalability gains by converting Dec-POMDPs into continuous-state MDPs. This method relies on the assumption of a centralized planning phase that generates a set of decentralized policies for the agents to execute. However, scalability remains limited when the number of agents or problem variables becomes large. In this paper, we show that, under certain separability conditions of the optimal value function, the scalability of this approach can increase considerably. This separability arises when interactions are local, a property that other approaches (such as those based on the ND-POMDP subclass) have already shown can be exploited to improve performance. Unlike most previous methods, the novel continuous-state MDP algorithm retains optimality and convergence guarantees. Results show that the extension using separability can scale to a large number of agents and domain variables while maintaining optimality.
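To give a concrete sense of why separability matters, the sketch below (not the paper's algorithm; `local_value` and its neighbor structure are hypothetical) contrasts a flat value table over all joint states with a value function that decomposes into a sum of local terms. With locality of interaction, storage and per-query evaluation cost drop from exponential to linear in the number of agents:

```python
# Illustrative sketch, assuming a value function over n binary agent
# variables that separates into a sum of local terms, one per
# neighboring pair of agents (locality of interaction).
from itertools import product

n = 10  # number of agents; each local state is 0 or 1

def local_value(i, s_i, s_next):
    # Hypothetical local contribution: depends only on agent i's state
    # and its right neighbor's state, not on the full joint state.
    return (1.5 if s_i == s_next else -0.5) + 0.1 * i

def separable_value(state):
    # O(n) evaluation: a sum of n local terms over neighboring pairs.
    return sum(local_value(i, state[i], state[(i + 1) % n]) for i in range(n))

# A non-separable representation needs a flat table with 2**n entries.
flat_table = {s: separable_value(s) for s in product((0, 1), repeat=n)}

state = (1, 0) * (n // 2)
assert abs(flat_table[state] - separable_value(state)) < 1e-9
print(len(flat_table), "joint entries vs", 2 * n, "local-term lookups per query")
```

The same value is recovered either way; the separable form simply avoids ever materializing the exponential joint table, which is the kind of structure the paper's separability conditions let the continuous-state MDP solver exploit.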


Published In

AAMAS '14: Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems
May 2014, 1774 pages
ISBN: 9781450327381

Sponsor: IFAAMAS
Publisher: International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC


    Author Tags

    1. cooperative multiagent systems
    2. decentralized pomdps
    3. nd-pomdps
    4. planning under uncertainty


Conference

AAMAS '14

Acceptance Rates

AAMAS '14: 169 of 709 submissions accepted (24%)
Overall: 1,155 of 5,036 submissions accepted (23%)

Cited By
• (2023) Abstracting imperfect information away from two-player zero-sum games. Proceedings of the 40th International Conference on Machine Learning, pp. 32169-32193. DOI: 10.5555/3618408.3619741. Online publication date: 23-Jul-2023.
• (2015) Exploiting separability in multiagent planning with continuous-state MDPs. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 4254-4260. DOI: 10.5555/2832747.2832850. Online publication date: 25-Jul-2015.
• (2015) Factored upper bounds for multiagent planning problems under uncertainty with non-factored value functions. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 1645-1651. DOI: 10.5555/2832415.2832478. Online publication date: 25-Jul-2015.
• (2015) Structural results for cooperative decentralized control models. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 46-52. DOI: 10.5555/2832249.2832256. Online publication date: 25-Jul-2015.
• (2015) Effective Approximations for Multi-Robot Coordination in Spatially Distributed Tasks. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, pp. 881-890. DOI: 10.5555/2772879.2773265. Online publication date: 4-May-2015.
