Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs

Authors:

Jilles S. Dibangoye,

Abdel-Illah Mouaddib,

Brahim Chaib-draa

AAMAS '09: Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1

Pages 569 - 576

Published: 10 May 2009

Abstract

The recent scaling of decentralized partially observable Markov decision process (DEC-POMDP) solvers toward realistic applications is mainly due to approximate methods. Within this family, Memory-Bounded Dynamic Programming (MBDP), which combines top-down heuristics with bottom-up value function updates, can solve DEC-POMDPs with large horizons. The performance of MBDP can, however, be drastically improved by avoiding the systematic generation and evaluation of all the policies that result from the exhaustive backup. To that end, we propose a heuristic search method, Point-Based Incremental Pruning (PBIP), which distinguishes policies by their heuristic estimates. Using these estimates, PBIP searches only among the most promising policies, identifies the useful ones, and prunes dominated ones. This markedly reduces the amount of computation required by the exhaustive backup. Our computational experiments show that PBIP solves DEC-POMDP benchmarks up to 800 times faster than the current best approximate algorithms, while providing solutions with higher values.
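The idea of searching only among promising policies instead of evaluating every policy can be illustrated with a generic bound-ordered selection. The sketch below is not the PBIP algorithm from the paper, whose details are not given in the abstract. It only shows how an optimistic heuristic estimate allows candidates to be pruned without an exact evaluation. The names `select_top_k`, `upper_bound`, `exact_value`, and `k` are hypothetical, and the sketch assumes that `upper_bound` never underestimates `exact_value`.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Tuple


def select_top_k(
    candidates: Iterable[Hashable],
    upper_bound: Callable[[Hashable], float],
    exact_value: Callable[[Hashable], float],
    k: int,
) -> Tuple[List[Hashable], int]:
    """Return the k candidates with the highest exact value, best first,
    together with the number of exact evaluations that were needed.

    Assumption: upper_bound(c) >= exact_value(c) for every candidate c.
    """
    if k <= 0:
        return [], 0
    # Visit candidates from most to least promising according to the heuristic.
    ordered = sorted(candidates, key=upper_bound, reverse=True)
    # Min-heap of (value, tie_breaker, candidate); best[0] is the worst kept entry.
    best: List[Tuple[float, int, Hashable]] = []
    evaluations = 0
    for index, cand in enumerate(ordered):
        # If even the optimistic bound cannot beat the worst kept value,
        # this candidate and all remaining ones (lower bounds) are pruned.
        if len(best) == k and upper_bound(cand) <= best[0][0]:
            break
        value = exact_value(cand)
        evaluations += 1
        entry = (value, -index, cand)  # -index keeps tuple comparison well defined
        if len(best) < k:
            heapq.heappush(best, entry)
        elif value > best[0][0]:
            heapq.heapreplace(best, entry)
    kept = [cand for _, _, cand in sorted(best, reverse=True)]
    return kept, evaluations


if __name__ == "__main__":
    # Toy data: five hypothetical policies with exact values and optimistic bounds.
    exact = {"a": 10.0, "b": 7.0, "c": 6.5, "d": 3.0, "e": 1.0}
    upper = {"a": 12.0, "b": 8.0, "c": 9.0, "d": 4.0, "e": 6.0}
    print(select_top_k(exact, upper.__getitem__, exact.__getitem__, 2))
```

In this toy run only three of the five candidates are evaluated exactly; the other two are discarded on their bounds alone, which is the kind of saving the abstract attributes to avoiding the exhaustive backup.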


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence

Information

Published In

AAMAS '09: Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1

May 2009

701 pages

ISBN:9780981738161

General Chairs:
Carles Sierra, Artificial Intelligence Research Institute of the Spanish Research Council (Spain)
Cristiano Castelfranchi, ISTC-CNR (Italy)

Program Chairs:
Keith S. Decker, University of Delaware
Jaime Simão Sichman, Politecnic School, University of São Paulo (Brazil)

Sponsors

Drexel University
Wiley-Blackwell
Microsoft Research
Whitestein Technologies
European Office of Aerospace Research and Development, Air Force Office of Scientific Research, United States Air Force Research Laboratory
The Foundation for Intelligent Physical Agents

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Qualifiers

Research-article

Acceptance Rates

AAMAS '09 Paper Acceptance Rate: 132 of 651 submissions, 20%

Overall Acceptance Rate: 1,155 of 5,036 submissions, 23%

Article Metrics

Total Citations: 11
Total Downloads: 248
Downloads (Last 12 months): 25
Downloads (Last 6 weeks): 6

Reflects downloads up to 30 Jan 2025

Citations

Cited By

Koops W, Junges S, Jansen N, Larson K (2024). Approximate dec-POMDP solving using multi-agent A*. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 6743-6751. doi:10.24963/ijcai.2024/745. Online publication date: 3-Aug-2024
https://dl.acm.org/doi/10.24963/ijcai.2024/745
Kraemer L, Banerjee B (2016). Multi-agent reinforcement learning as a rehearsal for decentralized planning. Neurocomputing, 190:C, pp. 82-94. doi:10.1016/j.neucom.2016.01.031. Online publication date: 19-May-2016
https://dl.acm.org/doi/10.1016/j.neucom.2016.01.031
Dibangoye J, Amato C, Buffet O, Charpillet F (2015). Exploiting separability in multiagent planning with continuous-state MDPs. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 4254-4260. doi:10.5555/2832747.2832850. Online publication date: 25-Jul-2015
https://dl.acm.org/doi/10.5555/2832747.2832850
Dibangoye J, Amato C, Buffet O, Charpillet F, Bazzan A, Huhns M, Lomuscio A, Scerri P (2014). Exploiting separability in multiagent planning with continuous-state MDPs. Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, pp. 1281-1288. doi:10.5555/2615731.2617452. Online publication date: 5-May-2014
https://dl.acm.org/doi/10.5555/2615731.2617452
Dibangoye J, Amato C, Doniec A, Charpillet F, Gini M, Shehory O, Ito T, Jonker C (2013). Producing efficient error-bounded solutions for transition independent decentralized mdps. Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, pp. 539-546. doi:10.5555/2484920.2485006. Online publication date: 6-May-2013
https://dl.acm.org/doi/10.5555/2484920.2485006
Dibangoye J, Amato C, Doniec A (2012). Scaling up decentralized MDPs through heuristic search. Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, pp. 217-226. doi:10.5555/3020652.3020678. Online publication date: 14-Aug-2012
https://dl.acm.org/doi/10.5555/3020652.3020678
Dibangoye J, Mouaddib F, Chaib-draa B, Sonenberg L, Stone P, Tumer K, Yolum P (2011). Toward error-bounded algorithms for infinite-horizon DEC-POMDPs. The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3, pp. 947-954. doi:10.5555/2034396.2034404. Online publication date: 2-May-2011
https://dl.acm.org/doi/10.5555/2034396.2034404
Varakantham P (2011). Social Model Shaping for Solving Generic DEC-POMDPs. Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02, pp. 180-187. doi:10.1109/WI-IAT.2011.145. Online publication date: 22-Aug-2011
https://dl.acm.org/doi/10.1109/WI-IAT.2011.145
Kumar A, Zilberstein S, Luck M, Sen S (2010). Point-based backup for decentralized POMDPs. Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1, pp. 1315-1322. doi:10.5555/1838206.1838378. Online publication date: 10-May-2010
https://dl.acm.org/doi/10.5555/1838206.1838378
Wu F, Zilberstein S, Chen X, Luck M, Sen S (2010). Point-based policy generation for decentralized POMDPs. Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1, pp. 1307-1314. doi:10.5555/1838206.1838377. Online publication date: 10-May-2010
https://dl.acm.org/doi/10.5555/1838206.1838377
