10.5555/2986459.2986462
Article

Nonlinear inverse reinforcement learning with Gaussian processes

Published: 12 December 2011

Abstract

We present a probabilistic algorithm for nonlinear inverse reinforcement learning. The goal of inverse reinforcement learning is to learn the reward function in a Markov decision process from expert demonstrations. While most prior inverse reinforcement learning algorithms represent the reward as a linear combination of a set of features, we use Gaussian processes to learn the reward as a nonlinear function, while also determining the relevance of each feature to the expert's policy. Our probabilistic algorithm allows complex behaviors to be captured from suboptimal stochastic demonstrations, while automatically balancing the simplicity of the learned reward structure against its consistency with the observed actions.
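As a rough illustration of how a Gaussian process prior over rewards can encode per-feature relevance, the sketch below uses a squared-exponential kernel with one weight per feature (often called automatic relevance determination). The abstract does not specify the kernel, so this choice is an assumption, and the names `ard_se_kernel`, `kernel_matrix`, `signal_var`, and `weights` are hypothetical. A weight of zero makes the prior treat states that differ only in that feature as having identical reward.

```python
import math

def ard_se_kernel(x, y, signal_var, weights):
    """Squared-exponential kernel with one relevance weight per feature.

    Assumed form (not taken from the abstract):
        k(x, y) = signal_var * exp(-0.5 * sum_i weights[i] * (x[i] - y[i])**2)
    """
    sq_dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))
    return signal_var * math.exp(-0.5 * sq_dist)

def kernel_matrix(points, signal_var, weights):
    """Prior covariance of the reward values at the given feature vectors."""
    return [[ard_se_kernel(p, q, signal_var, weights) for q in points]
            for p in points]

# Two states whose feature vectors differ only in the second feature.
state_a = (0.0, 0.0)
state_b = (0.0, 3.0)

# Both features relevant: the rewards at the two states are nearly uncorrelated.
k_relevant = ard_se_kernel(state_a, state_b, 1.0, (1.0, 1.0))

# Second feature switched off: the rewards are perfectly correlated a priori.
k_irrelevant = ard_se_kernel(state_a, state_b, 1.0, (1.0, 0.0))

K = kernel_matrix([state_a, state_b], 1.0, (1.0, 1.0))
```

Learning the weights jointly with the reward values is one way a model of this kind could trade off the simplicity of the reward structure against fit to the demonstrations: small weights yield smoother, simpler rewards.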

Published In

NIPS'11: Proceedings of the 24th International Conference on Neural Information Processing Systems
December 2011
2752 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Qualifiers

  • Article
