Path Integral Policy Improvement with Covariance Matrix Adaptation

Stulp, Freek; Sigaud, Olivier

arXiv:1206.4621 (cs)

[Submitted on 18 Jun 2012]

Authors: Freek Stulp (Ecole Nationale Superieure de Techniques Avancees), Olivier Sigaud (Universite Pierre et Marie Curie)

Abstract: There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI2 is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI2 as a member of the wider family of methods which share the concept of probability-weighted averaging to iteratively update parameters to optimize a cost function. We compare PI2 to other members of the same family (Cross-Entropy Methods and CMAES) at the conceptual level and in terms of performance. The comparison suggests the derivation of a novel algorithm which we call PI2-CMA for "Path Integral Policy Improvement with Covariance Matrix Adaptation". PI2-CMA's main advantage is that it determines the magnitude of the exploration noise automatically.
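
The idea of probability-weighted averaging mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' PI2-CMA: it assumes a diagonal Gaussian over the parameters, a single cost per sample rather than per-time-step trajectory costs, and a temperature-like constant `h`. The function names, the toy cost, and all default values are hypothetical choices made for illustration.

```python
import math
import random


def weighted_average_update(mean, var, cost_fn, n_samples=20, h=10.0, rng=random):
    """One probability-weighted averaging step (illustrative sketch only).

    Assumes a diagonal Gaussian; `h` is an assumed temperature-like constant.
    """
    dim = len(mean)
    # Sample candidate parameter vectors around the current mean.
    samples = [
        [rng.gauss(mean[d], math.sqrt(var[d])) for d in range(dim)]
        for _ in range(n_samples)
    ]
    costs = [cost_fn(s) for s in samples]

    # Map costs to probabilities: lower cost gets a higher weight.
    lo, hi = min(costs), max(costs)
    spread = (hi - lo) or 1.0  # guard against all costs being equal
    raw = [math.exp(-h * (c - lo) / spread) for c in costs]
    total = sum(raw)
    weights = [r / total for r in raw]

    # New mean: probability-weighted average of the samples.
    new_mean = [
        sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)
    ]
    # New (diagonal) variance: weighted average of squared deviations from
    # the previous mean, so the exploration magnitude adapts over time.
    new_var = [
        sum(w * (s[d] - mean[d]) ** 2 for w, s in zip(weights, samples))
        for d in range(dim)
    ]
    return new_mean, new_var, weights


def quadratic_cost(theta):
    """Hypothetical toy cost with its minimum at (3, 3, ...)."""
    return sum((x - 3.0) ** 2 for x in theta)


rng = random.Random(0)  # fixed seed for reproducibility
mean, var = [0.0, 0.0], [1.0, 1.0]
initial_cost = quadratic_cost(mean)
for _ in range(50):
    mean, var, weights = weighted_average_update(
        mean, var, quadratic_cost, rng=rng
    )
final_cost = quadratic_cost(mean)
```

In this sketch the variance is re-estimated from the same weights that update the mean, which is one simple way to let the exploration noise adjust itself rather than fixing it by hand. The per-time-step weighting and full covariance matrices are deliberately left out.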

Comments:	ICML2012
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1206.4621 [cs.LG]
	(or arXiv:1206.4621v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1206.4621

Submission history

From: Freek Stulp [via ICML2012 proxy]
[v1] Mon, 18 Jun 2012 15:05:32 UTC (883 KB)
