DiCE: The Infinitely Differentiable Monte-Carlo Estimator

Foerster, Jakob; Farquhar, Gregory; Al-Shedivat, Maruan; Rocktäschel, Tim; Xing, Eric P.; Whiteson, Shimon

Computer Science > Machine Learning

arXiv:1802.05098 (cs)

[Submitted on 14 Feb 2018 (v1), last revised 19 Sep 2018 (this version, v3)]

Abstract: The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCGs), e.g., in reinforcement learning and meta-learning. While deriving first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order derivatives is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order of derivative involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms in estimators of higher-order derivatives. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct estimators of derivatives of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation for performing the requisite graph manipulations. We verify the correctness of DiCE both through a proof and numerical evaluation of the DiCE derivative estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at this https URL.
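
For readers unfamiliar with the score function estimator named in the abstract, the following is a minimal sketch of its first-order form on a single Bernoulli variable. It is not DiCE and not the authors' code: the Bernoulli distribution, the function names, `toy_cost`, the sample count, and the seed are all assumptions chosen for illustration. It relies only on the standard identity d/dθ E[c(x)] = E[c(x) · d/dθ log p(x; θ)].

```python
import random


def score_function_gradient(theta, cost, num_samples=200_000, seed=0):
    """Monte-Carlo estimate of d/dtheta E_{x ~ Bernoulli(theta)}[cost(x)].

    Averages cost(x) * d/dtheta log p(x; theta) over sampled x.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = 1 if rng.random() < theta else 0
        # Score of a Bernoulli: 1/theta for x = 1, -1/(1 - theta) for x = 0.
        score = 1.0 / theta if x == 1 else -1.0 / (1.0 - theta)
        total += cost(x) * score
    return total / num_samples


def toy_cost(x):
    # Hypothetical cost, used only to make the example concrete.
    return 3.0 if x == 1 else 1.0


theta = 0.3
estimate = score_function_gradient(theta, toy_cost)
# E[cost] = theta * cost(1) + (1 - theta) * cost(0), so the exact
# gradient with respect to theta is cost(1) - cost(0).
exact = toy_cost(1) - toy_cost(0)
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

The estimate should land close to the exact gradient, with Monte-Carlo noise that shrinks as `num_samples` grows. The sketch covers only the first derivative; the abstract's point is that repeating this construction for higher orders is where the surrogate-loss approach breaks down.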

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1802.05098 [cs.LG]
	(or arXiv:1802.05098v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1802.05098

Submission history

From: Jakob Foerster
[v1] Wed, 14 Feb 2018 14:05:54 UTC (689 KB)
[v2] Thu, 14 Jun 2018 10:59:41 UTC (6,334 KB)
[v3] Wed, 19 Sep 2018 19:11:15 UTC (3,621 KB)
