Actor critic with differentially private critic

J Lebensold, W Hamilton, B Balle, D Precup

arXiv preprint arXiv:1910.05876, 2019

Reinforcement learning algorithms are known to be sample inefficient, and often performance on one task can be substantially improved by leveraging information (e.g., via pre-training) on other related tasks. In this work, we propose a technique to achieve such knowledge transfer in cases where agent trajectories contain sensitive or private information, such as in the healthcare domain. Our approach leverages a differentially private policy evaluation algorithm to initialize an actor-critic model and improve the effectiveness of learning in downstream tasks. We empirically show this technique increases sample efficiency in resource-constrained control problems while preserving the privacy of trajectories collected in an upstream task.
