Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

Raghu, Aniruddh; Raghu, Maithra; Bengio, Samy; Vinyals, Oriol

arXiv:1909.09157 (cs)

[Submitted on 19 Sep 2019 (v1), last revised 12 Feb 2020 (this version, v2)]

Abstract: An important research direction in machine learning has centered around developing meta-learning algorithms to tackle few-shot learning. An especially successful algorithm has been Model-Agnostic Meta-Learning (MAML), a method that consists of two optimization loops, with the outer loop finding a meta-initialization from which the inner loop can efficiently learn new tasks. Despite MAML's popularity, a fundamental open question remains: is the effectiveness of MAML due to the meta-initialization being primed for rapid learning (large, efficient changes in the representations), or due to feature reuse, with the meta-initialization already containing high-quality features? We investigate this question via ablation studies and analysis of the latent representations, finding that feature reuse is the dominant factor. This leads to the ANIL (Almost No Inner Loop) algorithm, a simplification of MAML where we remove the inner loop for all but the (task-specific) head of a MAML-trained network. ANIL matches MAML's performance on benchmark few-shot image classification and RL and offers computational improvements over MAML. We further study the precise contributions of the head and body of the network, showing that performance on the test tasks is entirely determined by the quality of the learned features, and we can remove even the head of the network (the NIL algorithm). We conclude with a discussion of the rapid learning vs. feature reuse question for meta-learning algorithms more broadly.
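The difference between the MAML and ANIL inner loops described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy two-layer linear model (a "body" matrix and a "head" vector), a squared-error loss, and plain gradient descent, and it covers only task adaptation (the inner loop), not the outer-loop meta-training. The names `predict`, `mse`, `inner_loop`, and `adapt_body` are hypothetical and chosen for illustration.

```python
def predict(body, head, x):
    # Body: linear feature extractor. Head: linear readout on those features.
    feats = [sum(w * xj for w, xj in zip(row, x)) for row in body]
    return sum(h * f for h, f in zip(head, feats)), feats


def mse(body, head, data):
    # Mean squared error over a list of (input, target) pairs.
    return sum((predict(body, head, x)[0] - y) ** 2 for x, y in data) / len(data)


def inner_loop(body, head, support, lr=0.05, steps=5, adapt_body=True):
    """Adapt to one task by gradient descent on its support set.

    adapt_body=True  -> MAML-style: body and head are both updated.
    adapt_body=False -> ANIL-style: only the head is updated; the body
                        (the learned features) is reused unchanged.
    """
    body = [row[:] for row in body]  # copy so the meta-initialization is untouched
    head = head[:]
    n = len(support)
    for _ in range(steps):
        g_head = [0.0] * len(head)
        g_body = [[0.0] * len(row) for row in body]
        for x, y in support:
            pred, feats = predict(body, head, x)
            err = 2.0 * (pred - y) / n  # d(loss)/d(pred) for this example
            for i, f in enumerate(feats):
                g_head[i] += err * f
                for j, xj in enumerate(x):
                    g_body[i][j] += err * head[i] * xj
        head = [h - lr * g for h, g in zip(head, g_head)]
        if adapt_body:
            body = [[w - lr * g for w, g in zip(row, grow)]
                    for row, grow in zip(body, g_body)]
    return body, head


# Toy meta-initialization and a two-example support set (assumed values).
body = [[1.0, 0.0], [0.0, 1.0]]
head = [0.0, 0.0]
support = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]

anil_body, anil_head = inner_loop(body, head, support, adapt_body=False)
maml_body, maml_head = inner_loop(body, head, support, adapt_body=True)
```

In the ANIL-style call the returned body equals the meta-initialization, so any reduction in support-set loss comes from the head alone. That is the sense in which the abstract speaks of removing the inner loop for all but the task-specific head.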

Comments:	ICLR 2020
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1909.09157 [cs.LG]
	(or arXiv:1909.09157v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1909.09157

Submission history

From: Aniruddh Raghu
[v1] Thu, 19 Sep 2019 16:30:42 UTC (566 KB)
[v2] Wed, 12 Feb 2020 15:29:39 UTC (580 KB)
