Towards Efficient Continual Learning in Deep Neural Networks
Date
2022
Authors
Mehta, Nikhil
Abstract
Deep learning models, trained primarily on a single task under the assumption of independent and identically distributed (i.i.d.) data, have made enormous progress in recent years. However, when naively trained on multiple tasks in sequence, without revisiting previous tasks, neural networks are known to suffer catastrophic forgetting: the ability to perform old tasks is often lost while learning new ones. In contrast, biological life is capable of learning many tasks throughout a lifetime from decidedly non-i.i.d. experiences, acquiring new skills and reusing old ones to learn fresh abilities, all while retaining important prior knowledge. As we strive to make artificial systems increasingly intelligent, natural life's ability to learn continually is an important capability to emulate.
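The sketch below (not drawn from the dissertation) illustrates the catastrophic forgetting described above, assuming PyTorch, a toy two-layer MLP, and synthetic Gaussian classification data: after the model is trained on Task A and then naively trained on Task B with no replay of Task A, its accuracy on Task A typically falls toward chance.

```python
# Minimal illustrative sketch of catastrophic forgetting (assumed setup, not the
# dissertation's method): train on Task A, then on Task B without revisiting A.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    # Synthetic binary classification: two Gaussian blobs, shifted per task.
    x0 = torch.randn(500, 2) + torch.tensor([shift, 0.0])
    x1 = torch.randn(500, 2) + torch.tensor([shift + 3.0, 0.0])
    x = torch.cat([x0, x1])
    y = torch.cat([torch.zeros(500), torch.ones(500)]).long()
    return x, y

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

def train(model, x, y, epochs=200):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))

xa, ya = make_task(shift=0.0)   # Task A
xb, yb = make_task(shift=-6.0)  # Task B: same labels, different input region

train(model, xa, ya)
print("Task A accuracy after learning A:", accuracy(model, xa, ya))

train(model, xb, yb)            # naive sequential training, no replay
print("Task A accuracy after learning B:", accuracy(model, xa, ya))
print("Task B accuracy after learning B:", accuracy(model, xb, yb))
```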
In a continual learning setting, we desire a deep neural network to learn multiple tasks presented sequentially. Because the tasks may differ from one another, the model may require a different set of parameters for each task. While a separate model can be trained for each task, such a system does not reuse knowledge, and the number of model parameters grows linearly as the number of tasks increases; consequently, learning multiple models is inefficient in both computation and data. In this dissertation, I will discuss my contributions and propose methodologies that can be used to enable continual learning in deep neural networks.
Namely, I will present a probabilistic framework for expanding a neural network to support learning on a sequence of tasks. I will also show how expansion-based methods can be implemented for large-scale discriminative and generative tasks. Then, I will discuss the challenging setting of continual zero-shot learning, where the goal is to generalize a model to a sequence of unseen tasks. Finally, I will present an approach for novel-class detection that can potentially be used to trigger a continual learning routine to train the model on a new task. Through these works, I show that continual learning of deep learning models can be done efficiently, resulting in knowledge transfer across tasks while mitigating catastrophic forgetting of previously acquired knowledge.
Citation
Mehta, Nikhil (2022). Towards Efficient Continual Learning in Deep Neural Networks. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/26806.