Towards Efficient Continual Learning in Deep Neural Networks
Date
2022
Authors
Mehta, Nikhil
Abstract
Deep learning models, trained primarily on a single task under the assumption of independent and identically distributed (i.i.d.) data, have made enormous progress in recent years. However, when naively trained on multiple tasks in sequence, without revisiting previous tasks, neural networks are known to suffer catastrophic forgetting: the ability to perform old tasks is often lost while learning new ones. In contrast, biological life is capable of learning many tasks throughout a lifetime from decidedly non-i.i.d. experiences, acquiring new skills and reusing old ones to learn fresh abilities, all while retaining important prior knowledge. As we strive to make artificial systems increasingly intelligent, natural life's ability to learn continually is an important capability to emulate.
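The sketch below (not drawn from the dissertation) illustrates the catastrophic forgetting described above, assuming PyTorch, a toy two-layer MLP, and synthetic Gaussian classification data: after the model is trained on Task A and then naively trained on Task B with no replay of Task A, its accuracy on Task A typically falls toward chance.

```python
# Minimal illustrative sketch of catastrophic forgetting (assumed setup, not the
# dissertation's method): train on Task A, then on Task B without revisiting A.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    # Synthetic binary classification: two Gaussian blobs, shifted per task.
    x0 = torch.randn(500, 2) + torch.tensor([shift, 0.0])
    x1 = torch.randn(500, 2) + torch.tensor([shift + 3.0, 0.0])
    x = torch.cat([x0, x1])
    y = torch.cat([torch.zeros(500), torch.ones(500)]).long()
    return x, y

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

def train(model, x, y, epochs=200):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))

xa, ya = make_task(shift=0.0)   # Task A
xb, yb = make_task(shift=-6.0)  # Task B: same labels, different input region

train(model, xa, ya)
print("Task A accuracy after learning A:", accuracy(model, xa, ya))

train(model, xb, yb)            # naive sequential training, no replay
print("Task A accuracy after learning B:", accuracy(model, xa, ya))
print("Task B accuracy after learning B:", accuracy(model, xb, yb))
```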
In a continual learning setting, we desire a deep neural network to learn multiple tasks presented sequentially. Because the tasks may differ from one another, the model may require a different set of parameters for each task. While a separate model can be trained for each task, such a system does not reuse knowledge, and the number of model parameters grows linearly as the number of tasks increases; consequently, learning multiple models is inefficient in both computation and data. In this dissertation, I will discuss my contributions and propose methodologies that can be used to enable continual learning in deep neural networks.
Namely, I will present a probabilistic framework for expanding a neural network to support learning on a sequence of tasks. I will also show how expansion-based methods can be implemented for large-scale discriminative and generative tasks. Then, I will discuss the challenging setting of continual zero-shot learning, where the goal is to generalize a model to a sequence of unseen tasks. Finally, I will present an approach for novel-class detection that can potentially be used to trigger a continual learning routine to train the model on a new task. Through these works, I show that continual learning of deep learning models can be done efficiently, resulting in knowledge transfer across tasks while mitigating catastrophic forgetting of previously acquired knowledge.
Citation
Mehta, Nikhil (2022). Towards Efficient Continual Learning in Deep Neural Networks. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/26806.